Language Model API Overview

Last updated: 2026-07-15 14:40:51

Overview
The TokenHub platform aggregates language models from multiple providers, including DeepSeek, Zhipu GLM, Kimi, and MiniMax, covering scenarios such as conversational interaction, content creation, code generation, and reasoning analysis. All models uniformly support two protocols: the OpenAI Chat Completions API and the Anthropic Messages API. You can directly use the OpenAI SDK, Anthropic SDK, or any compatible client to connect.
Supported Protocols by Model
| Model Name | Model (API Parameter) | OpenAI Chat Completions | OpenAI Responses | Anthropic |
| --- | --- | --- | --- | --- |
| DeepSeek-V4-Flash (Vendor Direct) | deepseek-v4-flash-202605 | ✅ | Compatible* | ✅ |
| DeepSeek-V4-Pro (Vendor Direct) | deepseek-v4-pro-202606 | ✅ | Compatible* | ✅ |
| DeepSeek-V4-Flash | deepseek-v4-flash | ✅ | Compatible* | ✅ |
| DeepSeek-V4-Pro | deepseek-v4-pro | ✅ | Compatible* | ✅ |
| Deepseek-v3.2 | deepseek-v3.2 | ✅ | ❌ | ✅ |
| GLM-5.2 | glm-5.2 | ✅ | Compatible* | ✅ |
| GLM-5.1 | glm-5.1 | ✅ | ❌ | ✅ |
| GLM-5V-Turbo | glm-5v-turbo | ✅ | ❌ | ✅ |
| GLM-5-Turbo | glm-5-turbo | ✅ | ❌ | ✅ |
| GLM-5 | glm-5 | ✅ | ❌ | ✅ |
| Kimi K2.7 Code | kimi-k2.7-code | ✅ | ❌ | ✅ |
| Kimi-K2.6 | kimi-k2.6 | ✅ | ❌ | ✅ |
| Kimi-K2.5 | kimi-k2.5 | ✅ | ❌ | ✅ |
| MiniMax-M3 | minimax-m3 | ✅ | ✅ | ✅ |
| MiniMax-M2.7 | minimax-m2.7 | ✅ | ✅ | ✅ |
| MiniMax-M2.5 | minimax-m2.5 | ✅ | ✅ | ✅ |
| Hy-MT2-Plus | hy-mt2-plus | ✅ | ❌ | ✅ |
Note:
Models marked "Compatible*" are supported through protocol translation performed by the gateway, which converts Responses protocol requests to the Chat protocol. For details, see Responses API Compatibility Mode.
Using the OpenAI API
Base URL
Guangzhou: https://tokenhub.tencentcloudmaas.com/v1
Singapore: https://tokenhub-intl.tencentcloudmaas.com/v1
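As a small illustration of how the base URLs above are used, the following sketch assembles a Chat Completions request with Python's standard library without sending it. The `/chat/completions` path and the `Authorization: Bearer` header match the cURL samples later in this document; the `BASE_URLS` keys and the `build_chat_request` helper are hypothetical names chosen for this example.

```python
import json
import urllib.request

# Region keys are illustrative; the URLs are the two base URLs listed above.
BASE_URLS = {
    "guangzhou": "https://tokenhub.tencentcloudmaas.com/v1",
    "singapore": "https://tokenhub-intl.tencentcloudmaas.com/v1",
}


def build_chat_request(region, api_key, payload):
    """Build (but do not send) a POST request to the Chat Completions endpoint."""
    url = BASE_URLS[region] + "/chat/completions"
    return urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


request = build_chat_request(
    "singapore",
    "YOUR_API_KEY",
    {"model": "deepseek-v3.2", "messages": [{"role": "user", "content": "Hello"}]},
)
print(request.get_method(), request.full_url)
```

Passing the returned object to `urllib.request.urlopen` would send the request; the sketch stops short of that so it can be run without credentials.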
Request Parameters
The following table lists the common request parameters supported by the TokenHub gateway. For complete field definitions and the latest updates, see the OpenAI API official documentation.
| Parameter Name | Required | Type | Description |
| --- | --- | --- | --- |
| model | Yes | String | Service ID. For platform-provided services, the service ID is the same as the model name (for example, deepseek-v3.2); for the complete list, see the Model (API Parameter) column in Supported Protocols by Model. For user-created custom services, the service ID follows the format ep-xxxxxxxx and can be viewed on the online inference service page. |
| messages | Yes | Array | An array of chat context messages. For details, see Messages Parameter. |
| stream | No | Boolean | Whether to enable streaming output. Valid values: true / false. The default value is false. |
| stream_options | No | Object | Streaming output options. A common setting is {"include_usage": true}, which makes the last chunk carry the usage statistics field. Effective only when stream=true. |
| temperature | No | Float | Sampling temperature, which controls output randomness. Valid values: [0.0, 2.0]. The default value is 1.0. A higher value results in more random output. Some models have specific value constraints; see the dedicated documentation for the corresponding model. |
| top_p | No | Float | Probability threshold for nucleus sampling. Valid values: [0.0, 1.0]. The default value is 1.0. Recommended as an alternative to adjusting temperature. |
| max_tokens | No | Integer | Maximum number of output tokens per response. For thinking models, reasoning tokens and response tokens share this quota, so a larger value is recommended. |
| n | No | Integer | Number of candidate responses generated for a single request. The default value is 1. Note: when n > 1, billing is based on the total number of tokens across all candidates. |
| stop | No | String or Array of Strings | Stop sequences for model output. When the generated text matches any of the specified sequences, the model stops, and the response does not include the stop sequence. Accepts a single string or an array of up to 4 strings. For example, to have the model generate a list of 10 items without continuing to an 11th, set this field to ["11."]. |
| seed | No | Integer | Random seed, used for reproducibility. When the same seed is used across requests and all other parameters are unchanged, the model is more likely to return identical or very similar results. |
| frequency_penalty | No | Float | Frequency penalty. Valid values: [-2.0, 2.0]. The default value is 0. A positive value lowers the probability of tokens that have already appeared frequently, which helps reduce repetitive content. |
| presence_penalty | No | Float | Presence penalty. Valid values: [-2.0, 2.0]. The default value is 0. A positive value encourages the model to discuss new topics. It depends only on whether a token has appeared, not on how often. |
| logit_bias | No | Map | Modifies the probability of specific tokens appearing in the result. The key is the token ID, and the value is a bias in the range -100 to 100. A value of -100 disables the token, and a value of 100 forces its use. |
| logprobs | No | Boolean | Whether to return the log probabilities of output tokens. The default value is false. |
| top_logprobs | No | Integer | Returns the N most probable tokens at each position. Valid values: 0 to 20. Requires logprobs=true. |
| response_format | No | Object | Response output format. Common values: {"type": "text"} for text output (default); {"type": "json_object"} for JSON mode, which forces valid JSON output; {"type": "json_schema", "json_schema": {...}} for structured output constrained by the specified schema. |
| tools | No | Array | A list of Function Calling tool definitions. Each tool contains type: "function" and a function object (which includes name / description / parameters). |
| tool_choice | No | String or Object | Tool invocation policy: "none" prohibits tool calls; "auto" lets the model decide whether to call a tool (default); "required" forces a call to some tool; {"type": "function", "function": {"name": "xxx"}} forces a call to the specified tool. |
| parallel_tool_calls | No | Boolean | Whether multiple tools may be invoked in parallel within a single response. The default value is true. Setting it to false forces tools to be invoked serially, which makes debugging easier. |
| thinking | No | Object | Controls the thinking mode. The default value varies by model. Valid values: {"type": "enabled"} / {"type": "disabled"}. For details, see Deep Thinking. |
| reasoning_effort | No | String | Controls the reasoning depth. It takes effect only on thinking models, and the default value varies by model. Valid values: low / medium / high. For details, see Deep Thinking. |
| user | No | String | A stable identifier for the end user, which helps with auditing and troubleshooting. |
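The value ranges in the table above can be checked on the client before a request is sent. The following is a minimal sketch, not part of any SDK: `build_payload` is a hypothetical helper, and treating `stream_options` without `stream=true` as an error is a choice made for this example (the table only says the option has no effect in that case).

```python
def build_payload(model, messages, **options):
    """Assemble a Chat Completions request body and check the documented ranges."""
    payload = {"model": model, "messages": messages, **options}

    temperature = payload.get("temperature")
    if temperature is not None and not 0.0 <= temperature <= 2.0:
        raise ValueError("temperature must be within [0.0, 2.0]")

    top_p = payload.get("top_p")
    if top_p is not None and not 0.0 <= top_p <= 1.0:
        raise ValueError("top_p must be within [0.0, 1.0]")

    for name in ("frequency_penalty", "presence_penalty"):
        value = payload.get(name)
        if value is not None and not -2.0 <= value <= 2.0:
            raise ValueError(f"{name} must be within [-2.0, 2.0]")

    stop = payload.get("stop")
    if isinstance(stop, list) and len(stop) > 4:
        raise ValueError("stop accepts at most 4 sequences")

    top_logprobs = payload.get("top_logprobs")
    if top_logprobs is not None:
        if not 0 <= top_logprobs <= 20:
            raise ValueError("top_logprobs must be within [0, 20]")
        if not payload.get("logprobs"):
            raise ValueError("top_logprobs requires logprobs=true")

    # Example-only strictness: the option is documented as ineffective, not invalid.
    if "stream_options" in payload and not payload.get("stream"):
        raise ValueError("stream_options is effective only when stream=true")

    return payload


payload = build_payload(
    "deepseek-v3.2",
    [{"role": "user", "content": "List 10 fruits, one per line."}],
    temperature=0.7,
    stop=["11."],
    stream=True,
    stream_options={"include_usage": True},
)
print(sorted(payload))
```

Model-specific constraints (for example, a narrower temperature range) are not covered here; those come from the dedicated documentation for each model.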
Messages Parameter
Each object in the message array contains the following fields:
| Field | Type | Description |
| --- | --- | --- |
| role | String | Role: system (system prompt), user (user), assistant (assistant), tool (tool response) |
| content | String | Text content of the message. |
Message sequence rule: [system (optional) → user → assistant → user → ...]. The sequence must end with a message in the user role.
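The ordering rule above can be checked before sending a request. The sketch below covers only the plain system/user/assistant alternation stated in the rule; it does not model tool messages, and `validate_messages` is a hypothetical helper name.

```python
def validate_messages(messages):
    """Check the system(optional) -> user -> assistant -> user ... ordering."""
    if not messages:
        raise ValueError("messages must not be empty")

    # An optional system message may come first; the rest must alternate.
    turns = messages[1:] if messages[0]["role"] == "system" else messages
    if not turns:
        raise ValueError("at least one user message is required")

    for position, message in enumerate(turns):
        expected = "user" if position % 2 == 0 else "assistant"
        if message["role"] != expected:
            raise ValueError(
                f"turn {position} should have role {expected!r}, got {message['role']!r}"
            )

    if turns[-1]["role"] != "user":
        raise ValueError("the final message must have the user role")
    return True


history = [
    {"role": "system", "content": "You are a helpful AI assistant."},
    {"role": "user", "content": "Please introduce quantum computing."},
    {"role": "assistant", "content": "Quantum computing is ..."},
    {"role": "user", "content": "How does it differ from traditional computing?"},
]
print(validate_messages(history))
```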
Response Parameters
| Parameter Name | Type | Description |
| --- | --- | --- |
| id | String | The unique identifier of the request. |
| object | String | The object type, fixed as chat.completion. |
| created | Integer | Creation time (Unix timestamp). |
| model | String | The name of the model actually used. |
| choices | Array | The list of candidate results returned by the model for a single request. For details, see Choices Array Elements. |
| usage | Object | Token consumption statistics. |
Choices Array Elements
| Field | Type | Description |
| --- | --- | --- |
| index | Integer | Option index. |
| message | Object | Response message containing role and content. |
| finish_reason | String | Reason for termination: stop (normal termination), length (maximum length reached), tool_calls (tool invocation required). |
Usage Object
| Field | Type | Description |
| --- | --- | --- |
| prompt_tokens | Integer | Number of input tokens. |
| completion_tokens | Integer | Number of output tokens. |
| total_tokens | Integer | Total number of tokens (used for billing). |
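To show how the response fields described above fit together, the following sketch extracts the first candidate and the usage counters from a response that has already been parsed into a Python dict. `summarize_response` is a hypothetical helper, and the check that `total_tokens` equals `prompt_tokens + completion_tokens` is an assumption based on the sample responses in this document rather than a documented guarantee.

```python
def summarize_response(response):
    """Pull the commonly used fields out of a non-streaming response dict."""
    choice = response["choices"][0]
    usage = response["usage"]
    return {
        "content": choice["message"]["content"],
        "finish_reason": choice["finish_reason"],
        # "length" means the output reached the maximum length and may be cut off.
        "truncated": choice["finish_reason"] == "length",
        "total_tokens": usage["total_tokens"],
        # Assumption: total = prompt + completion, as in the sample responses.
        "usage_consistent": usage["prompt_tokens"] + usage["completion_tokens"]
        == usage["total_tokens"],
    }


sample = {
    "id": "5d42fea3-413e-42ce-99b2-0d1595dae996",
    "object": "chat.completion",
    "model": "deepseek-v3.2",
    "choices": [
        {
            "index": 0,
            "message": {"role": "assistant", "content": "The weather is really nice today."},
            "finish_reason": "stop",
        }
    ],
    "usage": {"prompt_tokens": 38, "completion_tokens": 7, "total_tokens": 45},
}
summary = summarize_response(sample)
print(summary["content"], summary["total_tokens"])
```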
Sample Code
Note:
This document is the common invocation guide for all language models. Individual models may differ slightly in how thinking mode is toggled, how reasoning fields are returned, which multimodal formats are accepted, and which special parameter values are allowed. Also refer to the dedicated documentation for the corresponding model:
DeepSeek Model: DeepSeek API Guide
GLM Model: GLM API Guide
Kimi Model: Kimi API Guide
MiniMax Model: MiniMax API Guide
Example: Basic Conversation
Example: Streaming Output
Example: System Prompt
Example: Multi-turn Conversation
Example: Function Calling (Tool Calling)
Example: Basic Conversation
Note:
Replace YOUR_API_KEY with the API Key you created, and replace model with the service ID you want to try.
cURL
curl -X POST 'https://tokenhub-intl.tencentcloudmaas.com/v1/chat/completions' \
  -H 'Authorization: Bearer YOUR_API_KEY' \
  -H 'Content-Type: application/json' \
  -d '{
    "model": "deepseek-v3.2",
    "messages": [
      {"role": "user", "content": "Hello, please introduce yourself"}
    ]
  }'
Python
from openai import OpenAI

client = OpenAI(
    api_key="YOUR_API_KEY",
    base_url="https://tokenhub-intl.tencentcloudmaas.com/v1",
)

response = client.chat.completions.create(
    model="deepseek-v3.2",
    messages=[
        {"role": "user", "content": "Hello, please introduce yourself"},
    ],
)
print(response.choices[0].message.content)
Node.js
import OpenAI from 'openai';

const client = new OpenAI({
  apiKey: 'YOUR_API_KEY',
  baseURL: 'https://tokenhub-intl.tencentcloudmaas.com/v1',
});

const response = await client.chat.completions.create({
  model: 'deepseek-v3.2',
  messages: [
    { role: 'user', content: 'Hello, please introduce yourself' },
  ],
});
console.log(response.choices[0].message.content);
Java
// Calls the OpenAI-compatible HTTP API directly with OkHttp.
import okhttp3.*;
import com.google.gson.Gson;
import java.util.*;

public class BasicChat {
    public static void main(String[] args) throws Exception {
        Map<String, Object> body = new HashMap<>();
        body.put("model", "deepseek-v3.2");
        body.put("messages", Arrays.asList(
            Map.of("role", "user", "content", "Hello, please introduce yourself")
        ));

        RequestBody requestBody = RequestBody.create(
            new Gson().toJson(body),
            MediaType.parse("application/json")
        );

        Request request = new Request.Builder()
            .url("https://tokenhub-intl.tencentcloudmaas.com/v1/chat/completions")
            .header("Authorization", "Bearer YOUR_API_KEY")
            .post(requestBody)
            .build();

        try (Response response = new OkHttpClient().newCall(request).execute()) {
            System.out.println(response.body().string());
        }
    }
}
Go
package main

import (
    "bytes"
    "encoding/json"
    "fmt"
    "io"
    "net/http"
)

func main() {
    body := map[string]interface{}{
        "model": "deepseek-v3.2",
        "messages": []map[string]string{
            {"role": "user", "content": "Hello, please introduce yourself"},
        },
    }
    payload, _ := json.Marshal(body)

    req, _ := http.NewRequest("POST",
        "https://tokenhub-intl.tencentcloudmaas.com/v1/chat/completions",
        bytes.NewBuffer(payload))
    req.Header.Set("Authorization", "Bearer YOUR_API_KEY")
    req.Header.Set("Content-Type", "application/json")

    resp, err := http.DefaultClient.Do(req)
    if err != nil {
        panic(err)
    }
    defer resp.Body.Close()

    data, _ := io.ReadAll(resp.Body)
    fmt.Println(string(data))
}
Response:
{
    "id": "5e9c7ae9-e0e4-4ec1-bbd0-22bcfda61e45",
    "object": "chat.completion",
    "model": "deepseek-v3.2",
    "choices": [
        {
            "index": 0,
            "message": {
                "role": "assistant",
                "content": "Hello! Nice to meet you! 😊\n\nI am DeepSeek, an AI assistant created by DeepSeek Company. Let me briefly introduce myself:\n\n**My Features:**\n- 📚 My knowledge is up to date as of July 2024, and I am the latest version of the DeepSeek model.\n- 💬 I am a pure text conversation model, focused on understanding and generating textual content.\n- 📁 I support file uploads—I can process images, txt, pdf, ppt, word, excel, and other files, and read text information from them.\n- 🌐 I support web search (you need to manually enable it in the Web/App).\n- 💾 I have a 128K context length, allowing me to remember our longer conversations.\n\n**What I can do for you:**\n- Answer various questions and engage in in-depth discussions.\n- Assist with writing, translation, and analysis.\n- Process uploaded document content.\n- Provide suggestions for learning, work, and life.\n\n**Important Notes:**\n- I am completely free to use, with no paid plans.\n- I currently do not support voice features.\n- You can download the App from official app stores.\n\nMy response style is warm and detailed, and I hope to provide you with a pleasant communication experience! If you have anything to talk about or need help with, just let me know! ✨"
            },
            "finish_reason": "stop"
        }
    ],
    "usage": {
        "prompt_tokens": 10,
        "completion_tokens": 244,
        "total_tokens": 254,
        "prompt_tokens_details": {
            "cached_tokens": 0
        },
        "completion_tokens_details": {
            "reasoning_tokens": 0
        }
    }
}
Example: Streaming Output
Note:
Replace YOUR_API_KEY with the API Key you created, and replace model with the service ID you want to try.
cURL
curl -X POST 'https://tokenhub-intl.tencentcloudmaas.com/v1/chat/completions' \
  -H 'Authorization: Bearer YOUR_API_KEY' \
  -H 'Content-Type: application/json' \
  -d '{
    "model": "deepseek-v3.2",
    "messages": [
      {"role": "system", "content": "You are a helpful AI assistant."},
      {"role": "user", "content": "Calculate 1+1"}
    ],
    "stream": true
  }'
Python
from openai import OpenAI

client = OpenAI(
    api_key="YOUR_API_KEY",
    base_url="https://tokenhub-intl.tencentcloudmaas.com/v1",
)

stream = client.chat.completions.create(
    model="deepseek-v3.2",
    messages=[
        {"role": "system", "content": "You are a helpful AI assistant."},
        {"role": "user", "content": "Calculate 1+1"},
    ],
    stream=True,
)
# Print each text fragment as it arrives; some chunks carry no content.
for chunk in stream:
    if chunk.choices and chunk.choices[0].delta.content:
        print(chunk.choices[0].delta.content, end="", flush=True)
Node.js
import OpenAI from 'openai';

const client = new OpenAI({
  apiKey: 'YOUR_API_KEY',
  baseURL: 'https://tokenhub-intl.tencentcloudmaas.com/v1',
});

const stream = await client.chat.completions.create({
  model: 'deepseek-v3.2',
  messages: [
    { role: 'system', content: 'You are a helpful AI assistant.' },
    { role: 'user', content: 'Calculate 1+1' },
  ],
  stream: true,
});

for await (const chunk of stream) {
  process.stdout.write(chunk.choices[0]?.delta?.content || '');
}
Java
// Streaming call over SSE using OkHttp's EventSource API (requires the okhttp-sse artifact).
import okhttp3.*;
import okhttp3.sse.*;
import com.google.gson.Gson;
import java.util.*;

public class Streaming {
    public static void main(String[] args) {
        Map<String, Object> body = new HashMap<>();
        body.put("model", "deepseek-v3.2");
        body.put("messages", Arrays.asList(
            Map.of("role", "system", "content", "You are a helpful AI assistant."),
            Map.of("role", "user", "content", "Calculate 1+1")
        ));
        body.put("stream", true);

        Request request = new Request.Builder()
            .url("https://tokenhub-intl.tencentcloudmaas.com/v1/chat/completions")
            .header("Authorization", "Bearer YOUR_API_KEY")
            .post(RequestBody.create(new Gson().toJson(body), MediaType.parse("application/json")))
            .build();

        EventSources.createFactory(new OkHttpClient()).newEventSource(request,
            new EventSourceListener() {
                // Each event carries one JSON chunk; "[DONE]" marks the end of the stream.
                @Override public void onEvent(EventSource es, String id, String type, String data) {
                    if (!"[DONE]".equals(data)) System.out.println(data);
                }
                // Report connection or HTTP errors instead of failing silently.
                @Override public void onFailure(EventSource es, Throwable t, Response response) {
                    System.err.println("Stream failed: " + t);
                }
            });
    }
}
Go
package main

import (
    "bufio"
    "bytes"
    "encoding/json"
    "fmt"
    "net/http"
    "strings"
)

func main() {
    body, _ := json.Marshal(map[string]interface{}{
        "model": "deepseek-v3.2",
        "messages": []map[string]string{
            {"role": "system", "content": "You are a helpful AI assistant."},
            {"role": "user", "content": "Calculate 1+1"},
        },
        "stream": true,
    })

    req, _ := http.NewRequest("POST",
        "https://tokenhub-intl.tencentcloudmaas.com/v1/chat/completions",
        bytes.NewBuffer(body))
    req.Header.Set("Authorization", "Bearer YOUR_API_KEY")
    req.Header.Set("Content-Type", "application/json")

    // Check the error before touching resp; resp is nil when the request fails.
    resp, err := http.DefaultClient.Do(req)
    if err != nil {
        panic(err)
    }
    defer resp.Body.Close()

    // Read the SSE stream line by line and print each JSON chunk.
    scanner := bufio.NewScanner(resp.Body)
    for scanner.Scan() {
        line := scanner.Text()
        if strings.HasPrefix(line, "data: ") && line != "data: [DONE]" {
            fmt.Println(strings.TrimPrefix(line, "data: "))
        }
    }
}
Streaming responses use the Server-Sent Events (SSE) format:
data: {"id":"chatcmpl-abc123","choices":[{"index":0,"delta":{"role":"assistant","content":"1"},"finish_reason":null}]}

data: {"id":"chatcmpl-abc123","choices":[{"index":0,"delta":{"content":"+"},"finish_reason":null}]}

data: {"id":"chatcmpl-abc123","choices":[{"index":0,"delta":{"content":"1"},"finish_reason":null}]}

data: {"id":"chatcmpl-abc123","choices":[{"index":0,"delta":{"content":"="},"finish_reason":null}]}

data: {"id":"chatcmpl-abc123","choices":[{"index":0,"delta":{"content":"2"},"finish_reason":null}]}

data: {"id":"chatcmpl-abc123","choices":[{"index":0,"delta":{},"finish_reason":"stop"}]}

data: [DONE]
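For clients that read the raw stream instead of using an SDK, the following sketch shows one way to reassemble the text from SSE lines like the ones above. `collect_stream_text` is a hypothetical helper; it assumes each `data:` line holds one complete JSON chunk and that `[DONE]` ends the stream, as in the sample.

```python
import json


def collect_stream_text(lines):
    """Concatenate delta.content from SSE data lines until [DONE]."""
    parts = []
    for line in lines:
        line = line.strip()
        if not line.startswith("data:"):
            continue  # skip blank separator lines and any non-data fields
        data = line[len("data:"):].strip()
        if data == "[DONE]":
            break
        chunk = json.loads(data)
        # A chunk may have no choices (for example, a usage-only chunk).
        for choice in chunk.get("choices") or []:
            content = choice.get("delta", {}).get("content")
            if content:
                parts.append(content)
    return "".join(parts)


sample_lines = [
    'data: {"id":"chatcmpl-abc123","choices":[{"index":0,"delta":{"role":"assistant","content":"1"},"finish_reason":null}]}',
    "",
    'data: {"id":"chatcmpl-abc123","choices":[{"index":0,"delta":{"content":"+"},"finish_reason":null}]}',
    "",
    'data: {"id":"chatcmpl-abc123","choices":[{"index":0,"delta":{"content":"1"},"finish_reason":null}]}',
    "",
    'data: {"id":"chatcmpl-abc123","choices":[{"index":0,"delta":{"content":"="},"finish_reason":null}]}',
    "",
    'data: {"id":"chatcmpl-abc123","choices":[{"index":0,"delta":{"content":"2"},"finish_reason":null}]}',
    "",
    'data: {"id":"chatcmpl-abc123","choices":[{"index":0,"delta":{},"finish_reason":"stop"}]}',
    "",
    "data: [DONE]",
]
print(collect_stream_text(sample_lines))  # prints 1+1=2
```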
Example: System Prompt
Note:
Replace YOUR_API_KEY with the API Key you created, and replace model with the service ID you want to try.
cURL
curl -X POST 'https://tokenhub-intl.tencentcloudmaas.com/v1/chat/completions' \
  -H 'Authorization: Bearer YOUR_API_KEY' \
  -H 'Content-Type: application/json' \
  -d '{
    "model": "deepseek-v3.2",
    "messages": [
      {"role": "system", "content": "You are a professional English translation assistant. Translate user-input Chinese into English, and translate English into Chinese. Return only the translation result, without any explanation."},
      {"role": "user", "content": "The weather is really nice today."}
    ]
  }'
Python
from openai import OpenAI

client = OpenAI(
    api_key="YOUR_API_KEY",
    base_url="https://tokenhub-intl.tencentcloudmaas.com/v1",
)

response = client.chat.completions.create(
    model="deepseek-v3.2",
    messages=[
        {"role": "system", "content": "You are a professional English translation assistant. Translate user-input Chinese into English, and translate English into Chinese. Return only the translation result, without any explanation."},
        {"role": "user", "content": "The weather is really nice today."},
    ],
)
print(response.choices[0].message.content)
Node.js
import OpenAI from 'openai';

const client = new OpenAI({
  apiKey: 'YOUR_API_KEY',
  baseURL: 'https://tokenhub-intl.tencentcloudmaas.com/v1',
});

const response = await client.chat.completions.create({
  model: 'deepseek-v3.2',
  messages: [
    { role: 'system', content: 'You are a professional English translation assistant. Translate user-input Chinese into English, and translate English into Chinese. Return only the translation result, without any explanation.' },
    { role: 'user', content: 'The weather is really nice today.' },
  ],
});
console.log(response.choices[0].message.content);
Java
import okhttp3.*;
import com.google.gson.Gson;
import java.util.*;

public class SystemPrompt {
    public static void main(String[] args) throws Exception {
        Map<String, Object> body = new HashMap<>();
        body.put("model", "deepseek-v3.2");
        body.put("messages", Arrays.asList(
            Map.of("role", "system", "content", "You are a professional English translation assistant. Translate user-input Chinese into English, and translate English into Chinese. Return only the translation result, without any explanation."),
            Map.of("role", "user", "content", "The weather is really nice today.")
        ));

        Request request = new Request.Builder()
            .url("https://tokenhub-intl.tencentcloudmaas.com/v1/chat/completions")
            .header("Authorization", "Bearer YOUR_API_KEY")
            .post(RequestBody.create(new Gson().toJson(body), MediaType.parse("application/json")))
            .build();

        try (Response response = new OkHttpClient().newCall(request).execute()) {
            System.out.println(response.body().string());
        }
    }
}
Go
package main

import (
    "bytes"
    "encoding/json"
    "fmt"
    "io"
    "net/http"
)

func main() {
    body, _ := json.Marshal(map[string]interface{}{
        "model": "deepseek-v3.2",
        "messages": []map[string]string{
            {"role": "system", "content": "You are a professional English translation assistant. Translate user-input Chinese into English, and translate English into Chinese. Return only the translation result, without any explanation."},
            {"role": "user", "content": "The weather is really nice today."},
        },
    })

    req, _ := http.NewRequest("POST",
        "https://tokenhub-intl.tencentcloudmaas.com/v1/chat/completions",
        bytes.NewBuffer(body))
    req.Header.Set("Authorization", "Bearer YOUR_API_KEY")
    req.Header.Set("Content-Type", "application/json")

    // Check the error before touching resp; resp is nil when the request fails.
    resp, err := http.DefaultClient.Do(req)
    if err != nil {
        panic(err)
    }
    defer resp.Body.Close()
    data, _ := io.ReadAll(resp.Body)
    fmt.Println(string(data))
}
Response:
{
    "id": "5d42fea3-413e-42ce-99b2-0d1595dae996",
    "object": "chat.completion",
    "model": "deepseek-v3.2",
    "choices": [
        {
            "index": 0,
            "message": {
                "role": "assistant",
                "content": "The weather is really nice today."
            },
            "finish_reason": "stop"
        }
    ],
    "usage": {
        "prompt_tokens": 38,
        "completion_tokens": 7,
        "total_tokens": 45,
        "prompt_tokens_details": {
            "cached_tokens": 0
        },
        "completion_tokens_details": {
            "reasoning_tokens": 0
        }
    }
}
Example: Multi-turn Conversation
Note:
Replace YOUR_API_KEY with the API Key you created, and replace model with the service ID you want to try.
cURL
curl -X POST 'https://tokenhub-intl.tencentcloudmaas.com/v1/chat/completions' \
  -H 'Authorization: Bearer YOUR_API_KEY' \
  -H 'Content-Type: application/json' \
  -d '{
    "model": "deepseek-v3.2",
    "messages": [
      {"role": "user", "content": "Please introduce quantum computing."},
      {"role": "assistant", "content": "Quantum computing is a computational approach that leverages the principles of quantum mechanics for information processing..."},
      {"role": "user", "content": "What are the differences between it and traditional computing?"}
    ]
  }'
Python
from openai import OpenAI

client = OpenAI(
    api_key="YOUR_API_KEY",
    base_url="https://tokenhub-intl.tencentcloudmaas.com/v1",
)

# Pass the earlier user and assistant turns back in so the model has the full context.
response = client.chat.completions.create(
    model="deepseek-v3.2",
    messages=[
        {"role": "user", "content": "Please introduce quantum computing."},
        {"role": "assistant", "content": "Quantum computing is a computational approach that leverages the principles of quantum mechanics for information processing..."},
        {"role": "user", "content": "What are the differences between it and traditional computing?"},
    ],
)
print(response.choices[0].message.content)
Node.js
import OpenAI from 'openai';

const client = new OpenAI({
  apiKey: 'YOUR_API_KEY',
  baseURL: 'https://tokenhub-intl.tencentcloudmaas.com/v1',
});

// Pass the earlier user and assistant turns back in so the model has the full context.
const response = await client.chat.completions.create({
  model: 'deepseek-v3.2',
  messages: [
    { role: 'user', content: 'Please introduce quantum computing.' },
    { role: 'assistant', content: 'Quantum computing is a computational approach that leverages the principles of quantum mechanics for information processing...' },
    { role: 'user', content: 'What are the differences between it and traditional computing?' },
  ],
});
console.log(response.choices[0].message.content);
Java
import okhttp3.*;
import com.google.gson.Gson;
import java.util.*;

public class MultiTurn {
    public static void main(String[] args) throws Exception {
        Map<String, Object> body = new HashMap<>();
        body.put("model", "deepseek-v3.2");
        body.put("messages", Arrays.asList(
            Map.of("role", "user", "content", "Please introduce quantum computing."),
            Map.of("role", "assistant", "content", "Quantum computing is a computational approach that leverages the principles of quantum mechanics for information processing..."),
            Map.of("role", "user", "content", "What are the differences between it and traditional computing?")
        ));

        Request request = new Request.Builder()
            .url("https://tokenhub-intl.tencentcloudmaas.com/v1/chat/completions")
            .header("Authorization", "Bearer YOUR_API_KEY")
            .post(RequestBody.create(new Gson().toJson(body), MediaType.parse("application/json")))
            .build();

        try (Response response = new OkHttpClient().newCall(request).execute()) {
            System.out.println(response.body().string());
        }
    }
}
Go
package main

import (
    "bytes"
    "encoding/json"
    "fmt"
    "io"
    "net/http"
)

func main() {
    // Pass the earlier user and assistant turns back in so the model has the full context.
    body, _ := json.Marshal(map[string]interface{}{
        "model": "deepseek-v3.2",
        "messages": []map[string]string{
            {"role": "user", "content": "Please introduce quantum computing."},
            {"role": "assistant", "content": "Quantum computing is a computational approach that leverages the principles of quantum mechanics for information processing..."},
            {"role": "user", "content": "What are the differences between it and traditional computing?"},
        },
    })

    req, _ := http.NewRequest("POST",
        "https://tokenhub-intl.tencentcloudmaas.com/v1/chat/completions",
        bytes.NewBuffer(body))
    req.Header.Set("Authorization", "Bearer YOUR_API_KEY")
    req.Header.Set("Content-Type", "application/json")

    // Check the error before touching resp; resp is nil when the request fails.
    resp, err := http.DefaultClient.Do(req)
    if err != nil {
        panic(err)
    }
    defer resp.Body.Close()
    data, _ := io.ReadAll(resp.Body)
    fmt.Println(string(data))
}
Response:
{
    "id": "fda59c08-6a85-4514-bdbf-d77a8d68e018",
    "object": "chat.completion",
    "model": "deepseek-v3.2",
    "choices": [
        {
            "index": 0,
            "message": {
                "role": "assistant",
                "content": "Good, this is a very core question. The fundamental difference between quantum computing and traditional computing lies in their basic units of information processing and their working principles.\n\nWe can start with a classic analogy:\n\n*   **A traditional computer** is like a **librarian** (CPU) running down a long corridor (bus) in a vast **library**. The librarian can only open one room (memory address) at a time, check one book (one bit of data), and then make a decision.\n*   **A quantum computer**, on the other hand, is like having **all librarians** (qubits) **enter all rooms** simultaneously and read **every possible combination of all books** in an instant, then tell you the final result.\n\nBelow, we provide a detailed comparison from several key dimensions:\n\n### 1. Basic Information Unit: Bit vs. Qubit\n\n| Feature | Traditional Computing (Bit) | Quantum Computing (Qubit) |\n| :--- | :--- | :--- |\n| **State** | **Binary**: Can only be **0** or **1**. Like a light switch, it's either on or off. Very definite. | **Superposition**: Can be **both** 0 and 1 simultaneously, or any probabilistic combination of 0 and 1. Like a \"quantum light\" that is both on and off at the same time. |\n| **Representation** | A definite, discrete value. | A state vector, represented in Dirac notation as: \\|ψ⟩ = α\\|0⟩ + β\\|1⟩, where α and β are complex numbers, and \\|α\\|² + \\|β\\|² = 1. |\n| **Core Difference** | **Deterministic**: Each bit has a definite value at any given moment. | **Probabilistic**: When a qubit is measured, it collapses to 0 with probability \\|α\\|² and to 1 with probability \\|β\\|². |\n\n### 2. Working Principle: Logic Gates vs. Quantum Properties\n\n| Feature | Traditional Computing | Quantum Computing |\n| :--- | :--- | :--- |\n| **Operation Method** | Uses **logic gates** (e.g., AND, OR, NOT) to operate on bits. An operation changes the state of one or a group of bits. | Uses **quantum logic gates** to operate on qubits. These operations are **reversible** and can leverage superposition for **parallel computation**. |\n| **Core Advantage** | **Serial Processing**: Tasks are broken down into a series of steps executed sequentially. Highly efficient for simple, logically clear tasks. | **Quantum Parallelism**: Because qubits are in superposition, a single quantum operation can **act on all possible inputs simultaneously**. This is the source of quantum speedup. |\n| **Unique Phenomenon** | None | **Quantum Entanglement**: Two or more qubits can form a mysterious correlation. Regardless of distance, measuring one qubit instantly determines the state of the other(s). This allows a quantum computer to tightly link the states of different qubits for highly collaborative computation. |\n\n### 3. Performance and Applicable Domains\n\n| Feature | Traditional Computing | Quantum Computing |\n| :--- | :--- | :--- |\n| **Strong Suit** | - **General-purpose computing**: Office software, web browsing, games<br>- **Logic control**: Operating systems, application logic<br>- **Most data processing**: DMC, spreadsheets | - **Exponential speedup in specific domains**:<br>  - **Cryptography**: Breaking encryption algorithms like RSA (Shor's algorithm)<br>  - **Material simulation**: Precisely simulating the quantum properties of molecules and materials<br>  - **Optimization problems**: Logistics route planning, financial portfolio optimization<br>  - **Artificial intelligence**: Accelerating machine learning training |\n| **Computational Complexity** | For certain complex problems (e.g., large number factorization), traditional algorithms require **exponentially** increasing time. | For specific problems, quantum algorithms can reduce complexity to the **polynomial** level, achieving \"quantum supremacy.\" |\n| **Output Result** | Precise, deterministic results. | Typically **probabilistic** results. Because measurement is required, we obtain a potentially correct answer, so algorithms often need to run multiple times to increase confidence. |\n\n### 4. Physical Implementation and Challenges\n\n| Feature | Traditional Computer | Quantum Computer |\n| :--- | :--- | :--- |\n| **Hardware Foundation** | Based on **transistors** (semiconductors), mature technology, allowing for large-scale integration (e.g., CPUs with billions of transistors). | Requires physical systems that can maintain quantum states, such as superconducting circuits, ion traps, photonic qubits. Technology is still in its early stages. |\n| **Main Challenge** | Power consumption, heat dissipation, transistor size approaching physical limits (Moore's Law slowing). | **Quantum Decoherence**: Quantum states are extremely fragile and easily lose their quantum properties due to environmental interference (e.g., heat, vibration). Requires extremely low temperatures (near absolute zero) and highly isolated environments. |\n| **Error Correction** | Very low error rates, relatively simple error correction (e.g., parity check). | High error rates, requiring complex **quantum error-correcting codes** that use multiple physical qubits to encode one logical qubit, incurring significant overhead. |\n\n### Summary Table\n\n| Comparison Dimension | Traditional Computing | Quantum Computing |\n| :--- | :--- | :--- |\n| **Basic Unit** | Bit (0 or 1) | Qubit (Superposition: Superposition of 0 and 1) |\n| **Operation Method** | Logic gates (serial) | Quantum gates (parallel) |\n| **Core Principle** | Boolean logic | Superposition, entanglement, interference |\n| **Result Output** | Deterministic | Probabilistic |\n| **Strong Suit** | General tasks, logic control | Specific complex problems (e.g., simulation, optimization, cryptanalysis) |\n| **Technology Maturity** | Very mature, widely used | Early stage, primarily used for research and specific computations |\n| **Relationship with Users** | **Complementary Relationship**: Quantum computers are **not** intended to replace your phone or laptop. They function more like a **specialized accelerator** for solving specific, intractable problems that traditional computers cannot solve in the foreseeable future. In the future, we might access quantum computers via the cloud to handle the most complex parts, while traditional computers handle daily tasks and user interaction. |\n\nIn simple terms, a traditional computer is a \"precise sharpshooter,\" while a quantum computer is a \"prophet capable of exploring all possibilities simultaneously.\" Each has its strengths, and they will work together for a long time to come."
            },
            "finish_reason": "stop"
        }
    ],
    "usage": {
        "prompt_tokens": 32,
        "completion_tokens": 1321,
        "total_tokens": 1353,
        "prompt_tokens_details": {
            "cached_tokens": 0
        },
        "completion_tokens_details": {
            "reasoning_tokens": 0
        }
    }
}
Example: Function Calling (Tool Calling)
Note:
Replace YOUR_API_KEY with the API key you created, and set model to the service ID you want to use. For tool calls in thinking mode, include the reasoning_content from previous turns in every request to get the best results. For details, see Interleaved Thinking.
cURL
Python
Node.js
Java
Go
curl -X POST 'https://tokenhub-intl.tencentcloudmaas.com/v1/chat/completions' \
  -H 'Authorization: Bearer YOUR_API_KEY' \
  -H 'Content-Type: application/json' \
  -d '{
    "model": "deepseek-v3.2",
    "messages": [
      {"role": "user", "content": "What is the weather like in Beijing today?"}
    ],
    "tools": [
      {
        "type": "function",
        "function": {
          "name": "get_weather",
          "description": "Obtain weather information for a specified city",
          "parameters": {
            "type": "object",
            "properties": {
              "city": {"type": "string", "description": "City name, such as: Beijing"}
            },
            "required": ["city"]
          }
        }
      }
    ],
    "tool_choice": "auto"
  }'
from openai import OpenAI

# Point the OpenAI SDK at the TokenHub gateway.
client = OpenAI(
    api_key="YOUR_API_KEY",
    base_url="https://tokenhub-intl.tencentcloudmaas.com/v1",
)

# Tool definition: the JSON Schema under "parameters" describes the arguments.
tools = [{
    "type": "function",
    "function": {
        "name": "get_weather",
        "description": "Obtain weather information for a specified city",
        "parameters": {
            "type": "object",
            "properties": {"city": {"type": "string", "description": "City name, such as: Beijing"}},
            "required": ["city"],
        },
    },
}]

response = client.chat.completions.create(
    model="deepseek-v3.2",
    messages=[{"role": "user", "content": "What is the weather like in Beijing today?"}],
    tools=tools,
    tool_choice="auto",
)
# If the model chose to call a tool, message.tool_calls holds the call details.
print(response.choices[0].message)
import OpenAI from 'openai';

const client = new OpenAI({
  apiKey: 'YOUR_API_KEY',
  baseURL: 'https://tokenhub-intl.tencentcloudmaas.com/v1',
});

const tools = [{
  type: 'function',
  function: {
    name: 'get_weather',
    description: 'Obtain weather information for a specified city',
    parameters: {
      type: 'object',
      properties: { city: { type: 'string', description: 'City name, such as: Beijing' } },
      required: ['city'],
    },
  },
}];

const response = await client.chat.completions.create({
  model: 'deepseek-v3.2',
  messages: [{ role: 'user', content: 'What is the weather like in Beijing today?' }],
  tools,
  tool_choice: 'auto',
});
console.log(response.choices[0].message);
import okhttp3.*;
import com.google.gson.Gson;
import java.util.*;

public class FunctionCalling {
    public static void main(String[] args) throws Exception {
        Map<String, Object> tool = Map.of(
            "type", "function",
            "function", Map.of(
                "name", "get_weather",
                "description", "Obtain weather information for a specified city",
                "parameters", Map.of(
                    "type", "object",
                    "properties", Map.of("city", Map.of("type", "string", "description", "City name, such as: Beijing")),
                    "required", List.of("city")
                )
            )
        );

        Map<String, Object> body = new HashMap<>();
        body.put("model", "deepseek-v3.2");
        body.put("messages", List.of(Map.of("role", "user", "content", "What is the weather like in Beijing today?")));
        body.put("tools", List.of(tool));
        body.put("tool_choice", "auto");

        Request request = new Request.Builder()
            .url("https://tokenhub-intl.tencentcloudmaas.com/v1/chat/completions")
            .header("Authorization", "Bearer YOUR_API_KEY")
            .post(RequestBody.create(new Gson().toJson(body), MediaType.parse("application/json")))
            .build();

        try (Response response = new OkHttpClient().newCall(request).execute()) {
            System.out.println(response.body().string());
        }
    }
}
package main

import (
    "bytes"
    "encoding/json"
    "fmt"
    "io"
    "log"
    "net/http"
)

func main() {
    // Tool definition in the OpenAI function-calling format.
    tool := map[string]interface{}{
        "type": "function",
        "function": map[string]interface{}{
            "name":        "get_weather",
            "description": "Obtain weather information for a specified city",
            "parameters": map[string]interface{}{
                "type": "object",
                "properties": map[string]interface{}{
                    "city": map[string]interface{}{"type": "string", "description": "City name, such as: Beijing"},
                },
                "required": []string{"city"},
            },
        },
    }

    body, err := json.Marshal(map[string]interface{}{
        "model":       "deepseek-v3.2",
        "messages":    []map[string]string{{"role": "user", "content": "What is the weather like in Beijing today?"}},
        "tools":       []map[string]interface{}{tool},
        "tool_choice": "auto",
    })
    if err != nil {
        log.Fatalf("encode request: %v", err)
    }

    req, err := http.NewRequest("POST",
        "https://tokenhub-intl.tencentcloudmaas.com/v1/chat/completions",
        bytes.NewBuffer(body))
    if err != nil {
        log.Fatalf("build request: %v", err)
    }
    req.Header.Set("Authorization", "Bearer YOUR_API_KEY")
    req.Header.Set("Content-Type", "application/json")

    // Check the error before touching resp: resp is nil when the request fails.
    resp, err := http.DefaultClient.Do(req)
    if err != nil {
        log.Fatalf("send request: %v", err)
    }
    defer resp.Body.Close()

    data, err := io.ReadAll(resp.Body)
    if err != nil {
        log.Fatalf("read response: %v", err)
    }
    fmt.Println(string(data))
}
When the model decides to call a tool, it returns:
{
  "choices": [
    {
      "message": {
        "role": "assistant",
        "content": null,
        "tool_calls": [
          {
            "id": "call_abc123",
            "type": "function",
            "function": {
              "name": "get_weather",
              "arguments": "{\"city\": \"Beijing\"}"
            }
          }
        ]
      },
      "finish_reason": "tool_calls"
    }
  ]
}
Return the tool execution result to the model and continue the conversation:
{
  "model": "deepseek-v3.2",
  "messages": [
    {"role": "user", "content": "What is the weather like in Beijing today?"},
    {"role": "assistant", "content": null, "tool_calls": [{"id": "call_abc123", "type": "function", "function": {"name": "get_weather", "arguments": "{\"city\": \"Beijing\"}"}}]},
    {"role": "tool", "tool_call_id": "call_abc123", "content": "{\"temperature\": 22, \"weather\": \"Sunny\", \"humidity\": 45}"}
  ]
}
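The round trip above can be sketched in Python without any network call. This is a minimal sketch under stated assumptions, not a helper provided by the gateway or the OpenAI SDK: `get_weather`, `TOOL_REGISTRY`, and `build_followup_messages` are hypothetical names, and the weather values are the canned sample data from the example. The function takes the assistant message that contains `tool_calls`, runs each named tool locally, and appends one `tool` message per call. Because the assistant message is appended unchanged, a `reasoning_content` field returned in thinking mode would be carried into the next request as well.

```python
import json


def get_weather(city: str) -> dict:
    """Hypothetical local tool: returns canned data instead of querying a weather service."""
    return {"temperature": 22, "weather": "Sunny", "humidity": 45}


# Hypothetical registry that maps tool names to local Python callables.
TOOL_REGISTRY = {"get_weather": get_weather}


def build_followup_messages(history: list, assistant_message: dict) -> list:
    """Return a new message list: history, the assistant turn, then one tool result per call."""
    messages = list(history)
    # Append the assistant turn unchanged so tool_calls (and reasoning_content,
    # if present) are sent back in the next request.
    messages.append(assistant_message)
    for call in assistant_message.get("tool_calls") or []:
        func = TOOL_REGISTRY[call["function"]["name"]]
        # "arguments" is a JSON-encoded string, not a JSON object.
        arguments = json.loads(call["function"]["arguments"])
        result = func(**arguments)
        messages.append({
            "role": "tool",
            "tool_call_id": call["id"],  # must match the id of the call being answered
            "content": json.dumps(result),
        })
    return messages


history = [{"role": "user", "content": "What is the weather like in Beijing today?"}]
assistant_message = {
    "role": "assistant",
    "content": None,
    "tool_calls": [{
        "id": "call_abc123",
        "type": "function",
        "function": {"name": "get_weather", "arguments": '{"city": "Beijing"}'},
    }],
}
followup = build_followup_messages(history, assistant_message)
print(json.dumps(followup[-1]))
```

The resulting list can be passed as `messages` in the next request, together with the same `model` and `tools` values.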
Using the Anthropic API
Base URL
Guangzhou: https://tokenhub.tencentcloudmaas.com
Singapore: https://tokenhub-intl.tencentcloudmaas.com
HTTP Headers
Field	Support Status	Description
anthropic-beta	Ignored	The gateway ignores this header.
anthropic-version	Ignored	The gateway ignores this header.
x-api-key	Fully supported	Used for authentication.
Request Parameters
The following table lists the support status of the TokenHub gateway for the Anthropic protocol. For complete field definitions and the latest updates, see the Anthropic API official documentation.
Field	Support Status	Description
model	Supported	Set to the Model (API Parameter) value from the model list.
max_tokens	Fully supported	Maximum number of output tokens
container	Ignored	The gateway ignores this field.
mcp_servers	Ignored	The gateway ignores this field.
metadata	Ignored	The gateway ignores this field.
service_tier	Ignored	The gateway ignores this field.
stop_sequences	Fully supported	Stop sequences
stream	Fully supported	Streaming response
system	Fully supported	System message
temperature	Fully supported	Temperature parameter (0.0-2.0)
thinking	Ignored	The gateway ignores this field.
top_k	Ignored	The gateway ignores this field.
top_p	Fully supported	Top-p sampling
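Because ignored fields produce no error, it can be useful to check locally which parts of a request will actually take effect. The following is a minimal sketch transcribed from the table above, not a TokenHub or Anthropic SDK API: `IGNORED_FIELDS` and `split_effective_fields` are hypothetical names, and the `top_k` and `metadata` values are illustrative only.

```python
# Top-level Anthropic request fields that the table above marks as ignored.
IGNORED_FIELDS = {"container", "mcp_servers", "metadata", "service_tier", "thinking", "top_k"}


def split_effective_fields(payload: dict) -> tuple:
    """Split a request body into (fields not marked as ignored, sorted names of ignored fields)."""
    effective = {key: value for key, value in payload.items() if key not in IGNORED_FIELDS}
    ignored = sorted(key for key in payload if key in IGNORED_FIELDS)
    return effective, ignored


payload = {
    "model": "deepseek-v3.2",
    "max_tokens": 1000,
    "temperature": 0.7,
    "top_k": 40,                      # illustrative value; ignored by the gateway
    "metadata": {"user_id": "u-1"},   # illustrative value; ignored by the gateway
    "messages": [{"role": "user", "content": "Hi, how are you?"}],
}
effective, ignored = split_effective_fields(payload)
print(ignored)  # ['metadata', 'top_k']
```

Logging the ignored names during development makes it easier to notice when a setting such as `top_k` has no effect on the output.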
Tool Support
tools
Field	Support Status	Description
name	Fully supported	Tool name
input_schema	Fully supported	Input parameter schema
description	Fully supported	Tool description
cache_control	Ignored	The gateway ignores this field.
tool_choice (string format)	Fully supported	-
tool_choice (object format)	Fully supported	-
tool_choice.disable_parallel_tool_use	Ignored	The gateway ignores this field.
tool_choice
Field	Support Status
none	Fully supported
auto	Fully supported
any	Fully supported
tool	Fully supported
disable_parallel_tool_use	Ignored
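To show how the fields in the two tables fit together, the sketch below converts the OpenAI-style `get_weather` tool from the function calling example into the Anthropic tool shape (`name`, `description`, `input_schema`) and builds a `tool_choice` object. It is an illustration only: `to_anthropic_tool` and `make_tool_choice` are hypothetical helper names, not part of either SDK.

```python
from typing import Optional


def to_anthropic_tool(openai_tool: dict) -> dict:
    """Convert an OpenAI-style function tool into the Anthropic tool shape."""
    function = openai_tool["function"]
    return {
        "name": function["name"],
        "description": function.get("description", ""),
        # The Anthropic protocol names the argument JSON Schema "input_schema".
        "input_schema": function["parameters"],
    }


def make_tool_choice(mode: str, name: Optional[str] = None) -> dict:
    """Build a tool_choice object for one of the modes listed above: none, auto, any, tool."""
    if mode not in {"none", "auto", "any", "tool"}:
        raise ValueError(f"unsupported tool_choice mode: {mode}")
    if mode == "tool":
        if not name:
            raise ValueError("mode 'tool' requires a tool name")
        return {"type": "tool", "name": name}
    return {"type": mode}


openai_tool = {
    "type": "function",
    "function": {
        "name": "get_weather",
        "description": "Obtain weather information for a specified city",
        "parameters": {
            "type": "object",
            "properties": {"city": {"type": "string", "description": "City name, such as: Beijing"}},
            "required": ["city"],
        },
    },
}

request_body = {
    "model": "deepseek-v3.2",
    "max_tokens": 1000,
    "tools": [to_anthropic_tool(openai_tool)],
    "tool_choice": make_tool_choice("tool", "get_weather"),
    "messages": [{"role": "user", "content": "What is the weather like in Beijing today?"}],
}
print(request_body["tool_choice"])
```

The sketch deliberately leaves out `disable_parallel_tool_use`, since the gateway ignores it.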
Message Field Support
Field Type	Variant	Subfield	Support Status
content	string	-	Fully supported
content	array, type="text"	text	Fully supported
content	array, type="text"	cache_control	Ignored
content	array, type="text"	citations	Ignored
content	array, type="image"	-	Supported by some models. For details, refer to the invocation guide for each model.
content	array, type="document"	-	Not supported
content	array, type="search_result"	-	Not supported
content	array, type="thinking"	-	Ignored
content	array, type="redacted_thinking"	-	Not supported
content	array, type="tool_use"	id	Fully supported
content	array, type="tool_use"	input	Fully supported
content	array, type="tool_use"	name	Fully supported
content	array, type="tool_use"	cache_control	Ignored
content	array, type="tool_result"	tool_use_id	Fully supported
content	array, type="tool_result"	content	Fully supported
content	array, type="tool_result"	cache_control	Ignored
content	array, type="tool_result"	is_error	Ignored
Note:
1. Ignored Fields: Certain Anthropic-specific fields are ignored, but no error is reported.
2. Parallel Tool Calls: The disable_parallel_tool_use parameter is ignored.
3. Cache Control: All cache_control-related fields are ignored.
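A request can be checked against the support table before it is sent. The sketch below is a hypothetical pre-flight check, not gateway behavior: `BLOCK_SUPPORT` is transcribed from the table above, the status labels and the name `find_problem_blocks` are assumptions, and the `document` block in the sample is illustrative.

```python
# Support status per content block type, transcribed from the table above.
BLOCK_SUPPORT = {
    "text": "supported",
    "image": "model-dependent",
    "document": "unsupported",
    "search_result": "unsupported",
    "thinking": "ignored",
    "redacted_thinking": "unsupported",
    "tool_use": "supported",
    "tool_result": "supported",
}


def find_problem_blocks(messages: list) -> list:
    """Return (message index, block type, status) for every block that is not fully supported."""
    problems = []
    for index, message in enumerate(messages):
        content = message.get("content")
        if isinstance(content, str):
            continue  # plain string content is fully supported
        for block in content or []:
            block_type = block.get("type")
            status = BLOCK_SUPPORT.get(block_type, "unknown")
            if status != "supported":
                problems.append((index, block_type, status))
    return problems


messages = [{
    "role": "user",
    "content": [
        {"type": "text", "text": "Summarize the attached file."},
        {"type": "document", "source": {"type": "text", "media_type": "text/plain", "data": "..."}},
    ],
}]
print(find_problem_blocks(messages))  # [(0, 'document', 'unsupported')]
```

A caller could, for example, convert an unsupported `document` block to a `text` block before sending the request.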
Sample Code
Note:
Replace YOUR_API_KEY with the API Key you created, and replace model with the service ID you want to try.
cURL
Python
Node.js
Java
Go
curl https://tokenhub-intl.tencentcloudmaas.com/v1/messages \
  -H "Content-Type: application/json" \
  -H "x-api-key: YOUR_API_KEY" \
  -d '{
    "model": "deepseek-v3.2",
    "max_tokens": 1000,
    "stream": true,
    "system": [
      {"type": "text", "text": "You are a helpful assistant."}
    ],
    "messages": [
      {"role": "user", "content": [{"type": "text", "text": "Hi, how are you?"}]}
    ]
  }'
import anthropic

client = anthropic.Anthropic(
    api_key="YOUR_API_KEY",
    base_url="https://tokenhub-intl.tencentcloudmaas.com",
)

with client.messages.stream(
    model="deepseek-v3.2",
    max_tokens=1000,
    system="You are a helpful assistant.",
    messages=[{"role": "user", "content": "Hi, how are you?"}],
) as stream:
    for text in stream.text_stream:
        print(text, end="", flush=True)
import Anthropic from '@anthropic-ai/sdk';

const client = new Anthropic({
  apiKey: 'YOUR_API_KEY',
  baseURL: 'https://tokenhub-intl.tencentcloudmaas.com',
});

const stream = await client.messages.stream({
  model: 'deepseek-v3.2',
  max_tokens: 1000,
  system: 'You are a helpful assistant.',
  messages: [{ role: 'user', content: 'Hi, how are you?' }],
});

for await (const event of stream) {
  if (event.type === 'content_block_delta' && event.delta.type === 'text_delta') {
    process.stdout.write(event.delta.text);
  }
}
import okhttp3.*;
import okhttp3.sse.*;
import com.google.gson.Gson;
import java.util.*;

public class AnthropicCall {
    public static void main(String[] args) {
        Map<String, Object> body = new HashMap<>();
        body.put("model", "deepseek-v3.2");
        body.put("max_tokens", 1000);
        body.put("stream", true);
        body.put("system", List.of(Map.of("type", "text", "text", "You are a helpful assistant.")));
        body.put("messages", List.of(Map.of(
            "role", "user",
            "content", List.of(Map.of("type", "text", "text", "Hi, how are you?"))
        )));

        Request request = new Request.Builder()
            .url("https://tokenhub-intl.tencentcloudmaas.com/v1/messages")
            .header("x-api-key", "YOUR_API_KEY")
            .header("Content-Type", "application/json")
            .post(RequestBody.create(new Gson().toJson(body), MediaType.parse("application/json")))
            .build();

        EventSources.createFactory(new OkHttpClient()).newEventSource(request,
            new EventSourceListener() {
                @Override public void onEvent(EventSource es, String id, String type, String data) {
                    System.out.println(data);
                }
            });
    }
}
package main

import (
    "bufio"
    "bytes"
    "encoding/json"
    "fmt"
    "log"
    "net/http"
    "strings"
)

func main() {
    body, err := json.Marshal(map[string]interface{}{
        "model":      "deepseek-v3.2",
        "max_tokens": 1000,
        "stream":     true,
        "system": []map[string]string{
            {"type": "text", "text": "You are a helpful assistant."},
        },
        "messages": []map[string]interface{}{
            {
                "role": "user",
                "content": []map[string]string{
                    {"type": "text", "text": "Hi, how are you?"},
                },
            },
        },
    })
    if err != nil {
        log.Fatalf("encode request: %v", err)
    }

    req, err := http.NewRequest("POST",
        "https://tokenhub-intl.tencentcloudmaas.com/v1/messages",
        bytes.NewBuffer(body))
    if err != nil {
        log.Fatalf("build request: %v", err)
    }
    req.Header.Set("x-api-key", "YOUR_API_KEY")
    req.Header.Set("Content-Type", "application/json")

    // Check the error before touching resp: resp is nil when the request fails.
    resp, err := http.DefaultClient.Do(req)
    if err != nil {
        log.Fatalf("send request: %v", err)
    }
    defer resp.Body.Close()

    // Print the JSON payload of each server-sent event as it arrives.
    scanner := bufio.NewScanner(resp.Body)
    for scanner.Scan() {
        line := scanner.Text()
        if strings.HasPrefix(line, "data: ") {
            fmt.Println(strings.TrimPrefix(line, "data: "))
        }
    }
    if err := scanner.Err(); err != nil {
        log.Fatalf("read stream: %v", err)
    }
}
Response:
event: content_block_start
data: {"content_block":{"text":"","type":"text"},"index":1,"type":"content_block_start"}

event: content_block_delta
data: {"delta":{"text":"Hey","type":"text_delta"},"index":0,"type":"content_block_delta"}

event: content_block_delta
data: {"delta":{"text":"! I'm doing well, thanks for asking! I'm","type":"text_delta"},"index":0,"type":"content_block_delta"}

event: content_block_delta
data: {"delta":{"text":" here and ready to help with whatever you need.","type":"text_delta"},"index":0,"type":"content_block_delta"}

event: content_block_delta
data: {"delta":{"text":" How are you doing today? Is there something I","type":"text_delta"},"index":0,"type":"content_block_delta"}

event: content_block_delta
data: {"delta":{"text":" can assist you with?","type":"text_delta"},"index":0,"type":"content_block_delta"}

event: content_block_stop
data: {"index":1,"type":"content_block_stop"}

event: message_delta
data: {"delta":{"stop_reason":"end_turn","stop_sequence":null},"type":"message_delta","usage":{"output_tokens":57}}

event: message_stop
data: {"type":"message_stop"}
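When calling the endpoint without an SDK, the stream above has to be reassembled by hand. The following is a minimal sketch that assumes the event shapes shown in the sample response; `collect_text` is a hypothetical helper name, and the shortened `sample` lines stand in for a live stream.

```python
import json


def collect_text(sse_lines) -> tuple:
    """Concatenate text_delta fragments from the stream and return (text, stop_reason)."""
    parts = []
    stop_reason = None
    for line in sse_lines:
        if not line.startswith("data: "):
            continue  # skip "event:" lines and blank separators
        event = json.loads(line[len("data: "):])
        if event["type"] == "content_block_delta" and event["delta"]["type"] == "text_delta":
            parts.append(event["delta"]["text"])
        elif event["type"] == "message_delta":
            stop_reason = event["delta"].get("stop_reason")
    return "".join(parts), stop_reason


# Shortened stand-in for the lines read from the HTTP response body.
sample = [
    'event: content_block_delta',
    'data: {"delta":{"text":"Hey","type":"text_delta"},"index":0,"type":"content_block_delta"}',
    '',
    'event: content_block_delta',
    'data: {"delta":{"text":" there.","type":"text_delta"},"index":0,"type":"content_block_delta"}',
    '',
    'event: message_delta',
    'data: {"delta":{"stop_reason":"end_turn","stop_sequence":null},"type":"message_delta","usage":{"output_tokens":57}}',
    '',
    'event: message_stop',
    'data: {"type":"message_stop"}',
]
text, stop_reason = collect_text(sample)
print(text)         # Hey there.
print(stop_reason)  # end_turn
```

The helper dispatches on the `type` field inside each `data:` payload instead of the `event:` line, so it still works if an `event:` line is missing.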
Integrating the Model with Claude Code
Installing Claude Code
To install or update Anthropic Claude Code, run the following command:
npm install -g @anthropic-ai/claude-code
Configuring Environment Variables
export ANTHROPIC_BASE_URL=https://tokenhub-intl.tencentcloudmaas.com
export ANTHROPIC_AUTH_TOKEN=${API_KEY}
export API_TIMEOUT_MS=600000
export ANTHROPIC_MODEL=${MODEL_NAME}
export CLAUDE_CODE_DISABLE_NONESSENTIAL_TRAFFIC=1
Note:
API_TIMEOUT_MS is configured to prevent the Claude Code client from timing out due to lengthy outputs. The timeout duration set here is 10 minutes, and users can adjust it as needed.
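For scripted setups, the same variables can be assembled in Python before launching the CLI. This is a sketch under stated assumptions: `claude_code_env` is a hypothetical helper, the model name is only a placeholder, and the launch line is left commented out because it requires the claude CLI to be installed.

```python
import os


def claude_code_env(api_key: str, model: str, timeout_minutes: int = 10) -> dict:
    """Return a copy of the current environment with the variables listed above set."""
    env = dict(os.environ)
    env.update({
        "ANTHROPIC_BASE_URL": "https://tokenhub-intl.tencentcloudmaas.com",
        "ANTHROPIC_AUTH_TOKEN": api_key,
        # API_TIMEOUT_MS is in milliseconds: 10 minutes = 10 * 60 * 1000 = 600000.
        "API_TIMEOUT_MS": str(timeout_minutes * 60 * 1000),
        "ANTHROPIC_MODEL": model,
        "CLAUDE_CODE_DISABLE_NONESSENTIAL_TRAFFIC": "1",
    })
    return env


env = claude_code_env("YOUR_API_KEY", "deepseek-v3.2")
print(env["API_TIMEOUT_MS"])  # 600000

# To launch Claude Code with this environment (requires the claude CLI on PATH):
# import subprocess
# subprocess.run(["claude"], env=env, cwd="my-project")
```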
Executing the claude Command
Go to your project directory and run the claude command to start using Claude Code.
cd my-project
claude

