Language Model API Overview

Last updated: 2026-06-12 21:43:11

Overview

The TokenHub platform aggregates language models from multiple providers, including DeepSeek, GLM, Kimi, and MiniMax, and covers scenarios such as conversational interaction, content creation, code generation, and reasoning analysis. All models support both the OpenAI Chat Completions API and the Anthropic Messages API, so you can connect directly with the OpenAI SDK, the Anthropic SDK, or any compatible client.

Supported Protocols by Model

| Model Name | Model (API Parameter) | OpenAI Chat Completions | OpenAI Responses | Anthropic |
| --- | --- | --- | --- | --- |
| DeepSeek-V4-Flash (Vendor Direct) | deepseek-v4-flash-202605 | | | |
| DeepSeek-V4-Pro (Vendor Direct) | deepseek-v4-pro-202606 | | | |
| DeepSeek-V4-Flash | deepseek-v4-flash | | | |
| DeepSeek-V4-Pro | deepseek-v4-pro | | | |
| Deepseek-v3.2 | deepseek-v3.2 | | | |
| GLM-5.1 | glm-5.1 | | | |
| GLM-5V-Turbo | glm-5v-turbo | | | |
| GLM-5-Turbo | glm-5-turbo | | | |
| GLM-5 | glm-5 | | | |
| Kimi-K2.6 | kimi-k2.6 | | | |
| Kimi-K2.5 | kimi-k2.5 | | | |
| MiniMax-M3 | minimax-m3 | | | |
| MiniMax-M2.7 | minimax-m2.7 | | | |
| MiniMax-M2.5 | minimax-m2.5 | | | |
| Hy-MT2-Plus | hy-mt2-plus | | | |

Using the OpenAI API

Base URL

Guangzhou: https://tokenhub.tencentcloudmaas.com/v1
Singapore: https://tokenhub-intl.tencentcloudmaas.com/v1
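
As a small illustration, the two regional endpoints can be kept in one lookup so that client code does not hard-code a URL. This is only a sketch: the region keys (`guangzhou`, `singapore`) and the helper names `base_url_for` and `chat_completions_url` are hypothetical; only the URLs come from this page.

```python
# Hypothetical helper: the region keys are illustrative, the URLs are the ones listed above.
BASE_URLS = {
    "guangzhou": "https://tokenhub.tencentcloudmaas.com/v1",
    "singapore": "https://tokenhub-intl.tencentcloudmaas.com/v1",
}


def base_url_for(region: str) -> str:
    """Return the OpenAI-compatible base URL for a region name (case-insensitive)."""
    try:
        return BASE_URLS[region.strip().lower()]
    except KeyError:
        raise ValueError(f"unknown region: {region!r}") from None


def chat_completions_url(region: str) -> str:
    """Return the full Chat Completions endpoint for direct HTTP calls."""
    return base_url_for(region) + "/chat/completions"
```

The value returned by `base_url_for` is what an OpenAI-compatible SDK would take as its base URL, while `chat_completions_url` gives the path used for raw HTTP requests.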

Request Parameters

The following table lists the common request parameters supported by the TokenHub gateway. For complete field definitions and the latest updates, see the official OpenAI API documentation.
| Parameter Name | Required | Type | Description |
| --- | --- | --- | --- |
| model | Yes | String | Service ID. For platform-provided services, the service ID is the same as the model name (for example, deepseek-v3.2); for the complete list, see the Model (API Parameter) column in Supported Protocols by Model. For user-created custom services, the service ID follows the format ep-xxxxxxxx and can be viewed on the online inference service page. |
| messages | Yes | Array | An array of chat context messages. For details, see Messages Parameter. |
| stream | No | Boolean | Whether to enable streaming output. Valid values: true / false. Default: false. |
| stream_options | No | Object | Streaming output options. A common setting, {"include_usage": true}, makes the last chunk carry the usage statistics field (effective only when stream=true). |
| temperature | No | Float | Sampling temperature, which controls output randomness. Valid values: [0.0, 2.0]. Default: 1.0. Higher values produce more random output. Some models have specific value constraints; see the dedicated documentation for the corresponding model. |
| top_p | No | Float | Probability threshold for nucleus sampling. Valid values: [0.0, 1.0]. Default: 1.0. Recommended as an alternative to temperature: adjust one or the other, not both. |
| max_tokens | No | Integer | Maximum number of output tokens per response. For thinking models, reasoning tokens and response tokens share this quota, so set a larger value for them. |
| n | No | Integer | Number of candidate responses generated for a single request. Default: 1. Note: when n > 1, billing is based on the total number of tokens. |
| stop | No | String or Array of String | Stop sequences for model output. When the generated text matches any of the specified sequences, the model stops, and the response does not include that stop sequence. Accepts a single string or an array of up to 4 strings. For example, to have the model generate a list of 10 items without continuing to an 11th, set this field to ["11."]. |
| seed | No | Integer | Random seed for reproducibility. When multiple requests use the same seed and otherwise identical parameters, the model is more likely to return identical or very similar results. |
| frequency_penalty | No | Float | Frequency penalty. Valid values: [-2.0, 2.0]. Default: 0. A positive value lowers the probability of re-selecting tokens that have already appeared frequently, which helps reduce repetition. |
| presence_penalty | No | Float | Presence penalty. Valid values: [-2.0, 2.0]. Default: 0. A positive value encourages the model to move to new topics; it depends only on whether a token has appeared, not on how often. |
| logit_bias | No | Map | Adjusts the probability of specific tokens appearing in the result. Each key is a token ID and each value is a bias in the range [-100, 100]. A value of -100 disables the token, and a value of 100 forces its use. |
| logprobs | No | Boolean | Whether to return the log probabilities of output tokens. Default: false. |
| top_logprobs | No | Integer | Number of most probable tokens to return at each position. Valid values: 0 to 20. Requires logprobs=true. |
| response_format | No | Object | Response output format. Common values: {"type": "text"}: plain text output (default); {"type": "json_object"}: JSON mode, which forces valid JSON output; {"type": "json_schema", "json_schema": {...}}: structured output constrained by the specified schema. |
| tools | No | Array | A list of function calling tool definitions. Each tool contains type: "function" and a function object (with name, description, and parameters). |
| tool_choice | No | String or Object | Tool invocation policy. "none": tool calls are prohibited; "auto": the model decides whether to call a tool (default); "required": the model must call at least one tool; {"type": "function", "function": {"name": "xxx"}}: the model must call the specified tool. |
| parallel_tool_calls | No | Boolean | Whether multiple tools may be invoked in parallel within a single response. Default: true. Setting it to false forces serial invocation, which makes debugging easier. |
| thinking | No | Object | Controls the thinking mode. The default varies by model. Valid values: {"type": "enabled"} / {"type": "disabled"}. For details, see Deep Thinking. |
| reasoning_effort | No | String | Controls the reasoning depth. Takes effect only on thinking models, and the default varies by model. Valid values: low / medium / high. For details, see Deep Thinking. |
| user | No | String | A stable identifier for the end user, which helps with auditing and troubleshooting. |
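
The value ranges in the request parameter list can be checked on the client before a request is sent. The sketch below is one possible way to do that. `validate_params` is a hypothetical helper, not part of any SDK, and it covers only the constraints stated on this page; the gateway or an individual model may enforce others.

```python
def validate_params(params: dict) -> list[str]:
    """Return a list of problems found in a Chat Completions request body."""
    problems = []

    def check_range(name, low, high):
        value = params.get(name)
        if value is not None and not (low <= value <= high):
            problems.append(f"{name} must be in [{low}, {high}], got {value}")

    # model and messages are the only required fields.
    for required in ("model", "messages"):
        if not params.get(required):
            problems.append(f"{required} is required")

    check_range("temperature", 0.0, 2.0)
    check_range("top_p", 0.0, 1.0)
    check_range("frequency_penalty", -2.0, 2.0)
    check_range("presence_penalty", -2.0, 2.0)
    check_range("top_logprobs", 0, 20)

    # stop accepts a single string or an array of at most 4 strings.
    stop = params.get("stop")
    if isinstance(stop, list) and len(stop) > 4:
        problems.append("stop accepts at most 4 sequences")

    # top_logprobs is only valid together with logprobs=true.
    if params.get("top_logprobs") is not None and not params.get("logprobs"):
        problems.append("top_logprobs requires logprobs=true")

    # stream_options has no effect unless streaming is enabled.
    if params.get("stream_options") and not params.get("stream"):
        problems.append("stream_options is ignored unless stream=true")

    for bias in (params.get("logit_bias") or {}).values():
        if not -100 <= bias <= 100:
            problems.append(f"logit_bias values must be in [-100, 100], got {bias}")

    return problems
```

An empty list means the request passes these checks. Returning all problems at once, rather than raising on the first, makes it easier to report every issue in a single pass.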

Messages Parameter

Each object in the messages array contains the following fields:
| Field | Type | Description |
| --- | --- | --- |
| role | String | Role of the message author: system (system prompt), user, assistant, or tool (tool response). |
| content | String | Text content of the message. |
Message sequence rule: [system (optional) → user → assistant → user → ...]. The sequence must end with a user message.
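
The sequence rule can be checked before sending a request. The following is a minimal sketch, assuming plain text conversations that use only the system, user, and assistant roles; it does not model tool messages, which appear in function calling flows. `check_message_sequence` is a hypothetical helper name.

```python
def check_message_sequence(messages: list[dict]) -> None:
    """Raise ValueError unless messages follow [system] -> user -> assistant -> ... -> user."""
    if not messages:
        raise ValueError("messages must not be empty")

    # An optional system message may only appear first.
    turns = messages[1:] if messages[0]["role"] == "system" else messages
    if not turns:
        raise ValueError("at least one user message is required")

    # After the optional system message, roles alternate starting with user.
    for position, message in enumerate(turns):
        expected = "user" if position % 2 == 0 else "assistant"
        if message["role"] != expected:
            raise ValueError(
                f"turn {position}: expected role {expected!r}, got {message['role']!r}"
            )

    # The request must end with a user message.
    if turns[-1]["role"] != "user":
        raise ValueError("the last message must have the user role")
```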

Response Parameters

| Parameter Name | Type | Description |
| --- | --- | --- |
| id | String | Unique identifier of the request. |
| object | String | Object type, fixed as chat.completion. |
| created | Integer | Creation time (Unix timestamp). |
| model | String | Name of the model actually used. |
| choices | Array | Candidate results returned by the model for the request. For details, see Choices Array Elements. |
| usage | Object | Token consumption statistics. For details, see Usage Object. |

Choices Array Elements

| Field | Type | Description |
| --- | --- | --- |
| index | Integer | Index of the candidate. |
| message | Object | Response message, containing role and content. |
| finish_reason | String | Reason for termination: stop (normal termination), length (maximum length reached), tool_calls (tool invocation required). |

Usage Object

| Field | Type | Description |
| --- | --- | --- |
| prompt_tokens | Integer | Number of input tokens. |
| completion_tokens | Integer | Number of output tokens. |
| total_tokens | Integer | Total number of tokens (used for billing). |
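
To show how the response, choices, and usage fields fit together, the sketch below pulls them out of a parsed response body (a plain dict, as returned by a JSON decoder). `ChatResult` and `summarize_response` are hypothetical names introduced for illustration only.

```python
from dataclasses import dataclass


@dataclass
class ChatResult:
    text: str
    finish_reason: str
    truncated: bool        # finish_reason == "length": output hit the token limit
    needs_tool_call: bool  # finish_reason == "tool_calls": the caller must run a tool
    prompt_tokens: int
    completion_tokens: int
    total_tokens: int


def summarize_response(response: dict, choice_index: int = 0) -> ChatResult:
    """Extract the text, finish reason, and token counts from a chat.completion body."""
    choice = next(c for c in response["choices"] if c["index"] == choice_index)
    usage = response.get("usage") or {}
    reason = choice["finish_reason"]
    return ChatResult(
        text=choice["message"].get("content") or "",
        finish_reason=reason,
        truncated=reason == "length",
        needs_tool_call=reason == "tool_calls",
        prompt_tokens=usage.get("prompt_tokens", 0),
        completion_tokens=usage.get("completion_tokens", 0),
        total_tokens=usage.get("total_tokens", 0),
    )
```

When `truncated` is true, one common remedy is to retry with a larger max_tokens value.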

Sample Code

Note:
This document is the common invocation guide for all language models. Models may differ slightly in how thinking mode is toggled, how reasoning fields are returned, which multimodal formats are accepted, and which special parameter values apply. Also refer to the dedicated documentation for the corresponding model:
DeepSeek Model: DeepSeek API Guide
GLM Model: GLM API Guide
Kimi Model: Kimi API Guide
MiniMax Model: MiniMax API Guide

Example: Basic Conversation

Note:
Replace YOUR_API_KEY with the API Key you created, and replace model with the service ID you want to try.
cURL
curl -X POST 'https://tokenhub-intl.tencentcloudmaas.com/v1/chat/completions' \
  -H 'Authorization: Bearer YOUR_API_KEY' \
  -H 'Content-Type: application/json' \
  -d '{
    "model": "deepseek-v3.2",
    "messages": [
      {"role": "user", "content": "Hello, please introduce yourself"}
    ]
  }'

Python
from openai import OpenAI

client = OpenAI(
api_key="YOUR_API_KEY",
base_url="https://tokenhub-intl.tencentcloudmaas.com/v1",
)

response = client.chat.completions.create(
model="deepseek-v3.2",
messages=[
{"role": "user", "content": "Hello, please introduce yourself"},
],
)
print(response.choices[0].message.content)

Node.js
import OpenAI from 'openai';

const client = new OpenAI({
apiKey: 'YOUR_API_KEY',
baseURL: 'https://tokenhub-intl.tencentcloudmaas.com/v1',
});

const response = await client.chat.completions.create({
model: 'deepseek-v3.2',
messages: [
{ role: 'user', content: 'Hello, please introduce yourself' },
],
});
console.log(response.choices[0].message.content);

Java
// Using the OpenAI-compatible protocol, call the HTTP API directly with OkHttp
import okhttp3.*;
import com.google.gson.Gson;
import java.util.*;

public class BasicChat {
public static void main(String[] args) throws Exception {
Map<String, Object> body = new HashMap<>();
body.put("model", "deepseek-v3.2");
body.put("messages", Arrays.asList(
Map.of("role", "user", "content", "Hello, please introduce yourself")
));

RequestBody requestBody = RequestBody.create(
new Gson().toJson(body),
MediaType.parse("application/json")
);

Request request = new Request.Builder()
.url("https://tokenhub-intl.tencentcloudmaas.com/v1/chat/completions")
.header("Authorization", "Bearer YOUR_API_KEY")
.post(requestBody)
.build();

try (Response response = new OkHttpClient().newCall(request).execute()) {
System.out.println(response.body().string());
}
}
}

Go
package main

import (
"bytes"
"encoding/json"
"fmt"
"io"
"net/http"
)

func main() {
body := map[string]interface{}{
"model": "deepseek-v3.2",
"messages": []map[string]string{
{"role": "user", "content": "Hello, please introduce yourself"},
},
}
payload, _ := json.Marshal(body)

req, _ := http.NewRequest("POST",
"https://tokenhub-intl.tencentcloudmaas.com/v1/chat/completions",
bytes.NewBuffer(payload))
req.Header.Set("Authorization", "Bearer YOUR_API_KEY")
req.Header.Set("Content-Type", "application/json")

resp, err := http.DefaultClient.Do(req)
if err != nil {
panic(err)
}
defer resp.Body.Close()

data, _ := io.ReadAll(resp.Body)
fmt.Println(string(data))
}
Response:
{
"id": "5e9c7ae9-e0e4-4ec1-bbd0-22bcfda61e45",
"object": "chat.completion",
"model": "deepseek-v3.2",
"choices": [
{
"index": 0,
"message": {
"role": "assistant",
"content": "Hello! Nice to meet you! 😊\\n\\nI am DeepSeek, an AI assistant created by DeepSeek Company. Let me briefly introduce myself:\\n\\n**My Features:**\\n- 📚 My knowledge is up to date as of July 2024, and I am the latest version of the DeepSeek model.\\n- 💬 I am a pure text conversation model, focused on understanding and generating textual content.\\n- 📁 I support file uploads—I can process images, txt, pdf, ppt, word, excel, and other files, and read text information from them.\\n- 🌐 I support web search (you need to manually enable it in the Web/App).\\n- 💾 I have a 128K context length, allowing me to remember our longer conversations.\\n\\n**What I can do for you:**\\n- Answer various questions and engage in in-depth discussions.\\n- Assist with writing, translation, and analysis.\\n- Process uploaded document content.\\n- Provide suggestions for learning, work, and life.\\n\\n**Important Notes:**\\n- I am completely free to use, with no paid plans.\\n- I currently do not support voice features.\\n- You can download the App from official app stores.\\n\\nMy response style is warm and detailed, and I hope to provide you with a pleasant communication experience! If you have anything to talk about or need help with, just let me know! ✨"
},
"finish_reason": "stop"
}
],
"usage": {
"prompt_tokens": 10,
"completion_tokens": 244,
"total_tokens": 254,
"prompt_tokens_details": {
"cached_tokens": 0
},
"completion_tokens_details": {
"reasoning_tokens": 0
}
}
}

Example: Streaming Output

Note:
Replace YOUR_API_KEY with the API Key you created, and replace model with the service ID you want to try.
cURL
curl -X POST 'https://tokenhub-intl.tencentcloudmaas.com/v1/chat/completions' \
  -H 'Authorization: Bearer YOUR_API_KEY' \
  -H 'Content-Type: application/json' \
  -d '{
    "model": "deepseek-v3.2",
    "messages": [
      {"role": "system", "content": "You are a helpful AI assistant."},
      {"role": "user", "content": "Calculate 1+1"}
    ],
    "stream": true
  }'

Python
from openai import OpenAI

client = OpenAI(
    api_key="YOUR_API_KEY",
    base_url="https://tokenhub-intl.tencentcloudmaas.com/v1",
)

stream = client.chat.completions.create(
    model="deepseek-v3.2",
    messages=[
        {"role": "system", "content": "You are a helpful AI assistant."},
        {"role": "user", "content": "Calculate 1+1"},
    ],
    stream=True,
)
# Print each content delta as it arrives; skip chunks with no choices or no content.
for chunk in stream:
    if chunk.choices and chunk.choices[0].delta.content:
        print(chunk.choices[0].delta.content, end="", flush=True)

Node.js
import OpenAI from 'openai';

const client = new OpenAI({
apiKey: 'YOUR_API_KEY',
baseURL: 'https://tokenhub-intl.tencentcloudmaas.com/v1',
});

const stream = await client.chat.completions.create({
model: 'deepseek-v3.2',
messages: [
{ role: 'system', content: 'You are a helpful AI assistant.' },
{ role: 'user', content: 'Calculate 1+1' },
],
stream: true,
});

for await (const chunk of stream) {
process.stdout.write(chunk.choices[0]?.delta?.content || '');
}

Java
// For streaming calls based on SSE, use OkHttp to receive line-by-line responses.
import okhttp3.*;
import okhttp3.sse.*;
import com.google.gson.Gson;
import java.util.*;

public class Streaming {
public static void main(String[] args) {
Map<String, Object> body = new HashMap<>();
body.put("model", "deepseek-v3.2");
body.put("messages", Arrays.asList(
Map.of("role", "system", "content", "You are a helpful AI assistant."),
Map.of("role", "user", "content", "Calculate 1+1")
));
body.put("stream", true);

Request request = new Request.Builder()
.url("https://tokenhub-intl.tencentcloudmaas.com/v1/chat/completions")
.header("Authorization", "Bearer YOUR_API_KEY")
.post(RequestBody.create(new Gson().toJson(body), MediaType.parse("application/json")))
.build();

EventSources.createFactory(new OkHttpClient()).newEventSource(request,
new EventSourceListener() {
@Override public void onEvent(EventSource es, String id, String type, String data) {
if (!"[DONE]".equals(data)) System.out.print(data);
}
});
}
}

Go
package main

import (
    "bufio"
    "bytes"
    "encoding/json"
    "fmt"
    "net/http"
    "strings"
)

func main() {
    body, _ := json.Marshal(map[string]interface{}{
        "model": "deepseek-v3.2",
        "messages": []map[string]string{
            {"role": "system", "content": "You are a helpful AI assistant."},
            {"role": "user", "content": "Calculate 1+1"},
        },
        "stream": true,
    })

    req, _ := http.NewRequest("POST",
        "https://tokenhub-intl.tencentcloudmaas.com/v1/chat/completions",
        bytes.NewBuffer(body))
    req.Header.Set("Authorization", "Bearer YOUR_API_KEY")
    req.Header.Set("Content-Type", "application/json")

    // Check the error before using resp; resp is nil when the request fails.
    resp, err := http.DefaultClient.Do(req)
    if err != nil {
        panic(err)
    }
    defer resp.Body.Close()

    // Each SSE event arrives as a "data: ..." line; print the JSON payload of every chunk.
    scanner := bufio.NewScanner(resp.Body)
    for scanner.Scan() {
        line := scanner.Text()
        if strings.HasPrefix(line, "data: ") && line != "data: [DONE]" {
            fmt.Println(strings.TrimPrefix(line, "data: "))
        }
    }
}
Streaming responses use the Server-Sent Events (SSE) format:
data: {"id":"chatcmpl-abc123","choices":[{"index":0,"delta":{"role":"assistant","content":"1"},"finish_reason":null}]}

data: {"id":"chatcmpl-abc123","choices":[{"index":0,"delta":{"content":"+"},"finish_reason":null}]}

data: {"id":"chatcmpl-abc123","choices":[{"index":0,"delta":{"content":"1"},"finish_reason":null}]}

data: {"id":"chatcmpl-abc123","choices":[{"index":0,"delta":{"content":"="},"finish_reason":null}]}

data: {"id":"chatcmpl-abc123","choices":[{"index":0,"delta":{"content":"2"},"finish_reason":null}]}

data: {"id":"chatcmpl-abc123","choices":[{"index":0,"delta":{},"finish_reason":"stop"}]}

data: [DONE]
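
For clients that read the raw SSE stream rather than using an SDK, the chunks can be reassembled as sketched below. The function takes an iterable of text lines so that it does not depend on a particular HTTP library; `collect_stream_text` is a hypothetical name, and the sketch follows only the first candidate (index 0).

```python
import json
from typing import Iterable, Optional


def collect_stream_text(lines: Iterable[str]) -> tuple[str, Optional[str]]:
    """Join delta.content from SSE lines and return (text, finish_reason)."""
    parts = []
    finish_reason = None
    for raw in lines:
        line = raw.strip()
        if not line.startswith("data:"):
            continue  # skip blank separator lines and non-data fields
        payload = line[len("data:"):].strip()
        if payload == "[DONE]":
            break  # end-of-stream marker
        chunk = json.loads(payload)
        # A chunk with an empty choices list (for example, a usage-only chunk) adds no text.
        for choice in chunk.get("choices", []):
            if choice.get("index", 0) != 0:
                continue
            parts.append(choice.get("delta", {}).get("content") or "")
            finish_reason = choice.get("finish_reason") or finish_reason
    return "".join(parts), finish_reason
```

With the `requests` library, for example, `response.iter_lines(decode_unicode=True)` on a streamed response could supply the lines.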

Example: System Prompt

Note:
Replace YOUR_API_KEY with the API Key you created, and replace model with the service ID you want to try.
cURL
curl -X POST 'https://tokenhub-intl.tencentcloudmaas.com/v1/chat/completions' \
  -H 'Authorization: Bearer YOUR_API_KEY' \
  -H 'Content-Type: application/json' \
  -d '{
    "model": "deepseek-v3.2",
    "messages": [
      {"role": "system", "content": "You are a professional English translation assistant. Translate user-input Chinese into English, and translate English into Chinese. Return only the translation result, without any explanation."},
      {"role": "user", "content": "The weather is really nice today."}
    ]
  }'

Python
from openai import OpenAI

client = OpenAI(
api_key="YOUR_API_KEY",
base_url="https://tokenhub-intl.tencentcloudmaas.com/v1",
)

response = client.chat.completions.create(
model="deepseek-v3.2",
messages=[
{"role": "system", "content": "You are a professional English translation assistant. Translate user-input Chinese into English, and translate English into Chinese. Return only the translation result, without any explanation."},
{"role": "user", "content": "The weather is really nice today."}
],
)
print(response.choices[0].message.content)

Node.js
import OpenAI from 'openai';

const client = new OpenAI({
apiKey: 'YOUR_API_KEY',
baseURL: 'https://tokenhub-intl.tencentcloudmaas.com/v1',
});

const response = await client.chat.completions.create({
model: 'deepseek-v3.2',
messages: [
{ role: 'system', content: 'You are a professional English translation assistant. Translate user-input Chinese into English, and translate English into Chinese. Return only the translation result, without any explanation.' },
{ role: 'user', content: 'The weather is really nice today.' },
],
});
console.log(response.choices[0].message.content);

Java
import okhttp3.*;
import com.google.gson.Gson;
import java.util.*;

public class SystemPrompt {
public static void main(String[] args) throws Exception {
Map<String, Object> body = new HashMap<>();
body.put("model", "deepseek-v3.2");
body.put("messages", Arrays.asList(
Map.of("role", "system", "content", "You are a professional English translation assistant. Translate user-input Chinese into English, and translate English into Chinese. Return only the translation result, without any explanation."),
Map.of("role", "user", "content", "The weather is really nice today.")
));

Request request = new Request.Builder()
.url("https://tokenhub-intl.tencentcloudmaas.com/v1/chat/completions")
.header("Authorization", "Bearer YOUR_API_KEY")
.post(RequestBody.create(new Gson().toJson(body), MediaType.parse("application/json")))
.build();

try (Response response = new OkHttpClient().newCall(request).execute()) {
System.out.println(response.body().string());
}
}
}

Go
package main

import (
    "bytes"
    "encoding/json"
    "fmt"
    "io"
    "net/http"
)

func main() {
    body, _ := json.Marshal(map[string]interface{}{
        "model": "deepseek-v3.2",
        "messages": []map[string]string{
            {"role": "system", "content": "You are a professional English translation assistant. Translate user-input Chinese into English, and translate English into Chinese. Return only the translation result, without any explanation."},
            {"role": "user", "content": "The weather is really nice today."},
        },
    })

    req, _ := http.NewRequest("POST",
        "https://tokenhub-intl.tencentcloudmaas.com/v1/chat/completions",
        bytes.NewBuffer(body))
    req.Header.Set("Authorization", "Bearer YOUR_API_KEY")
    req.Header.Set("Content-Type", "application/json")

    // Check the error before using resp; resp is nil when the request fails.
    resp, err := http.DefaultClient.Do(req)
    if err != nil {
        panic(err)
    }
    defer resp.Body.Close()
    data, _ := io.ReadAll(resp.Body)
    fmt.Println(string(data))
}
Response:
{
"id": "5d42fea3-413e-42ce-99b2-0d1595dae996",
"object": "chat.completion",
"model": "deepseek-v3.2",
"choices": [
{
"index": 0,
"message": {
"role": "assistant",
"content": "The weather is really nice today."
},
"finish_reason": "stop"
}
],
"usage": {
"prompt_tokens": 38,
"completion_tokens": 7,
"total_tokens": 45,
"prompt_tokens_details": {
"cached_tokens": 0
},
"completion_tokens_details": {
"reasoning_tokens": 0
}
}
}

Example: Multi-Turn Conversation

Note:
Replace YOUR_API_KEY with the API Key you created, and replace model with the service ID you want to try.
cURL
curl -X POST 'https://tokenhub-intl.tencentcloudmaas.com/v1/chat/completions' \
  -H 'Authorization: Bearer YOUR_API_KEY' \
  -H 'Content-Type: application/json' \
  -d '{
    "model": "deepseek-v3.2",
    "messages": [
      {"role": "user", "content": "Please introduce quantum computing."},
      {"role": "assistant", "content": "Quantum computing is a computational approach that leverages the principles of quantum mechanics for information processing..."},
      {"role": "user", "content": "What are the differences between it and traditional computing?"}
    ]
  }'

Python
from openai import OpenAI

client = OpenAI(
    api_key="YOUR_API_KEY",
    base_url="https://tokenhub-intl.tencentcloudmaas.com/v1",
)

# Resend the earlier user and assistant turns so the model has the conversation context.
response = client.chat.completions.create(
    model="deepseek-v3.2",
    messages=[
        {"role": "user", "content": "Please introduce quantum computing."},
        {"role": "assistant", "content": "Quantum computing is a computational approach that leverages the principles of quantum mechanics for information processing..."},
        {"role": "user", "content": "What are the differences between it and traditional computing?"},
    ],
)
print(response.choices[0].message.content)

Node.js
import OpenAI from 'openai';

const client = new OpenAI({
  apiKey: 'YOUR_API_KEY',
  baseURL: 'https://tokenhub-intl.tencentcloudmaas.com/v1',
});

// Resend the earlier user and assistant turns so the model has the conversation context.
const response = await client.chat.completions.create({
  model: 'deepseek-v3.2',
  messages: [
    { role: 'user', content: 'Please introduce quantum computing.' },
    { role: 'assistant', content: 'Quantum computing is a computational approach that leverages the principles of quantum mechanics for information processing...' },
    { role: 'user', content: 'What are the differences between it and traditional computing?' },
  ],
});
console.log(response.choices[0].message.content);

Java
import okhttp3.*;
import com.google.gson.Gson;
import java.util.*;

public class MultiTurn {
public static void main(String[] args) throws Exception {
Map<String, Object> body = new HashMap<>();
body.put("model", "deepseek-v3.2");
body.put("messages", Arrays.asList(
Map.of("role", "user", "content", "Please introduce quantum computing."),
Map.of("role", "assistant", "content", "Quantum computing is a computational approach that leverages the principles of quantum mechanics for information processing..."),
Map.of("role", "user", "content", "What are the differences between it and traditional computing?")
));

Request request = new Request.Builder()
.url("https://tokenhub-intl.tencentcloudmaas.com/v1/chat/completions")
.header("Authorization", "Bearer YOUR_API_KEY")
.post(RequestBody.create(new Gson().toJson(body), MediaType.parse("application/json")))
.build();

try (Response response = new OkHttpClient().newCall(request).execute()) {
System.out.println(response.body().string());
}
}
}

Go
package main

import (
    "bytes"
    "encoding/json"
    "fmt"
    "io"
    "net/http"
)

func main() {
    body, _ := json.Marshal(map[string]interface{}{
        "model": "deepseek-v3.2",
        "messages": []map[string]string{
            {"role": "user", "content": "Please introduce quantum computing."},
            {"role": "assistant", "content": "Quantum computing is a computational approach that leverages the principles of quantum mechanics for information processing..."},
            {"role": "user", "content": "What are the differences between it and traditional computing?"},
        },
    })

    req, _ := http.NewRequest("POST",
        "https://tokenhub-intl.tencentcloudmaas.com/v1/chat/completions",
        bytes.NewBuffer(body))
    req.Header.Set("Authorization", "Bearer YOUR_API_KEY")
    req.Header.Set("Content-Type", "application/json")

    // Check the error before using resp; resp is nil when the request fails.
    resp, err := http.DefaultClient.Do(req)
    if err != nil {
        panic(err)
    }
    defer resp.Body.Close()
    data, _ := io.ReadAll(resp.Body)
    fmt.Println(string(data))
}
Response:
{
"id": "fda59c08-6a85-4514-bdbf-d77a8d68e018",
"object": "chat.completion",
"model": "deepseek-v3.2",
"choices": [
{
"index": 0,
"message": {
"role": "assistant",
"content": "Good, this is a very core question. The fundamental difference between quantum computing and traditional computing lies in their basic units of information processing and their working principles.\\n\\nWe can start with a classic analogy:\\n\\n* **A traditional computer** is like a **librarian** (CPU) running down a long corridor (bus) in a vast **library**. The librarian can only open one room (memory address) at a time, check one book (one bit of data), and then make a decision.\\n* **A quantum computer**, on the other hand, is like having **all librarians** (qubits) **enter all rooms** simultaneously and read **every possible combination of all books** in an instant, then tell you the final result.\\n\\nBelow, we provide a detailed comparison from several key dimensions:\\n\\n### 1. Basic Information Unit: Bit vs. Qubit\\n\\n| Feature | Traditional Computing (Bit) | Quantum Computing (Qubit) |\\n| :--- | :--- | :--- |\\n| **State** | **Binary**: Can only be **0** or **1**. Like a light switch, it's either on or off. Very definite. | **Superposition**: Can be **both** 0 and 1 simultaneously, or any probabilistic combination of 0 and 1. Like a \\"quantum light\\" that is both on and off at the same time. |\\n| **Representation** | A definite, discrete value. | A state vector, represented in Dirac notation as: \\\\|ψ⟩ = α\\\\|0⟩ + β\\\\|1⟩, where α and β are complex numbers, and \\\\|α\\\\|² + \\\\|β\\\\|² = 1. |\\n| **Core Difference** | **Deterministic**: Each bit has a definite value at any given moment. | **Probabilistic**: When a qubit is measured, it collapses to 0 with probability \\\\|α\\\\|² and to 1 with probability \\\\|β\\\\|². |\\n\\n### 2. Working Principle: Logic Gates vs. Quantum Properties\\n\\n| Feature | Traditional Computing | Quantum Computing |\\n| :--- | :--- | :--- |\\n| **Operation Method** | Uses **logic gates** (e.g., AND, OR, NOT) to operate on bits. An operation changes the state of one or a group of bits. 
| Uses **quantum logic gates** to operate on qubits. These operations are **reversible** and can leverage superposition for **parallel computation**. |\\n| **Core Advantage** | **Serial Processing**: Tasks are broken down into a series of steps executed sequentially. Highly efficient for simple, logically clear tasks. | **Quantum Parallelism**: Because qubits are in superposition, a single quantum operation can **act on all possible inputs simultaneously**. This is the source of quantum speedup. |\\n| **Unique Phenomenon** | None | **Quantum Entanglement**: Two or more qubits can form a mysterious correlation. Regardless of distance, measuring one qubit instantly determines the state of the other(s). This allows a quantum computer to tightly link the states of different qubits for highly collaborative computation. |\\n\\n### 3. Performance and Applicable Domains\\n\\n| Feature | Traditional Computing | Quantum Computing |\\n| :--- | :--- | :--- |\\n| **Strong Suit** | - **General-purpose computing**: Office software, web browsing, games<br>- **Logic control**: Operating systems, application logic<br>- **Most data processing**: DMC, spreadsheets | - **Exponential speedup in specific domains**:<br> - **Cryptography**: Breaking encryption algorithms like RSA (Shor's algorithm)<br> - **Material simulation**: Precisely simulating the quantum properties of molecules and materials<br> - **Optimization problems**: Logistics route planning, financial portfolio optimization<br> - **Artificial intelligence**: Accelerating machine learning training |\\n| **Computational Complexity** | For certain complex problems (e.g., large number factorization), traditional algorithms require **exponentially** increasing time. | For specific problems, quantum algorithms can reduce complexity to the **polynomial** level, achieving \\"quantum supremacy.\\" |\\n| **Output Result** | Precise, deterministic results. | Typically **probabilistic** results. 
Because measurement is required, we obtain a potentially correct answer, so algorithms often need to run multiple times to increase confidence. |\\n\\n### 4. Physical Implementation and Challenges\\n\\n| Feature | Traditional Computer | Quantum Computer |\\n| :--- | :--- | :--- |\\n| **Hardware Foundation** | Based on **transistors** (semiconductors), mature technology, allowing for large-scale integration (e.g., CPUs with billions of transistors). | Requires physical systems that can maintain quantum states, such as superconducting circuits, ion traps, photonic qubits. Technology is still in its early stages. |\\n| **Main Challenge** | Power consumption, heat dissipation, transistor size approaching physical limits (Moore's Law slowing). | **Quantum Decoherence**: Quantum states are extremely fragile and easily lose their quantum properties due to environmental interference (e.g., heat, vibration). Requires extremely low temperatures (near absolute zero) and highly isolated environments. |\\n| **Error Correction** | Very low error rates, relatively simple error correction (e.g., parity check). | High error rates, requiring complex **quantum error-correcting codes** that use multiple physical qubits to encode one logical qubit, incurring significant overhead. 
|\\n\\n### Summary Table\\n\\n| Comparison Dimension | Traditional Computing | Quantum Computing |\\n| :--- | :--- | :--- |\\n| **Basic Unit** | Bit (0 or 1) | Qubit (Superposition: Superposition of 0 and 1) |\\n| **Operation Method** | Logic gates (serial) | Quantum gates (parallel) |\\n| **Core Principle** | Boolean logic | Superposition, entanglement, interference |\\n| **Result Output** | Deterministic | Probabilistic |\\n| **Strong Suit** | General tasks, logic control | Specific complex problems (e.g., simulation, optimization, cryptanalysis) |\\n| **Technology Maturity** | Very mature, widely used | Early stage, primarily used for research and specific computations |\\n| **Relationship with Users** | **Complementary Relationship**: Quantum computers are **not** intended to replace your phone or laptop. They function more like a **specialized accelerator** for solving specific, intractable problems that traditional computers cannot solve in the foreseeable future. In the future, we might access quantum computers via the cloud to handle the most complex parts, while traditional computers handle daily tasks and user interaction. |\\n\\nIn simple terms, a traditional computer is a \\"precise sharpshooter,\\" while a quantum computer is a \\"prophet capable of exploring all possibilities simultaneously.\\" Each has its strengths, and they will work together for a long time to come."
},
"finish_reason": "stop"
}
],
"usage": {
"prompt_tokens": 32,
"completion_tokens": 1321,
"total_tokens": 1353,
"prompt_tokens_details": {
"cached_tokens": 0
},
"completion_tokens_details": {
"reasoning_tokens": 0
}
}
}
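
The multi-turn example resends the earlier user and assistant messages with each request. One way to manage that history is sketched below. `Conversation` is a hypothetical helper, and the `send` callable is an assumption that stands in for whatever function performs the actual API call and returns the assistant's text.

```python
from typing import Callable


class Conversation:
    """Keeps the message history that is resent on every request."""

    def __init__(self, send: Callable[[list[dict]], str], system_prompt: str = ""):
        self._send = send
        self.messages: list[dict] = []
        if system_prompt:
            self.messages.append({"role": "system", "content": system_prompt})

    def ask(self, user_text: str) -> str:
        # Append the new user turn, send the full history, then record the reply.
        self.messages.append({"role": "user", "content": user_text})
        reply = self._send(list(self.messages))
        self.messages.append({"role": "assistant", "content": reply})
        return reply
```

With the OpenAI SDK client from the examples above, `send` could be `lambda msgs: client.chat.completions.create(model="deepseek-v3.2", messages=msgs).choices[0].message.content`. Passing the call in as a parameter keeps the history logic testable without network access.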

Example: Function Calling (Tool Calling)

Note:
Replace YOUR_API_KEY with the API Key you created, and replace model with the service ID you want to try. For tool calls in thinking mode, include the reasoning_content from previous turns in every request to obtain the best results. For details, see Interleaved Thinking.
cURL
curl -X POST 'https://tokenhub-intl.tencentcloudmaas.com/v1/chat/completions' \
  -H 'Authorization: Bearer YOUR_API_KEY' \
  -H 'Content-Type: application/json' \
  -d '{
    "model": "deepseek-v3.2",
    "messages": [
      {"role": "user", "content": "What is the weather like in Beijing today?"}
    ],
    "tools": [
      {
        "type": "function",
        "function": {
          "name": "get_weather",
          "description": "Obtain weather information for a specified city",
          "parameters": {
            "type": "object",
            "properties": {
              "city": {"type": "string", "description": "City name, such as: Beijing"}
            },
            "required": ["city"]
          }
        }
      }
    ],
    "tool_choice": "auto"
  }'
Python
from openai import OpenAI

client = OpenAI(
    api_key="YOUR_API_KEY",
    base_url="https://tokenhub-intl.tencentcloudmaas.com/v1",
)

tools = [{
    "type": "function",
    "function": {
        "name": "get_weather",
        "description": "Obtain weather information for a specified city",
        "parameters": {
            "type": "object",
            "properties": {"city": {"type": "string", "description": "City name, such as: Beijing"}},
            "required": ["city"],
        },
    },
}]

response = client.chat.completions.create(
    model="deepseek-v3.2",
    messages=[{"role": "user", "content": "What is the weather like in Beijing today?"}],
    tools=tools,
    tool_choice="auto",
)
print(response.choices[0].message)
Node.js
import OpenAI from 'openai';

const client = new OpenAI({
  apiKey: 'YOUR_API_KEY',
  baseURL: 'https://tokenhub-intl.tencentcloudmaas.com/v1',
});

const tools = [{
  type: 'function',
  function: {
    name: 'get_weather',
    description: 'Obtain weather information for a specified city',
    parameters: {
      type: 'object',
      properties: { city: { type: 'string', description: 'City name, such as: Beijing' } },
      required: ['city'],
    },
  },
}];

const response = await client.chat.completions.create({
  model: 'deepseek-v3.2',
  messages: [{ role: 'user', content: 'What is the weather like in Beijing today?' }],
  tools,
  tool_choice: 'auto',
});
console.log(response.choices[0].message);
Java
import okhttp3.*;
import com.google.gson.Gson;
import java.util.*;

public class FunctionCalling {
    public static void main(String[] args) throws Exception {
        Map<String, Object> tool = Map.of(
            "type", "function",
            "function", Map.of(
                "name", "get_weather",
                "description", "Obtain weather information for a specified city",
                "parameters", Map.of(
                    "type", "object",
                    "properties", Map.of("city", Map.of("type", "string", "description", "City name, such as: Beijing")),
                    "required", List.of("city")
                )
            )
        );

        Map<String, Object> body = new HashMap<>();
        body.put("model", "deepseek-v3.2");
        body.put("messages", List.of(Map.of("role", "user", "content", "What is the weather like in Beijing today?")));
        body.put("tools", List.of(tool));
        body.put("tool_choice", "auto");

        Request request = new Request.Builder()
            .url("https://tokenhub-intl.tencentcloudmaas.com/v1/chat/completions")
            .header("Authorization", "Bearer YOUR_API_KEY")
            .post(RequestBody.create(new Gson().toJson(body), MediaType.parse("application/json")))
            .build();

        try (Response response = new OkHttpClient().newCall(request).execute()) {
            System.out.println(response.body().string());
        }
    }
}
Go
package main

import (
	"bytes"
	"encoding/json"
	"fmt"
	"io"
	"log"
	"net/http"
)

func main() {
	// Tool definition in the OpenAI function-calling format.
	tool := map[string]interface{}{
		"type": "function",
		"function": map[string]interface{}{
			"name":        "get_weather",
			"description": "Obtain weather information for a specified city",
			"parameters": map[string]interface{}{
				"type": "object",
				"properties": map[string]interface{}{
					"city": map[string]interface{}{"type": "string", "description": "City name, such as: Beijing"},
				},
				"required": []string{"city"},
			},
		},
	}

	body, err := json.Marshal(map[string]interface{}{
		"model":       "deepseek-v3.2",
		"messages":    []map[string]string{{"role": "user", "content": "What is the weather like in Beijing today?"}},
		"tools":       []map[string]interface{}{tool},
		"tool_choice": "auto",
	})
	if err != nil {
		log.Fatal(err)
	}

	req, err := http.NewRequest("POST",
		"https://tokenhub-intl.tencentcloudmaas.com/v1/chat/completions",
		bytes.NewBuffer(body))
	if err != nil {
		log.Fatal(err)
	}
	req.Header.Set("Authorization", "Bearer YOUR_API_KEY")
	req.Header.Set("Content-Type", "application/json")

	// Check the error before touching resp; resp is nil when the request fails.
	resp, err := http.DefaultClient.Do(req)
	if err != nil {
		log.Fatal(err)
	}
	defer resp.Body.Close()

	data, err := io.ReadAll(resp.Body)
	if err != nil {
		log.Fatal(err)
	}
	fmt.Println(string(data))
}
When the model decides to call a tool, it returns:
{
  "choices": [
    {
      "message": {
        "role": "assistant",
        "content": null,
        "tool_calls": [
          {
            "id": "call_abc123",
            "type": "function",
            "function": {
              "name": "get_weather",
              "arguments": "{\"city\": \"Beijing\"}"
            }
          }
        ]
      },
      "finish_reason": "tool_calls"
    }
  ]
}
Return the tool execution result to the model and continue the conversation:
{
  "model": "deepseek-v3.2",
  "messages": [
    {"role": "user", "content": "What is the weather like in Beijing today?"},
    {"role": "assistant", "content": null, "tool_calls": [{"id": "call_abc123", "type": "function", "function": {"name": "get_weather", "arguments": "{\"city\": \"Beijing\"}"}}]},
    {"role": "tool", "tool_call_id": "call_abc123", "content": "{\"temperature\": 22, \"weather\": \"Sunny\", \"humidity\": 45}"}
  ]
}
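The round trip above (receive tool_calls, run the tool locally, send the result back as a tool message) can be sketched in Python as follows. This is an illustration under stated assumptions, not the platform's reference implementation: get_weather here is a hypothetical local stub that returns fixed data, and TOOL_REGISTRY and run_tool_calls are names introduced only for this example.

```python
import json


def get_weather(city: str) -> dict:
    """Hypothetical local tool; a real one would query a weather service."""
    return {"temperature": 22, "weather": "Sunny", "humidity": 45}


# Maps the tool names declared in `tools` to local callables (assumption).
TOOL_REGISTRY = {"get_weather": get_weather}


def run_tool_calls(messages: list, assistant_message: dict) -> list:
    """Return a new message list with the assistant turn and one tool
    message per requested tool call appended."""
    updated = list(messages) + [assistant_message]
    for call in assistant_message.get("tool_calls") or []:
        func = call["function"]
        # `arguments` arrives as a JSON-encoded string, not as an object.
        args = json.loads(func["arguments"])
        result = TOOL_REGISTRY[func["name"]](**args)
        updated.append({
            "role": "tool",
            "tool_call_id": call["id"],  # must match the id of the tool call
            "content": json.dumps(result),
        })
    return updated


# Assistant message copied from the sample response above.
assistant_message = {
    "role": "assistant",
    "content": None,
    "tool_calls": [{
        "id": "call_abc123",
        "type": "function",
        "function": {"name": "get_weather", "arguments": '{"city": "Beijing"}'},
    }],
}
history = [{"role": "user", "content": "What is the weather like in Beijing today?"}]

followup = run_tool_calls(history, assistant_message)
print(json.dumps(followup[-1]))
```

The resulting followup list can be sent as messages in the next chat completions request so that the model can produce its final answer.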

Using the Anthropic API

Base URL

Guangzhou: https://tokenhub.tencentcloudmaas.com
Singapore: https://tokenhub-intl.tencentcloudmaas.com

HTTP Headers

| Field | Support Status | Description |
| :--- | :--- | :--- |
| anthropic-beta | Ignored | The header is ignored. |
| anthropic-version | Ignored | The header is ignored. |
| x-api-key | Fully supported | Used for authentication. |

Request Parameters

The following table lists the Anthropic Messages API request fields and whether the TokenHub gateway supports them. For complete field definitions and the latest updates, see the official Anthropic API documentation.
| Field | Support Status | Description |
| :--- | :--- | :--- |
| model | Supported | Set to the Model (API Parameter) value from the model list. |
| max_tokens | Fully supported | Maximum number of output tokens |
| container | Ignored | This field is ignored. |
| mcp_servers | Ignored | This field is ignored. |
| metadata | Ignored | This field is ignored. |
| service_tier | Ignored | This field is ignored. |
| stop_sequences | Fully supported | Stop sequences |
| stream | Fully supported | Streaming response |
| system | Fully supported | System message |
| temperature | Fully supported | Temperature parameter (0.0-2.0) |
| thinking | Ignored | This field is ignored. |
| top_k | Ignored | This field is ignored. |
| top_p | Fully supported | Top-p sampling |
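Because the gateway ignores some request fields without reporting an error, a setting such as thinking or top_k can silently have no effect. The Python sketch below shows one way to surface that on the client side before sending a request. The field names and the temperature range come from the table above; split_ignored and check_temperature are hypothetical helper names, and removing the fields client-side is optional since the gateway ignores them anyway.

```python
# Top-level request fields that the table above marks as "Ignored".
IGNORED_FIELDS = {"container", "mcp_servers", "metadata", "service_tier", "thinking", "top_k"}


def split_ignored(payload: dict):
    """Return (payload without ignored fields, sorted names of dropped fields)."""
    kept = {key: value for key, value in payload.items() if key not in IGNORED_FIELDS}
    dropped = sorted(key for key in payload if key in IGNORED_FIELDS)
    return kept, dropped


def check_temperature(payload: dict) -> None:
    """Raise ValueError if temperature is outside the documented 0.0-2.0 range."""
    temperature = payload.get("temperature")
    if temperature is not None and not 0.0 <= temperature <= 2.0:
        raise ValueError(f"temperature {temperature} is outside 0.0-2.0")


payload = {
    "model": "deepseek-v3.2",
    "max_tokens": 1000,
    "temperature": 0.7,
    "top_k": 40,
    "thinking": {"type": "enabled", "budget_tokens": 1024},
    "messages": [{"role": "user", "content": "Hi, how are you?"}],
}

check_temperature(payload)
clean_payload, dropped = split_ignored(payload)
if dropped:
    print("These fields have no effect on this gateway:", ", ".join(dropped))
```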

Tool Support

tools

| Field | Support Status | Description |
| :--- | :--- | :--- |
| name | Fully supported | Tool name |
| input_schema | Fully supported | Input parameter schema |
| description | Fully supported | Tool description |
| cache_control | Ignored | This field is ignored. |
| tool_choice | Fully supported | String format |
| tool_choice | Fully supported | Object format |
| tool_choice.disable_parallel_tool_use | Ignored | This field is ignored. |

tool_choice

| Field | Support Status |
| :--- | :--- |
| none | Fully supported |
| auto | Fully supported |
| any | Fully supported |
| tool | Fully supported |
| disable_parallel_tool_use | Ignored |
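As an illustration of the fields listed above, the following Python sketch builds an Anthropic-style request body with one tool and a tool_choice in object format. It assumes the standard Anthropic Messages API shapes ({"type": "auto"}, {"type": "any"}, {"type": "none"}, and {"type": "tool", "name": ...}); make_tool_choice is a hypothetical helper, and the string format mentioned in the table is not shown because its exact form is not specified here.

```python
from typing import Optional

VALID_MODES = ("none", "auto", "any", "tool")


def make_tool_choice(mode: str, name: Optional[str] = None) -> dict:
    """Build a tool_choice object; mode 'tool' forces the named tool."""
    if mode not in VALID_MODES:
        raise ValueError(f"unknown tool_choice mode: {mode}")
    if mode == "tool":
        if not name:
            raise ValueError("mode 'tool' requires a tool name")
        return {"type": "tool", "name": name}
    return {"type": mode}


# Same weather tool as in the OpenAI example, expressed with input_schema.
weather_tool = {
    "name": "get_weather",
    "description": "Obtain weather information for a specified city",
    "input_schema": {
        "type": "object",
        "properties": {"city": {"type": "string", "description": "City name, such as: Beijing"}},
        "required": ["city"],
    },
}

request_body = {
    "model": "deepseek-v3.2",
    "max_tokens": 1000,
    "messages": [{"role": "user", "content": "What is the weather like in Beijing today?"}],
    "tools": [weather_tool],
    "tool_choice": make_tool_choice("tool", "get_weather"),
}
print(request_body["tool_choice"])
```

No disable_parallel_tool_use key is set, since the gateway ignores it.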

Message Field Support

| Field Type | Variant | Subfield | Support Status |
| :--- | :--- | :--- | :--- |
| content | string | - | Fully supported |
| content | array, type="text" | text | Fully supported |
| content | array, type="text" | cache_control | Ignored |
| content | array, type="text" | citations | Ignored |
| content | array, type="image" | - | Supported by some models. For details, refer to the invocation guide for each model. |
| content | array, type="document" | - | Not supported |
| content | array, type="search_result" | - | Not supported |
| content | array, type="thinking" | - | Ignored |
| content | array, type="redacted_thinking" | - | Not supported |
| content | array, type="tool_use" | id | Fully supported |
| content | array, type="tool_use" | input | Fully supported |
| content | array, type="tool_use" | name | Fully supported |
| content | array, type="tool_use" | cache_control | Ignored |
| content | array, type="tool_result" | tool_use_id | Fully supported |
| content | array, type="tool_result" | content | Fully supported |
| content | array, type="tool_result" | cache_control | Ignored |
| content | array, type="tool_result" | is_error | Ignored |
Note:
1. Ignored Fields: Certain Anthropic-specific fields are ignored, but no error is reported.
2. Parallel Tool Calls: The disable_parallel_tool_use parameter is ignored.
3. Cache Control: All cache_control-related fields are ignored.
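A client can check its messages against the support table before sending them. The Python sketch below flags content blocks whose type is listed as not supported or ignored. audit_messages is a hypothetical helper, and how the gateway actually responds to an unsupported block type is not stated here, so treat the check as a precaution rather than a description of gateway behavior.

```python
# Content block types taken from the message field support table.
UNSUPPORTED_TYPES = {"document", "search_result", "redacted_thinking"}
IGNORED_TYPES = {"thinking"}


def audit_messages(messages: list) -> dict:
    """Collect (message index, block index, type) for blocks that are
    not supported or ignored."""
    report = {"unsupported": [], "ignored": []}
    for m_idx, message in enumerate(messages):
        content = message.get("content")
        if isinstance(content, str):
            continue  # plain string content is fully supported
        for b_idx, block in enumerate(content or []):
            block_type = block.get("type")
            if block_type in UNSUPPORTED_TYPES:
                report["unsupported"].append((m_idx, b_idx, block_type))
            elif block_type in IGNORED_TYPES:
                report["ignored"].append((m_idx, b_idx, block_type))
    return report


messages = [
    {"role": "user", "content": "Hi, how are you?"},
    {"role": "user", "content": [
        {"type": "text", "text": "Summarize the attached file."},
        {"type": "document", "source": {"type": "text", "media_type": "text/plain", "data": "example"}},
    ]},
]

report = audit_messages(messages)
print(report)
```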

Sample Code

Note:
Replace YOUR_API_KEY with the API Key you created, and replace model with the service ID you want to try.
cURL
curl https://tokenhub-intl.tencentcloudmaas.com/v1/messages \
  -H "Content-Type: application/json" \
  -H "x-api-key: YOUR_API_KEY" \
  -d '{
    "model": "deepseek-v3.2",
    "max_tokens": 1000,
    "stream": true,
    "system": [
      {"type": "text", "text": "You are a helpful assistant."}
    ],
    "messages": [
      {"role": "user", "content": [{"type": "text", "text": "Hi, how are you?"}]}
    ]
  }'
Python
import anthropic

client = anthropic.Anthropic(
    api_key="YOUR_API_KEY",
    base_url="https://tokenhub-intl.tencentcloudmaas.com",
)

with client.messages.stream(
    model="deepseek-v3.2",
    max_tokens=1000,
    system="You are a helpful assistant.",
    messages=[{"role": "user", "content": "Hi, how are you?"}],
) as stream:
    for text in stream.text_stream:
        print(text, end="", flush=True)
Node.js
import Anthropic from '@anthropic-ai/sdk';

const client = new Anthropic({
  apiKey: 'YOUR_API_KEY',
  baseURL: 'https://tokenhub-intl.tencentcloudmaas.com',
});

const stream = await client.messages.stream({
  model: 'deepseek-v3.2',
  max_tokens: 1000,
  system: 'You are a helpful assistant.',
  messages: [{ role: 'user', content: 'Hi, how are you?' }],
});

for await (const event of stream) {
  if (event.type === 'content_block_delta' && event.delta.type === 'text_delta') {
    process.stdout.write(event.delta.text);
  }
}
Java
import okhttp3.*;
import okhttp3.sse.*;
import com.google.gson.Gson;
import java.util.*;

public class AnthropicCall {
    public static void main(String[] args) {
        Map<String, Object> body = new HashMap<>();
        body.put("model", "deepseek-v3.2");
        body.put("max_tokens", 1000);
        body.put("stream", true);
        body.put("system", List.of(Map.of("type", "text", "text", "You are a helpful assistant.")));
        body.put("messages", List.of(Map.of(
            "role", "user",
            "content", List.of(Map.of("type", "text", "text", "Hi, how are you?"))
        )));

        Request request = new Request.Builder()
            .url("https://tokenhub-intl.tencentcloudmaas.com/v1/messages")
            .header("x-api-key", "YOUR_API_KEY")
            .header("Content-Type", "application/json")
            .post(RequestBody.create(new Gson().toJson(body), MediaType.parse("application/json")))
            .build();

        EventSources.createFactory(new OkHttpClient()).newEventSource(request,
            new EventSourceListener() {
                @Override public void onEvent(EventSource es, String id, String type, String data) {
                    System.out.println(data);
                }
            });
    }
}
Go
package main

import (
	"bufio"
	"bytes"
	"encoding/json"
	"fmt"
	"log"
	"net/http"
	"strings"
)

func main() {
	body, err := json.Marshal(map[string]interface{}{
		"model":      "deepseek-v3.2",
		"max_tokens": 1000,
		"stream":     true,
		"system": []map[string]string{
			{"type": "text", "text": "You are a helpful assistant."},
		},
		"messages": []map[string]interface{}{
			{
				"role": "user",
				"content": []map[string]string{
					{"type": "text", "text": "Hi, how are you?"},
				},
			},
		},
	})
	if err != nil {
		log.Fatal(err)
	}

	req, err := http.NewRequest("POST",
		"https://tokenhub-intl.tencentcloudmaas.com/v1/messages",
		bytes.NewBuffer(body))
	if err != nil {
		log.Fatal(err)
	}
	req.Header.Set("x-api-key", "YOUR_API_KEY")
	req.Header.Set("Content-Type", "application/json")

	// Check the error before touching resp; resp is nil when the request fails.
	resp, err := http.DefaultClient.Do(req)
	if err != nil {
		log.Fatal(err)
	}
	defer resp.Body.Close()

	// Print the payload of each server-sent event as it arrives.
	scanner := bufio.NewScanner(resp.Body)
	for scanner.Scan() {
		line := scanner.Text()
		if strings.HasPrefix(line, "data: ") {
			fmt.Println(strings.TrimPrefix(line, "data: "))
		}
	}
	if err := scanner.Err(); err != nil {
		log.Fatal(err)
	}
}
Response:
event: content_block_start
data: {"content_block":{"text":"","type":"text"},"index":1,"type":"content_block_start"}

event: content_block_delta
data: {"delta":{"text":"Hey","type":"text_delta"},"index":0,"type":"content_block_delta"}

event: content_block_delta
data: {"delta":{"text":"! I'm doing well, thanks for asking! I'm","type":"text_delta"},"index":0,"type":"content_block_delta"}

event: content_block_delta
data: {"delta":{"text":" here and ready to help with whatever you need.","type":"text_delta"},"index":0,"type":"content_block_delta"}

event: content_block_delta
data: {"delta":{"text":" How are you doing today? Is there something I","type":"text_delta"},"index":0,"type":"content_block_delta"}

event: content_block_delta
data: {"delta":{"text":" can assist you with?","type":"text_delta"},"index":0,"type":"content_block_delta"}

event: content_block_stop
data: {"index":1,"type":"content_block_stop"}

event: message_delta
data: {"delta":{"stop_reason":"end_turn","stop_sequence":null},"type":"message_delta","usage":{"output_tokens":57}}

event: message_stop
data: {"type":"message_stop"}
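When you consume the stream without an SDK, the reply text has to be reassembled from the content_block_delta events. The Python sketch below parses server-sent event lines like the ones in the response above and concatenates the text_delta fragments. collect_text is a hypothetical helper, and the sample input is a shortened copy of the response.

```python
import json


def collect_text(sse_lines) -> str:
    """Concatenate the text of all text_delta events in an SSE line stream."""
    parts = []
    for line in sse_lines:
        if not line.startswith("data: "):
            continue  # skip "event:" lines and blank separators
        event = json.loads(line[len("data: "):])
        if event.get("type") != "content_block_delta":
            continue
        delta = event.get("delta", {})
        if delta.get("type") == "text_delta":
            parts.append(delta["text"])
    return "".join(parts)


# Shortened copy of the sample stream above.
SAMPLE = """event: content_block_delta
data: {"delta":{"text":"Hey","type":"text_delta"},"index":0,"type":"content_block_delta"}

event: content_block_delta
data: {"delta":{"text":"! I'm doing well, thanks for asking! I'm","type":"text_delta"},"index":0,"type":"content_block_delta"}

event: message_stop
data: {"type":"message_stop"}
"""

text = collect_text(SAMPLE.splitlines())
print(text)
```

The same function accepts any iterable of decoded lines, so it can be fed from a streaming HTTP response as the lines arrive.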

Integrating the Model with Claude Code

Installing Claude Code

To install or update Anthropic Claude Code, run the following command:
npm install -g @anthropic-ai/claude-code

Configuring Environment Variables

export ANTHROPIC_BASE_URL=https://tokenhub-intl.tencentcloudmaas.com
export ANTHROPIC_AUTH_TOKEN=${API_KEY}
export API_TIMEOUT_MS=600000
export ANTHROPIC_MODEL=${MODEL_NAME}
export CLAUDE_CODE_DISABLE_NONESSENTIAL_TRAFFIC=1
Note:
API_TIMEOUT_MS is configured to prevent the Claude Code client from timing out due to lengthy outputs. The timeout duration set here is 10 minutes, and users can adjust it as needed.
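Before launching Claude Code, it can help to confirm that the variables are set and that the timeout is the value you expect. The Python sketch below is one way to do that. Treating ANTHROPIC_BASE_URL, ANTHROPIC_AUTH_TOKEN, and ANTHROPIC_MODEL as required is an assumption of this example, and missing_vars and timeout_minutes are hypothetical helper names.

```python
# Variables this sketch treats as required (an assumption, see the lead-in).
REQUIRED_VARS = ("ANTHROPIC_BASE_URL", "ANTHROPIC_AUTH_TOKEN", "ANTHROPIC_MODEL")


def missing_vars(env) -> list:
    """Return the required variable names that are unset or empty."""
    return [name for name in REQUIRED_VARS if not env.get(name)]


def timeout_minutes(env, default_ms: int = 600000) -> float:
    """Convert API_TIMEOUT_MS (milliseconds) to minutes."""
    return int(env.get("API_TIMEOUT_MS", default_ms)) / 60000


# Example environment; pass os.environ instead to check the real one.
example_env = {
    "ANTHROPIC_BASE_URL": "https://tokenhub-intl.tencentcloudmaas.com",
    "API_TIMEOUT_MS": "600000",
}

print(missing_vars(example_env))
print(timeout_minutes(example_env))
```

With the example environment, the two model-related variables are reported as missing and the timeout works out to 10 minutes (600000 ms / 60000).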

Executing the claude Command

Go to your project directory and run the claude command to start using it.
cd my-project
claude

