Online Inference

Last updated: 2026-06-11 17:48:40

Feature Overview
The online inference service controls how a model is used, including free quota usage, whether pay-as-you-go billing by Token is enabled, security policies, and rate limiting policies.
Service Types
Online inference services are categorized into two types:
1. Default
By default, the platform automatically creates an online inference service for every supported model. To get started quickly, users can claim a free trial package in the Model Hub or on the model details page, or click Free Trial in the service list on the Online Inference page.
2. Custom
If you need to customize the billing policy for model services or create multiple services to track usage statistics and manage permissions by team, you can create custom inference services in Online Inference.
Custom inference services support a wider range of billing options, such as TPM Reservation. The platform also plans to support capabilities such as intelligent routing, rate limiting rules, and enabling or disabling plug-ins in custom services, for more flexible service management and governance.
Service Status
Each online inference service has a status, as described below:
Status	Description
Not Enabled	A default inference service is Not Enabled until the user starts using it. It changes to Running after the user begins the free trial.
Activating	When a service is enabled for the first time, it briefly enters the Activating state and is expected to change to Running within 5 seconds.
Running	The service is accessible.
Stopped	1. When the account has an overdue payment, pay-as-you-go services become Stopped; once the account balance is replenished, the service automatically returns to Running. 2. When the free quota is exhausted and postpaid is not enabled, or when the user manually disables postpaid for a service, the service becomes Stopped. To resume it, the user must manually enable postpaid on the Online Inference page.
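The status transitions described in the table can be sketched as a small state model. This is only an illustration of the rules above, not a platform API: the class name `ServiceModel`, the `StopReason` values, and all method names are hypothetical.

```python
from enum import Enum
from typing import Optional


class Status(Enum):
    NOT_ENABLED = "Not Enabled"
    ACTIVATING = "Activating"
    RUNNING = "Running"
    STOPPED = "Stopped"


class StopReason(Enum):
    OVERDUE = "overdue payment"
    NO_POSTPAID = "free quota exhausted or postpaid disabled"


class ServiceModel:
    """Toy model of one service's status (hypothetical, not a platform API)."""

    def __init__(self) -> None:
        self.status = Status.NOT_ENABLED
        self.stop_reason: Optional[StopReason] = None

    def enable(self) -> None:
        # First enablement passes through the brief Activating state.
        if self.status is Status.NOT_ENABLED:
            self.status = Status.ACTIVATING

    def activation_finished(self) -> None:
        if self.status is Status.ACTIVATING:
            self.status = Status.RUNNING

    def stop(self, reason: StopReason) -> None:
        if self.status is Status.RUNNING:
            self.status = Status.STOPPED
            self.stop_reason = reason

    def balance_replenished(self) -> None:
        # Only a stop caused by overdue payment recovers automatically.
        if self.status is Status.STOPPED and self.stop_reason is StopReason.OVERDUE:
            self._resume()

    def postpaid_enabled(self) -> None:
        # A quota or postpaid stop requires the user to enable postpaid manually.
        if self.status is Status.STOPPED and self.stop_reason is StopReason.NO_POSTPAID:
            self._resume()

    def _resume(self) -> None:
        self.status = Status.RUNNING
        self.stop_reason = None
```

The stop reason is tracked separately because the two Stopped cases recover differently: replenishing the balance does nothing for a service that stopped for lack of postpaid billing.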
Billing Mode
The billing mode indicates the payment status of the current service, as described below:
Billing Mode	Description
Free Trial	The service is using a free trial package. Usage within the free trial package is not billed.
Pay-as-you-go	The service has postpaid billing enabled and is billed by Token usage.
TPM Reservation	The service has TPM Reservation enabled. Traffic exceeding the reserved TPM limit is billed by Token.
None	When the free trial package is exhausted and postpaid billing is not enabled, the service has no billing mode and becomes Stopped.
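As a rough illustration of the TPM Reservation row, the sketch below computes how many tokens in a single minute fall outside the reservation and would therefore be billed by Token. It assumes TPM means tokens per minute and that usage is compared against the reservation one minute at a time; the page does not specify how the window is measured, and the function name `overflow_tokens` is hypothetical.

```python
def overflow_tokens(tokens_this_minute: int, reserved_tpm: int) -> int:
    """Tokens used in one minute beyond the reserved TPM (assumed billed by Token)."""
    if tokens_this_minute < 0 or reserved_tpm < 0:
        raise ValueError("token counts must be non-negative")
    # Usage within the reservation produces no overflow.
    return max(0, tokens_this_minute - reserved_tpm)
```

For example, with a reservation of 100,000 TPM and 120,000 tokens used in a minute, `overflow_tokens(120_000, 100_000)` returns `20000`.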
Note:
Users can enable postpaid before the free trial package is fully consumed. Once postpaid is enabled, both the Free Trial and Pay-as-you-go billing modes are displayed. The platform consumes the free trial quota first and begins billing by Token usage only after the free trial package is exhausted.
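The priority rule in this note can be sketched as follows: free trial quota is consumed first, and only the remainder is billed when postpaid is enabled. The names `Charge` and `settle` are hypothetical. How the platform handles a request that straddles the end of the quota while postpaid is off is not documented, so this sketch simply leaves those tokens unbilled and flags the service as stopping.

```python
from typing import NamedTuple


class Charge(NamedTuple):
    free_tokens: int      # covered by the free trial package, not billed
    billed_tokens: int    # billed by Token under postpaid
    remaining_quota: int  # free trial quota left afterwards
    service_stops: bool   # True when quota is gone and postpaid is off


def settle(tokens: int, free_quota: int, postpaid_enabled: bool) -> Charge:
    """Split usage between the free trial quota and postpaid billing."""
    free_used = min(tokens, free_quota)  # free quota is consumed first
    remaining = free_quota - free_used
    excess = tokens - free_used
    # Assumption: without postpaid, excess tokens are not billed in this sketch.
    billed = excess if postpaid_enabled else 0
    stops = remaining == 0 and not postpaid_enabled
    return Charge(free_used, billed, remaining, stops)
```

With 1,000 free tokens left and postpaid enabled, `settle(1500, 1000, True)` yields 1,000 free tokens and 500 billed tokens, and the service keeps running. With postpaid disabled, the same call reports `service_stops=True`, matching the None billing mode.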
