Which models does TokenHub support, and how are they billed?
Where can I view my usage and billing details?
You can view usage broken down by model, service, and API key on the Usage Statistics page of the TokenHub console in the large model service platform, or view billing details in the Tencent Cloud Cost Center.
What is an online inference service, and why should I invoke models through a service?
An online inference service manages how a model is used, including its billing method and rate limiting rules. Multiple online inference services can be created for the same model to suit different business scenarios, so a single model may be exposed through several services, and each call must specify the ID of the service it targets.
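As an illustration only, here is a minimal sketch of calling the same underlying model through two different services. It assumes the service ID is passed in the OpenAI-compatible `model` field; the endpoint URL, API key, and service IDs are placeholders, not documented platform values.

```python
# Minimal sketch: one model, two services, selected by service ID.
# All identifiers below are hypothetical placeholders.
from openai import OpenAI

client = OpenAI(
    base_url="https://api.example.com/v1",  # hypothetical platform endpoint
    api_key="YOUR_API_KEY",                 # placeholder credential
)

# Two hypothetical services created from the same model, e.g. one for
# production traffic and one for internal testing, each billed on its own.
for service_id in ("svc-prod-xxxx", "svc-test-xxxx"):
    resp = client.chat.completions.create(
        model=service_id,  # assumption: the service ID selects the service
        messages=[{"role": "user", "content": "ping"}],
    )
    print(service_id, resp.choices[0].message.content)
```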
How do I access model services?
The platform supports calls over the OpenAI API protocol. You can follow the sample code on each model's details page to complete the invocation.
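For reference, this is a minimal sketch of what an OpenAI-protocol call looks like over raw HTTP. The base URL and the model/service identifier are placeholders; consult the sample code on the model's details page for the actual values.

```python
# Minimal sketch of a chat completion request in the OpenAI API protocol.
# Endpoint and identifiers are hypothetical placeholders.
import requests

BASE_URL = "https://api.example.com/v1"  # hypothetical endpoint
API_KEY = "YOUR_API_KEY"                 # placeholder credential

resp = requests.post(
    f"{BASE_URL}/chat/completions",
    headers={
        "Authorization": f"Bearer {API_KEY}",
        "Content-Type": "application/json",
    },
    json={
        "model": "svc-xxxx",  # hypothetical service/model identifier
        "messages": [{"role": "user", "content": "Hello"}],
    },
    timeout=30,
)
resp.raise_for_status()
print(resp.json()["choices"][0]["message"]["content"])
```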
Is there rate limiting when accessing model services?
Yes. The preset rate limits vary by model; you can view each model's rate limiting rules on its details page.
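When a request is throttled, a client typically retries with backoff. The sketch below assumes the service signals throttling with HTTP 429 and an optional Retry-After header, which is the OpenAI-protocol convention rather than a guarantee documented here.

```python
# Minimal sketch: retry with exponential backoff when rate limited (HTTP 429).
import time
import requests

def post_with_backoff(url, headers, payload, max_retries=5):
    """POST a JSON payload, backing off and retrying on HTTP 429 responses."""
    for attempt in range(max_retries):
        resp = requests.post(url, headers=headers, json=payload, timeout=30)
        if resp.status_code != 429:
            resp.raise_for_status()
            return resp.json()
        # Honor Retry-After when present; otherwise back off exponentially.
        wait = float(resp.headers.get("Retry-After", 2 ** attempt))
        time.sleep(wait)
    raise RuntimeError("rate limited: retries exhausted")
```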
Are multiple services created from the same model billed independently?
Yes. Each service is billed on its own: enabling or disabling billing for one service, and the billing method it uses, do not affect the others.