Tencent Cloud AI Digital Human

Instruction Sending Requirements

Last updated: 2024-07-19 10:03:07
Note:
Instructions are sent over a persistent connection to drive the AI digital human.
 There are three types of persistent connection instructions: text instructions, audio instructions, and heartbeat instructions.
 A text instruction sends broadcast text to the cloud, which synthesizes speech to drive the AI digital human to speak.
 An audio instruction sends voice stream shards to the cloud, which uses either the original sound or a voice changer to drive the AI digital human to speak.
 A heartbeat instruction keeps the persistent connection alive and prevents session disconnection. If the client sends no heartbeat and no valid data is exchanged, the cloud automatically disconnects the persistent connection after 3 minutes and closes the session stream after 10 minutes.
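The heartbeat timing described above can be sketched as a small idle timer. This is a minimal illustration, not part of the product SDK: the `HeartbeatTimer` class, its method names, and the 60-second interval are assumptions chosen for the example. Only the 3-minute and 10-minute limits come from the text above.

```python
import time

# Limits stated in the documentation above.
CONNECTION_IDLE_TIMEOUT_S = 3 * 60   # cloud disconnects the persistent connection
SESSION_IDLE_TIMEOUT_S = 10 * 60     # cloud closes the session stream

# Assumption: a heartbeat interval comfortably below the 3-minute limit.
HEARTBEAT_INTERVAL_S = 60


class HeartbeatTimer:
    """Hypothetical helper: tracks the last valid exchange and reports
    when a SEND_HEARTBEAT instruction should be sent."""

    def __init__(self, interval_s=HEARTBEAT_INTERVAL_S, clock=time.monotonic):
        if interval_s >= CONNECTION_IDLE_TIMEOUT_S:
            raise ValueError("heartbeat interval must be shorter than the idle timeout")
        self._interval_s = interval_s
        self._clock = clock
        self._last_activity = clock()

    def mark_activity(self):
        """Call after any instruction is sent or any valid data is received."""
        self._last_activity = self._clock()

    def heartbeat_due(self):
        """True once the connection has been idle for at least the interval."""
        return self._clock() - self._last_activity >= self._interval_s
```

A client loop would call `heartbeat_due()` periodically, send SEND_HEARTBEAT when it returns true, and then call `mark_activity()`.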
The DriverType parameter specified when creating a new stream determines which instruction types are supported:
DriverType 1: Text-driven. Supports text instructions (SEND_TEXT) and heartbeat instructions (SEND_HEARTBEAT).
DriverType 3: Audio-driven. Supports text instructions (SEND_TEXT), audio instructions (SEND_AUDIO), and heartbeat instructions (SEND_HEARTBEAT).
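The DriverType mapping above could be checked on the client side before an instruction is sent. The sketch below is illustrative only: the `ALLOWED_INSTRUCTIONS` table and the `is_instruction_allowed` function are hypothetical names, and only the DriverType values and instruction names are taken from the text.

```python
# DriverType -> instructions it supports, as listed in the documentation above.
ALLOWED_INSTRUCTIONS = {
    1: frozenset({"SEND_TEXT", "SEND_HEARTBEAT"}),                # text-driven
    3: frozenset({"SEND_TEXT", "SEND_AUDIO", "SEND_HEARTBEAT"}),  # audio-driven
}


def is_instruction_allowed(driver_type, instruction):
    """Hypothetical client-side check run before sending an instruction."""
    allowed = ALLOWED_INSTRUCTIONS.get(driver_type)
    if allowed is None:
        # Only DriverType 1 and 3 are described in this section.
        raise ValueError(f"unsupported DriverType: {driver_type}")
    return instruction in allowed
```

For example, `is_instruction_allowed(1, "SEND_AUDIO")` returns `False`, because a text-driven stream does not accept audio instructions.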
1. To send audio while text is being broadcast, first interrupt the broadcast by sending a text instruction with empty text, then wait for the TextOver event before sending the audio.
2. To broadcast new text while text is being broadcast, send the new text instruction directly. The current broadcast is interrupted, and a TextOver event is returned once the interruption completes. The new broadcast then starts with a TextStart event and ends with a TextOver event.
3. To send a text instruction while an audio-driven session is in progress, first end the audio session by sending an audio instruction with IsFinal set to true. Text or audio can be sent only after the AudioOver event is received.
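The three switching rules above can be summarized as a small planning function that lists what to send and which event to wait for. This is a sketch under stated assumptions: the `State` enum, the `plan_switch` function, and the step strings are hypothetical and do not reflect the actual message format, which this section does not specify. Continuing an audio stream that is already in progress is treated here as simply sending more audio, which is an interpretation rather than something the text states.

```python
from enum import Enum


class State(Enum):
    IDLE = "idle"
    TEXT = "text broadcasting"
    AUDIO = "audio streaming"


def plan_switch(state, next_kind):
    """Return the ordered steps for sending `next_kind` ("text" or "audio")
    from the current state, following the three rules above."""
    if next_kind not in ("text", "audio"):
        raise ValueError(f"unknown instruction kind: {next_kind}")
    send_next = "send SEND_TEXT" if next_kind == "text" else "send SEND_AUDIO"

    if state is State.TEXT and next_kind == "audio":
        # Rule 1: interrupt with an empty text instruction, then wait.
        return ["send SEND_TEXT with empty text", "wait for TextOver", send_next]
    if state is State.TEXT and next_kind == "text":
        # Rule 2: new text interrupts the current broadcast directly.
        return [send_next]
    if state is State.AUDIO and next_kind == "text":
        # Rule 3: end the audio session first, then wait.
        return ["send SEND_AUDIO with IsFinal=true", "wait for AudioOver", send_next]
    # IDLE, or more audio in an ongoing audio stream (assumption): send directly.
    return [send_next]
```

For instance, `plan_switch(State.TEXT, "audio")` yields the empty-text interrupt, the wait for TextOver, and then the audio instruction, in that order.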

