tencent cloud

Tencent Real-Time Communication

AI Speech to Text and Translation

Last updated: 2026-05-06 14:57:52
This document explains how to quickly integrate AI real-time transcription (speech-to-text) and translation features on the client side using the AITranscriberManager interface in the TRTC SDK.

Solution Overview

TRTC's AI real-time transcription and translation features let you convert audio streams in a room to text instantly and translate them into multiple target languages. With the SDK's AITranscriberManager, you can start transcription tasks, receive recognition results, and manage the process directly on the client. Unlike server-side integration, using the SDK removes the need to build your own backend for cloud API calls, streamlining your development workflow.

Prerequisites

Log in to the TRTC console, activate the TRTC service, and create an RTC-Engine application.
Purchase an RTC-Engine package (Lite version or above) to unlock the speech-to-text and real-time translation features.
Note:
The speech-to-text and real-time translation features are billed based on usage. For details, see Pricing.

Integration Process

Step 1: Integrate the TRTC SDK

Add the TRTC SDK to your project, join a TRTC room, and enable local microphone audio capture and publishing. iOS, Android, Windows, macOS, and Web clients currently support starting transcription and translation tasks directly. See the following integration guides to import the SDK into your project:
Note:
After importing the SDK, continue with the steps below.

Step 2: Obtain an AITranscriberManager Instance

AITranscriberManager is the main class for managing AI transcription features. Retrieve its instance from TRTCCloud.
Android
import com.tencent.liteav.transcriber.AITranscriberManager;
import com.tencent.trtc.TRTCCloud;

// Get the shared TRTCCloud instance, then ask it for the transcriber manager
TRTCCloud mTRTCCloud = TRTCCloud.sharedInstance(context);
AITranscriberManager aiTranscriberManager = mTRTCCloud.getAITranscriberManager();

iOS&macOS
TRTCCloud *trtcCloud = [TRTCCloud sharedInstance];
AITranscriberManager *manager = [trtcCloud getAITranscriberManager];

Windows
liteav::ITRTCCloud* trtcCloud = liteav::ITRTCCloud::getTRTCShareInstance();
liteav::AITranscriberManager* manager = trtcCloud->getAITranscriberManager();

Step 3: Set Up Event Listeners

Set up a listener to receive transcription status updates, real-time transcription and translation messages, and error notifications for users participating in transcription within the room.
Android
AITranscriberManager.AITranscriberListener listener = new AITranscriberManager.AITranscriberListener() {
    @Override
    public void onRealtimeTranscriberStarted(String roomId, String transcriberRobotId) {
        // Transcription started
    }

    @Override
    public void onReceiveTranscriberMessage(String roomId, AITranscriberManager.TranscriberMessage message) {
        // Handle real-time transcription and translation messages
    }

    @Override
    public void onRealtimeTranscriberStopped(String roomId, String transcriberRobotId, int reason) {
        // Transcription stopped
    }

    @Override
    public void onRealtimeTranscriberError(String roomId, String transcriberRobotId, int error, String errorInfo) {
        // Handle real-time transcription service errors
    }
};

aiTranscriberManager.addListener(listener);

iOS&macOS
- (void)onRealtimeTranscriberStarted:(NSString *)roomId transcriberRobotId:(NSString *)transcriberRobotId {
    // Transcription started
}

- (void)onReceiveTranscriberMessage:(NSString *)roomId message:(TranscriberMessage *)message {
    // Handle real-time transcription and translation messages
}

- (void)onRealtimeTranscriberStopped:(NSString *)roomId transcriberRobotId:(NSString *)transcriberRobotId reason:(NSInteger)reason {
    // Transcription stopped
}

- (void)onRealtimeTranscriberError:(NSString *)roomId transcriberRobotId:(NSString *)transcriberRobotId error:(NSInteger)error errorInfo:(NSString *)errorInfo {
    // Handle real-time transcription service errors
}

[manager addListener:self];

Windows
class MyTranscriberListener : public liteav::AITranscriberListener {
public:
    void onRealtimeTranscriberStarted(const char* roomId, const char* transcriberRobotId) override {
        // Transcription started
    }

    void onReceiveTranscriberMessage(const char* roomId, const liteav::TranscriberMessage& message) override {
        // Handle real-time transcription and translation messages
    }

    void onRealtimeTranscriberStopped(const char* roomId, const char* transcriberRobotId, int reason) override {
        // Transcription stopped
    }

    void onRealtimeTranscriberError(const char* roomId, const char* transcriberRobotId, int error, const char* errorInfo) override {
        // Handle real-time transcription service errors
    }
};

MyTranscriberListener* listener = new MyTranscriberListener();
manager->addListener(listener);

TranscriberMessage Details

| Field Name | Type | Description |
| --- | --- | --- |
| segmentId | String | Unique ID for the message segment. Used for deduplication or sorting. |
| speakerUserId | String | ID of the speaking user. |
| sourceText | String | Recognized source language text (Unicode encoded). |
| translationTexts | Map/List | Translated target language text. |
| timestamp | long | UTC timestamp when the message was generated, in milliseconds. |
| isCompleted | bool | Indicates whether transcription of the sentence is complete. true: the sentence is finished (final result). false: the sentence is ongoing (intermediate result, streaming update). |
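Because intermediate results (isCompleted = false) are streamed before the final one, a client usually keeps only the latest text per segment. The following is a minimal sketch in plain Java, not SDK code: Segment is a hypothetical stand-in that mirrors a few of the fields above, TranscriptBuffer is a hypothetical helper, and it assumes that all updates for the same sentence carry the same segmentId.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Stand-in for the TranscriberMessage fields used here (hypothetical type, not the SDK class). */
final class Segment {
    final String segmentId;
    final String speakerUserId;
    final String sourceText;
    final boolean isCompleted;

    Segment(String segmentId, String speakerUserId, String sourceText, boolean isCompleted) {
        this.segmentId = segmentId;
        this.speakerUserId = speakerUserId;
        this.sourceText = sourceText;
        this.isCompleted = isCompleted;
    }
}

/** Keeps the latest text per segmentId and ignores updates that arrive after the final result. */
final class TranscriptBuffer {
    // LinkedHashMap keeps segments in the order they were first seen.
    private final Map<String, Segment> segments = new LinkedHashMap<>();

    /** Applies one update; returns true if the buffer changed. */
    boolean apply(Segment update) {
        Segment current = segments.get(update.segmentId);
        if (current != null && current.isCompleted) {
            return false; // final result already stored: drop duplicates and late intermediates
        }
        segments.put(update.segmentId, update);
        return true;
    }

    /** Renders one line per segment; unfinished sentences get a trailing " ...". */
    List<String> lines() {
        List<String> out = new ArrayList<>();
        for (Segment s : segments.values()) {
            out.add(s.speakerUserId + ": " + s.sourceText + (s.isCompleted ? "" : " ..."));
        }
        return out;
    }
}
```

In an application, onReceiveTranscriberMessage would copy the message fields into a Segment and call apply, then refresh the subtitle view only when apply returns true. The sketch is not thread-safe; which thread delivers the callbacks is not stated here.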

Step 4: Start a Transcription Task

Create a TranscriberParams object, set the transcriber robot ID, source language, target translation languages, and other parameters. Then call startRealtimeTranscriber to begin the transcription service.
Android
import java.util.Arrays;

AITranscriberManager.TranscriberParams params = new AITranscriberManager.TranscriberParams();
params.transcriberRobotId = "my_robot"; // Optional: Specify robot ID
params.sourceLanguage = "en"; // Source language: English
params.translationLanguages = Arrays.asList("zh", "ja"); // Optional: If not set, only transcription is performed, no translation
params.userIdsToTranscribe = Arrays.asList("userA"); // Optional: If not set, transcribe all users in the room

aiTranscriberManager.startRealtimeTranscriber(params);

iOS&macOS
TranscriberParams *params = [[TranscriberParams alloc] init];
params.transcriberRobotId = @"my_robot"; // Optional: Specify robot ID
params.sourceLanguage = @"en"; // Source language: English
params.translationLanguages = @[@"zh", @"ja"]; // Optional: If not set, only transcription is performed, no translation
params.userIdsToTranscribe = @[@"userA"]; // Optional: If not set, transcribe all users in the room

[manager startRealtimeTranscriber:params];

Windows
liteav::TranscriberParams params;
params.transcriberRobotId = "my_robot"; // Optional: Specify robot ID
params.sourceLanguage = "en"; // Source language: English
const char* targetLangs[] = {"zh", "ja"};
params.translationLanguages = targetLangs; // Optional: If not set, only transcription is performed, no translation
params.translationLanguagesCount = 2; // Must match the number of entries in targetLangs
const char* transcribeUsers[] = {"userA"};
params.userIdsToTranscribe = transcribeUsers; // Optional: If not set, transcribe all users in the room
params.userIdsToTranscribeCount = 1; // Must match the number of entries in transcribeUsers

manager->startRealtimeTranscriber(params);

TranscriberParams Details

| Parameter Field | Type | Required | Description |
| --- | --- | --- | --- |
| transcriberRobotId | String | No | Unique ID for the transcription robot. For a single transcription task, if not specified, the SDK generates a default ID in the format transcriber_${roomid}_robot_${userid}. If you start multiple transcription tasks at the same time, you must specify the robot ID. |
| sourceLanguage | String | Yes | Source language code. Specify the language of the source audio using the standard language code (e.g., "en"). |
| translationLanguages | List/Array | No | List of target language codes for translation. If translation is required, set the target language codes here (e.g., "zh"). |
| userIdsToTranscribe | List/Array | No | List of user IDs to transcribe. If not set, audio from all users in the room is transcribed by default. |
The SDK reports the result of this call through the listener callbacks:
If the call succeeds, you'll receive the onRealtimeTranscriberStarted callback, indicating the transcription task started successfully. You can then receive real-time transcription and translation messages through the onReceiveTranscriberMessage callback.
If the call fails, you'll receive the onRealtimeTranscriberError callback, indicating the transcription task failed to start. Take action based on the specific error code (see Server Error Codes).
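When no robot ID is set, it can be handy to know which ID to expect in the transcriberRobotId argument of these notifications. A small sketch based only on the default format stated in the parameter table; the class and method names are hypothetical, and it is an assumption that ${roomid} and ${userid} are substituted with the plain room ID and user ID strings.

```java
/** Builds the default robot ID in the documented format transcriber_${roomid}_robot_${userid}. */
final class RobotIds {
    private RobotIds() {}

    static String defaultRobotId(String roomId, String userId) {
        return "transcriber_" + roomId + "_robot_" + userId;
    }
}
```

For example, RobotIds.defaultRobotId("1001", "userA") yields "transcriber_1001_robot_userA". Treat the value reported by the SDK as authoritative if it ever differs.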

Step 5: Stop the Transcription Task

When transcription is no longer needed, call stopRealtimeTranscriber to end the task and release resources. Pass the robot ID that was used to start the task. If you did not specify a robot ID at start, the SDK generated a default one; in that case, passing an empty value stops that default robot task.
Android
aiTranscriberManager.stopRealtimeTranscriber("my_robot");

iOS&macOS
[manager stopRealtimeTranscriber:@"my_robot"];

Windows
manager->stopRealtimeTranscriber("my_robot");
The SDK reports the result of this call through the listener callbacks:
If the call succeeds, you'll receive the onRealtimeTranscriberStopped callback, indicating the transcription task stopped successfully. You will no longer receive new transcription messages.
If the call fails, you'll receive the onRealtimeTranscriberError callback, indicating the transcription task failed to stop. Take action based on the specific error code (see Server Error Codes).
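Since the results of both start and stop arrive asynchronously, an application typically remembers the last known state of each robot. The following is a rough sketch in plain Java; TaskTracker and its State values are hypothetical names, not part of the SDK, and the methods are meant to be called from your own code around startRealtimeTranscriber, stopRealtimeTranscriber, and the listener methods from Step 3.

```java
import java.util.HashMap;
import java.util.Map;

/** Tracks the last known state of each transcription task, keyed by robot ID (hypothetical helper). */
final class TaskTracker {
    enum State { STARTING, RUNNING, STOPPING, STOPPED, ERROR }

    private final Map<String, State> states = new HashMap<>();

    /** Call right after invoking startRealtimeTranscriber. */
    void markStarting(String robotId) { states.put(robotId, State.STARTING); }

    /** Call right after invoking stopRealtimeTranscriber. */
    void markStopping(String robotId) { states.put(robotId, State.STOPPING); }

    /** Call from onRealtimeTranscriberStarted. */
    void onStarted(String robotId) { states.put(robotId, State.RUNNING); }

    /** Call from onRealtimeTranscriberStopped. */
    void onStopped(String robotId) { states.put(robotId, State.STOPPED); }

    /** Call from onRealtimeTranscriberError; the last operation failed, so inspect the error code. */
    void onError(String robotId) { states.put(robotId, State.ERROR); }

    /** Returns the last recorded state, or null if this robot ID was never seen. */
    State stateOf(String robotId) { return states.get(robotId); }

    /** True only once the start has been confirmed and no stop or error has been recorded. */
    boolean isRunning(String robotId) { return states.get(robotId) == State.RUNNING; }
}
```

The sketch is not thread-safe; if the listener methods are invoked on a different thread than your UI code, guard the map or forward the events to a single thread.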

Supported Language Codes

Source Language

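| Language Code | Language Name |
| --- | --- |
| zh | Chinese |
| en | English |
| vi | Vietnamese |
| ja | Japanese |
| ko | Korean |
| id | Indonesian |
| th | Thai |
| pt | Portuguese |
| tr | Turkish |
| ar | Arabic |
| es | Spanish |
| hi | Hindi |
| fr | French |
| ms | Malay |
| fil | Filipino |
| de | German |
| it | Italian |
| ru | Russian |
| sv | Swedish |
| da | Danish |
| no | Norwegian |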
Note:
Client-initiated real-time transcription currently supports 21 languages: Chinese, English, Vietnamese, Japanese, Korean, Indonesian, Thai, Portuguese, Turkish, Arabic, Spanish, Hindi, French, Malay, Filipino, German, Italian, Russian, Swedish, Danish, Norwegian. For support for additional languages, please contact us.
For Chinese and English, client-side transcription via AITranscriberManager uses the latest 16k_zh_en large model in the Standard Edition language engine by default. See Billing of Speech AI Service.

Target Translation Language

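| Language Code | Language Name |
| --- | --- |
| zh | Chinese |
| en | English |
| es | Spanish |
| pt | Portuguese |
| fr | French |
| de | German |
| ru | Russian |
| ar | Arabic |
| ja | Japanese |
| ko | Korean |
| vi | Vietnamese |
| ms | Malay |
| id | Indonesian |
| it | Italian |
| th | Thai |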
Note:
Real-time translation currently supports 15 languages, both as input and as output: Chinese, English, Spanish, Portuguese, French, German, Russian, Arabic, Japanese, Korean, Vietnamese, Malay, Indonesian, Italian, Thai. If the source language used for transcription (ASR) is not one of these, translation cannot be enabled. For additional language support, please contact us.
AI translation results are provided for reference only and should not be considered professional advice or conclusions.
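The two language lists can be checked on the client before a task is started, so that an unsupported combination is caught without a round trip. A minimal sketch in plain Java using only the codes listed above; LanguageCheck and validate are hypothetical names, and returning a message string (or null when the combination is acceptable) is just one possible design.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

/** Client-side check of language codes against the tables in this section (hypothetical helper). */
final class LanguageCheck {
    // The 21 source (transcription) languages.
    static final Set<String> SOURCE = new HashSet<>(Arrays.asList(
            "zh", "en", "vi", "ja", "ko", "id", "th", "pt", "tr", "ar", "es",
            "hi", "fr", "ms", "fil", "de", "it", "ru", "sv", "da", "no"));

    // The 15 languages supported for translation input and output.
    static final Set<String> TRANSLATABLE = new HashSet<>(Arrays.asList(
            "zh", "en", "es", "pt", "fr", "de", "ru", "ar", "ja", "ko",
            "vi", "ms", "id", "it", "th"));

    private LanguageCheck() {}

    /** Returns null if the combination is acceptable, otherwise a short description of the problem. */
    static String validate(String source, List<String> targets) {
        if (!SOURCE.contains(source)) {
            return "unsupported source language: " + source;
        }
        if (targets == null || targets.isEmpty()) {
            return null; // transcription only, no translation requested
        }
        if (!TRANSLATABLE.contains(source)) {
            return "translation is not available for source language: " + source;
        }
        for (String target : targets) {
            if (!TRANSLATABLE.contains(target)) {
                return "unsupported target language: " + target;
            }
        }
        return null;
    }
}
```

For instance, a Hindi source is accepted for transcription alone but rejected as soon as any translation target is requested, because Hindi is not in the translation list.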

Server Error Codes

| Error Code | Meaning | Recommended Action |
| --- | --- | --- |
| 2000 | Parameter error. | Check whether the request parameters are valid. |
| 2002 | Task does not exist. | Can be ignored if returned when calling the stop interface. |
| 2026 | Transcription service (ASR/Translation service) not enabled. | Enable the relevant service in the console. |
| 3000 | Internal error. | Retry the operation. |
| 4003 | Task is exiting. | Can be ignored if returned when calling the stop interface. |
| 5000 | Resource overload. | Use a backoff strategy and retry. |
| 5001 | Concurrency limit. | Contact the product team to increase concurrency limits. |
| -102009 | Host is not in the room. | Check host status and retry after confirmation. |
| -102005 | Room does not exist. | Check room status and retry after confirmation. |
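The recommended actions above can be turned into a small lookup that an onRealtimeTranscriberError handler consults. The following is a sketch in plain Java under stated assumptions: ErrorPolicy, Action, and backoffMillis are hypothetical names, the 500 ms base delay and 8 s cap are illustrative values rather than documented ones, and codes whose handling is not described above fall back to UNKNOWN.

```java
/** Maps the server error codes listed above to a suggested client action (hypothetical helper). */
final class ErrorPolicy {
    enum Action {
        FIX_PARAMETERS, IGNORE, ENABLE_SERVICE, RETRY, RETRY_WITH_BACKOFF,
        CONTACT_SUPPORT, CHECK_ROOM_STATE, UNKNOWN
    }

    private static final long BASE_DELAY_MS = 500;  // assumed starting delay
    private static final long MAX_DELAY_MS = 8_000; // assumed upper bound

    private ErrorPolicy() {}

    /** duringStop is true when the error answers a stopRealtimeTranscriber call. */
    static Action actionFor(int code, boolean duringStop) {
        switch (code) {
            case 2000: return Action.FIX_PARAMETERS;
            case 2002: // task does not exist
            case 4003: // task is exiting
                return duringStop ? Action.IGNORE : Action.UNKNOWN;
            case 2026: return Action.ENABLE_SERVICE;
            case 3000: return Action.RETRY;
            case 5000: return Action.RETRY_WITH_BACKOFF;
            case 5001: return Action.CONTACT_SUPPORT;
            case -102009: // host is not in the room
            case -102005: // room does not exist
                return Action.CHECK_ROOM_STATE;
            default: return Action.UNKNOWN;
        }
    }

    /** Exponential backoff: the delay doubles with each attempt (0-based) up to MAX_DELAY_MS. */
    static long backoffMillis(int attempt) {
        int shift = Math.min(Math.max(attempt, 0), 20); // clamp to avoid overflow
        return Math.min(MAX_DELAY_MS, BASE_DELAY_MS << shift);
    }
}
```

With these values, attempts 0 through 4 wait 500, 1000, 2000, 4000, and 8000 ms, and later attempts stay at 8000 ms. Adding random jitter to the delay is a common refinement when many clients may retry at once.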

