Custom Capturing and Rendering

Last updated: 2021-09-17 12:08:09

    Overview

    • Custom video capturing
      If you develop your own beauty filter and special effect modules, or purchase such modules from third parties, you need to capture and process camera data yourself. You can call the enableCustomVideoCapture API of TRTCCloud to disable the camera capturing and image processing logic of the TRTC SDK, and use the sendCustomVideoData API to feed your own video data to the TRTC SDK.
    • Custom video rendering
      The TRTC SDK uses OpenGL to render video images. If you use the SDK for game development or want to integrate it into your own UI engine, you must render video images by yourself.
    • Custom audio capturing
      If you use the TRTC SDK on a special device and need to capture audio via an external device, you can call the enableCustomAudioCapture API of TRTCCloud to disable the TRTC SDK's default audio capturing process, and use the sendCustomAudioData API to feed your own audio data to the TRTC SDK.
      Note:

      Enabling custom audio capturing may cause the acoustic echo cancellation (AEC) feature to fail.

    • Getting raw audio data
      The audio module is highly complex, and the TRTC SDK needs to strictly control the capturing and playback logic of audio devices. In some cases, you can use the SDK's callback APIs to get the audio data of a remote user or the data captured by the local mic.

    Supported Platforms

    iOS    Android    macOS    Windows    Web
    ✓      ✓          ✓        ✓          ×

    Custom Video Capturing

    You can call the enableCustomVideoCapture API of TRTCCloud to disable the TRTC SDK's camera capturing and image processing logic, and use the sendCustomVideoData API to feed your own video data to the TRTC SDK.
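    As a minimal sketch (assuming the single-argument enableCustomVideoCapture:(BOOL) form of this SDK generation; newer versions may take additional parameters), switching to custom capture looks roughly like this:

    // Hedged sketch: turn off the SDK's own camera pipeline, then feed
    // frames yourself via sendCustomVideoData (see below).
    TRTCCloud *trtcCloud = [TRTCCloud sharedInstance];
    [trtcCloud enableCustomVideoCapture:YES];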

    The sendCustomVideoData API includes a TRTCVideoFrame parameter, which represents a video frame. To avoid performance loss, the TRTC SDK has requirements on the format of the video data it receives, which vary with the platform used.

    On iOS, the TRTC SDK supports video data in two YUV formats, NV12 and I420, and transferring images via CVPixelBufferRef delivers higher performance. We therefore recommend the following settings.

    • pixelFormat (TRTCVideoPixelFormat). Recommended value: TRTCVideoPixelFormat_NV12. NV12 is the format of the original video data captured by an iOS device.
    • bufferType (TRTCVideoBufferType). Recommended value: PixelBuffer. This is the video frame format best supported on iOS, and it delivers the best performance.
    • pixelBuffer (CVPixelBufferRef). Required if bufferType is PixelBuffer. The data captured by the iPhone camera is an NV12-formatted PixelBuffer.
    • data (NSData *). Required if bufferType is NSData. Its performance is inferior to that of PixelBuffer.
    • timestamp (uint64_t). Recommended value: 0. If it is 0, the SDK fills the timestamp field automatically, but please make sure that sendCustomVideoData is called at regular intervals.
    • width (uint32_t). Set this parameter to the pixel width of the video passed in.
    • height (uint32_t). Set this parameter to the pixel height of the video passed in.
    • rotation (TRTCVideoRotation). Left empty by default. If you want to rotate the video, set it to TRTCVideoRotation_0, TRTCVideoRotation_90, TRTCVideoRotation_180, or TRTCVideoRotation_270. The SDK rotates the video clockwise by the number of degrees set; for example, if TRTCVideoRotation_90 is passed in, an image in portrait mode switches to landscape mode after rotation.

    Sample code

    The LocalVideoShareViewController.m file in the demo folder demonstrates how to extract NV12-formatted pixel buffers from a local file and process the data using the SDK.

    // Assemble a `TRTCVideoFrame` and send it to the `trtcCloud` object.
    TRTCVideoFrame* videoFrame = [TRTCVideoFrame new];
    videoFrame.bufferType  = TRTCVideoBufferType_PixelBuffer;
    videoFrame.pixelFormat = TRTCVideoPixelFormat_NV12;
    videoFrame.pixelBuffer = imageBuffer; // NV12-formatted CVPixelBufferRef from the local file
    videoFrame.rotation    = rotation;
    videoFrame.timestamp   = timeStamp;   // 0 lets the SDK fill in the timestamp

    [trtcCloud sendCustomVideoData:videoFrame];

    Custom Video Rendering

    The TRTC SDK uses OpenGL to render video images. If you use the SDK for game development or want to integrate it into your own UI engine, you must render video images by yourself.

    You can call setLocalVideoRenderDelegate and setRemoteVideoRenderDelegate of TRTCCloud to configure callbacks for the custom rendering of local and remote video images. Below are the relevant parameters.

    • pixelFormat (TRTCVideoPixelFormat). Recommended value: TRTCVideoPixelFormat_NV12.
    • bufferType (TRTCVideoBufferType). Recommended value: TRTCVideoBufferType_PixelBuffer. This is the video frame format best supported on iOS, and it delivers the best performance.
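
    For example, a sketch of registering for NV12 pixel buffers (assuming the iOS delegate-registration signatures of this SDK generation; `remoteUserId` is a placeholder) might look like this:

    // Hedged sketch: register `self` (implementing TRTCVideoRenderDelegate)
    // to receive local frames as NV12-formatted pixel buffers.
    [trtcCloud setLocalVideoRenderDelegate:self
                               pixelFormat:TRTCVideoPixelFormat_NV12
                                bufferType:TRTCVideoBufferType_PixelBuffer];
    // For a remote user's stream, the remote variant also takes the user ID.
    [trtcCloud setRemoteVideoRenderDelegate:remoteUserId
                                   delegate:self
                                pixelFormat:TRTCVideoPixelFormat_NV12
                                 bufferType:TRTCVideoBufferType_PixelBuffer];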

    Sample code

    If you set pixelFormat to TRTCVideoPixelFormat_NV12 and bufferType to TRTCVideoBufferType_PixelBuffer, converting a frame of NV12-formatted PixelBuffer into a video image is straightforward. The TestRenderVideoFrame.m file in the demo folder demonstrates how this works:

    - (void)onRenderVideoFrame:(TRTCVideoFrame *)frame
                        userId:(NSString *)userId
                    streamType:(TRTCVideoStreamType)streamType
    {
        // If `userId` is `nil`, the frame is the local image; otherwise it is a remote image.
        CFRetain(frame.pixelBuffer);
        __weak __typeof(self) weakSelf = self;
        dispatch_async(dispatch_get_main_queue(), ^{
            TestRenderVideoFrame* strongSelf = weakSelf; // promote to a strong reference
            UIImageView* videoView = nil;
            if (userId) {
                videoView = [strongSelf.userVideoViews objectForKey:userId];
            } else {
                videoView = strongSelf.localVideoView;
            }
            videoView.image = [UIImage imageWithCIImage:[CIImage imageWithCVImageBuffer:frame.pixelBuffer]];
            videoView.contentMode = UIViewContentModeScaleAspectFit;
            CFRelease(frame.pixelBuffer);
        });
    }

    Custom Audio Capturing

    You can call the enableCustomAudioCapture API of TRTCCloud to disable the TRTC SDK's default audio data capturing process, and use the sendCustomAudioData API to feed your own audio data to the TRTC SDK.

    The sendCustomAudioData API includes a TRTCAudioFrame parameter, which represents a 20 ms audio frame.

    • The data sent to the SDK through sendCustomAudioData must be uncompressed raw audio data in PCM format. AAC or other compressed formats are not supported.
    • sampleRate and channels represent the audio sample rate and number of sound channels respectively, which should be consistent with the PCM data passed in.
    • The recommended duration of each audio frame is 20 ms. Suppose sampleRate is 48,000, and channels is 1 (mono). The byte length of the buffer passed in each time sendCustomAudioData is called would be 48,000 x 0.02s x 1 x 16 bits = 15,360 bits = 1,920 bytes.
    • timestamp can be 0, in which case the SDK will fill the field automatically. To ensure audio stability and avoid choppy audio, please make sure that sendCustomAudioData is called at regular intervals, preferably every 20 ms.
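    Putting these rules together, a minimal sketch could look like the following. It assumes TRTCAudioFrame exposes data, sampleRate, channels, and timestamp fields as in this SDK generation; `pcmSource` and its readBytesOfLength: method are placeholders for your own capture pipeline.

    // Hedged sketch: feed 20 ms of 48 kHz mono 16-bit PCM every 20 ms.
    [trtcCloud enableCustomAudioCapture:YES];

    // Keep a strong reference to `timer` in production code.
    dispatch_queue_t audioQueue = dispatch_queue_create("audio.feeder", DISPATCH_QUEUE_SERIAL);
    dispatch_source_t timer = dispatch_source_create(DISPATCH_SOURCE_TYPE_TIMER, 0, 0, audioQueue);
    dispatch_source_set_timer(timer, DISPATCH_TIME_NOW, 20 * NSEC_PER_MSEC, 1 * NSEC_PER_MSEC);
    dispatch_source_set_event_handler(timer, ^{
        TRTCAudioFrame *audioFrame = [TRTCAudioFrame new];
        audioFrame.sampleRate = 48000;  // must match the PCM data passed in
        audioFrame.channels   = 1;      // mono
        // 48,000 x 0.02 s x 1 channel x 2 bytes = 1,920 bytes per 20 ms frame.
        audioFrame.data       = [pcmSource readBytesOfLength:1920]; // placeholder source
        audioFrame.timestamp  = 0;      // 0 lets the SDK fill in the timestamp
        [trtcCloud sendCustomAudioData:audioFrame];
    });
    dispatch_resume(timer);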
    Note:

    Using sendCustomAudioData may cause AEC to fail.

    Getting Raw Audio Data

    The audio module is highly complex, and the TRTC SDK needs to strictly control the capturing and playback logic of audio devices. If you need the audio data of a remote user or the data captured by the local mic, you can register the platform-specific callbacks of TRTCCloud listed below.

    APIs for different platforms:
    • iOS: setAudioFrameDelegate
    • Android: setAudioFrameListener
    • Windows: setAudioFrameCallback
    Callback descriptions:
    • onCapturedAudioFrame: returns the raw audio data captured by the local mic. In non-custom capturing mode, the SDK is responsible for capturing audio, but you may still want the raw audio data, which this callback provides.
    • onPlayAudioFrame: returns the audio data of each remote user before audio mixing. Use this callback if you want to perform speech recognition on a specific channel of audio.
    • onMixedPlayAudioFrame: returns the mixed audio data right before it is fed into the speaker for playback.
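
    On iOS, for instance, a sketch of consuming onCapturedAudioFrame (assuming the setAudioFrameDelegate registration API of this SDK generation; `processingQueue` is a placeholder serial queue) might be:

    // Register once, e.g. right after creating the TRTCCloud object:
    //   [trtcCloud setAudioFrameDelegate:self];

    // Delegate callback: copy the data out immediately and process it on
    // another thread, as the notes below advise. Never modify `frame.data`.
    - (void)onCapturedAudioFrame:(TRTCAudioFrame *)frame {
        NSData *pcmCopy = [frame.data copy];
        dispatch_async(processingQueue, ^{ // placeholder serial queue
            // e.g., feed `pcmCopy` to a recognizer or write it to a file.
        });
    }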
    Note:

    • Do not perform time-consuming operations in any of the above callbacks. We recommend copying the data and processing it on another thread to avoid AEC failure and choppy audio.
    • The data called back by the above callback APIs should only be read and copied. Modifications may lead to unexpected outcomes.