Video Content Recognition

Last updated: 2021-12-31 11:55:58

    Video content recognition is an offline task that uses AI to intelligently recognize video content, including faces, text, opening and closing credits, and speech. See the table below for details:

    Feature | Description | Use Case
    Face recognition | Recognizes faces in the video image | Marks where celebrities appear in the video image; checks for particular people in the video image
    Full speech recognition | Recognizes all phrases in the speech | Generates subtitles for speech content; performs data analysis on video speech content
    Full text recognition | Recognizes all text in the video image | Performs data analysis on text in the video image
    Speech keyword recognition | Recognizes keywords in the speech | Checks for sensitive words in the speech; retrieves specific keywords in the speech
    Text keyword recognition | Recognizes keywords in the video image | Checks for sensitive words in the video image; retrieves specific keywords in the video image
    Opening and closing credits recognition | Recognizes the opening and closing credits in the video | Marks the positions of the opening credits, closing credits, and feature in the progress bar; removes the opening and closing credits of videos in batches

    Some content recognition features depend on a material library. There are two types of libraries: the public library and custom libraries.

    • Public library: VOD's preset material library.
    • Custom library: a material library created and managed by the user.
    Recognition Type | Public Library | Custom Library
    Face recognition | Supported. The library includes celebrities in sports, the entertainment industry, and others. | Supported. Call a server API to manage the custom face library.
    Speech recognition | Not supported yet | Supported. Call a server API to manage the keyword library.
    Text recognition | Not supported yet | Supported. Call a server API to manage the keyword library.
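    As a rough illustration, managing a custom library comes down to sending the relevant server API a request describing the sample to add. The sketch below only assembles such a request body; the function name and every field in it are assumptions for illustration, not the actual server API schema, which should be checked against the API reference.

    ```python
    import json

    # Hypothetical payload for registering a person in a custom face
    # library. Field names here are illustrative assumptions; consult
    # the VOD server API reference for the real request schema.
    def build_person_sample_request(name, face_images_base64, tags=None):
        """Assemble a request body for a face-library management call."""
        return {
            "Name": name,                        # person's display name
            "FaceContents": face_images_base64,  # base64-encoded face images
            "Tags": tags or [],                  # optional tags for filtering
            "Usages": ["Recognition"],           # register for recognition tasks
        }

    request = build_person_sample_request("John Smith", ["<base64-image>"])
    print(json.dumps(request))
    ```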

    Video Content Recognition Template

    Video content recognition integrates a number of recognition features, each controlled through the parameters described below:

    • Recognition types to enable: which content recognition features to enable
    • Library to use: whether to use the public library or a custom library for face recognition
    • Filter score: the confidence score threshold to return face recognition results
    • Filter tag: tags to filter the returned results by

    VOD provides preset video content recognition templates for common parameter combinations. You can also use a server API to create and manage custom templates.
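    To make the parameter combination concrete, a custom template might look like the following sketch. The field names mirror the parameters listed above (recognition types to enable, library, filter score, filter tags) but are assumptions for illustration; the actual schema is defined by the template management server API.

    ```python
    import json

    # Illustrative custom content-recognition template. Field names are
    # assumptions modeled on the parameters described above, not the
    # exact server API schema.
    template = {
        "Name": "my-recognition-template",      # hypothetical template name
        "FaceConfiguration": {
            "Switch": "ON",                     # enable face recognition
            "FaceLibrary": "All",               # use public + custom libraries
            "Score": 85,                        # filter score threshold
            "LabelSet": ["entertainment"],      # filter tags
        },
        "OcrFullTextConfiguration": {"Switch": "ON"},   # full text recognition
        "AsrFullTextConfiguration": {"Switch": "OFF"},  # full speech recognition
    }
    print(json.dumps(template, indent=2))
    ```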

    Initiating a Task

    You can initiate a video content recognition task by calling a server API, via the console, or by specifying the task when uploading videos. For details, see Task Initiation.

    Below are instructions for initiating video content recognition tasks in these ways:

    • Call the server API ProcessMedia to initiate a task: specify the video content recognition template ID in the AiRecognitionTask parameter in the request.
    • Call the server API ProcessMediaByUrl to initiate a task: specify the video content recognition template ID in the AiRecognitionTask parameter in the request.
    • Initiate a task via the console: call a server API to create a video content recognition task flow (by specifying MediaProcessTask.AiRecognitionTask), and use it to process videos in the console.
    • Specify a task upon upload from server: call a server API to create a video content recognition task flow (by specifying MediaProcessTask.AiRecognitionTask), and set procedure to the name of the task flow when calling ApplyUpload.
    • Specify a task upon upload from client: call a server API to create a video content recognition task flow (by specifying MediaProcessTask.AiRecognitionTask), and set procedure to the name of the task flow in the signature for upload from client.
    • Specify a task upon upload from the console: call a server API to create a video content recognition task flow (by specifying MediaProcessTask.AiRecognitionTask) and, when uploading videos via the console, choose Automatic Processing After Upload and select the task flow created.
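    For the first method above, a ProcessMedia request could be sketched as below. Carrying the template ID in AiRecognitionTask.Definition matches the Input.Definition field seen in the callback example later in this document, but treat the exact layout as an assumption to verify against the ProcessMedia API reference.

    ```python
    import json

    def build_process_media_request(file_id, template_id):
        """Sketch of a ProcessMedia request body that triggers a video
        content recognition task with the given template ID.
        Field layout is an assumption for illustration."""
        return {
            "FileId": file_id,
            "AiRecognitionTask": {
                "Definition": template_id,  # content recognition template ID
            },
        }

    req = build_process_media_request("5285890784363430543", 10)
    print(json.dumps(req))
    ```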

    Obtaining Result

    After initiating a video content recognition task, you can wait for the result notification asynchronously or query the task synchronously to get the execution result. Below is an example of the result notification received in normal callback mode after a content recognition task finishes (fields with null values are omitted):

    {
      "EventType":"ProcedureStateChanged",
      "ProcedureStateChangeEvent":{
          "TaskId":"1400155958-Procedure-2e1af2456351812be963e309cc133403t0",
          "Status":"FINISH",
          "FileId":"5285890784363430543",
          "FileName":"Collection",
          "FileUrl":"http://1400155958.vod2.myqcloud.com/xxx/xxx/aHjWUx5Xo1EA.mp4",
          "MetaData":{
              "AudioDuration":243,
              "AudioStreamSet":[
                  {
                      "Bitrate":125599,
                      "Codec":"aac",
                      "SamplingRate":48000
                  }
              ],
              "Bitrate":1459299,
              "Container":"mov,mp4,m4a,3gp,3g2,mj2",
              "Duration":243,
              "Height":1080,
              "Rotate":0,
              "Size":44583593,
              "VideoDuration":243,
              "VideoStreamSet":[
                  {
                      "Bitrate":1333700,
                      "Codec":"h264",
                      "Fps":29,
                      "Height":1080,
                      "Width":1920
                  }
              ],
              "Width":1920
          },
          "AiRecognitionResultSet":[
              {
                  "Type":"FaceRecognition",
                  "FaceRecognitionTask":{
                      "Status":"SUCCESS",
                      "ErrCode":0,
                      "Message":"",
                      "Input":{
                          "Definition":10
                      },
                      "Output":{
                          "ResultSet":[
                              {
                                  "Id":183213,
                                  "Type":"Default",
                                  "Name":"John Smith",
                                  "SegmentSet":[
                                      {
                                          "StartTimeOffset":10,
                                          "EndTimeOffset":12,
                                          "Confidence":97,
                                          "AreaCoordSet":[
                                              830,
                                              783,
                                              1030,
                                              599
                                          ]
                                      },
                                      {
                                          "StartTimeOffset":12,
                                          "EndTimeOffset":14,
                                          "Confidence":97,
                                          "AreaCoordSet":[
                                              844,
                                              791,
                                              1040,
                                              614
                                          ]
                                      }
                                  ]
                              },
                              {
                                  "Id":236099,
                                  "Type":"Default",
                                  "Name":"Jane Smith",
                                  "SegmentSet":[
                                      {
                                          "StartTimeOffset":120,
                                          "EndTimeOffset":122,
                                          "Confidence":96,
                                          "AreaCoordSet":[
                                              579,
                                              903,
                                              812,
                                              730
                                          ]
                                      }
                                  ]
                              }
                          ]
                      }
                  }
              }
          ],
          "TasksPriority":0,
          "TasksNotifyMode":""
      }
    }
    

    In the callback result, ProcedureStateChangeEvent.AiRecognitionResultSet contains the result of face recognition (Type is FaceRecognition).

    According to the content of Output.ResultSet, two people are recognized: John Smith and Jane Smith. SegmentSet indicates when (from StartTimeOffset to EndTimeOffset) and where (coordinates specified by AreaCoordSet) the two people appear in the video.
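    The extraction described above can be sketched as a small parser: given the callback JSON, it collects each recognized person together with their appearance segments. Only standard-library JSON handling is used, and the field names come straight from the example callback.

    ```python
    import json

    def extract_face_segments(callback_json):
        """Pull (name, start, end, confidence) tuples out of a
        ProcedureStateChanged callback's face recognition results."""
        event = json.loads(callback_json)["ProcedureStateChangeEvent"]
        hits = []
        for result in event.get("AiRecognitionResultSet", []):
            if result.get("Type") != "FaceRecognition":
                continue  # skip other recognition types
            task = result["FaceRecognitionTask"]
            if task.get("Status") != "SUCCESS":
                continue  # skip failed tasks
            for person in task["Output"]["ResultSet"]:
                for seg in person["SegmentSet"]:
                    hits.append((person["Name"],
                                 seg["StartTimeOffset"],
                                 seg["EndTimeOffset"],
                                 seg["Confidence"]))
        return hits

    # Minimal callback body trimmed down from the example above.
    callback = """{"ProcedureStateChangeEvent": {"AiRecognitionResultSet": [
      {"Type": "FaceRecognition", "FaceRecognitionTask": {"Status": "SUCCESS",
       "Output": {"ResultSet": [
         {"Name": "John Smith", "SegmentSet": [
           {"StartTimeOffset": 10, "EndTimeOffset": 12, "Confidence": 97}]}
       ]}}}]}}"""
    print(extract_face_segments(callback))  # [('John Smith', 10, 12, 97)]
    ```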
