SELECT Object Content

Last updated: 2020-10-26 16:08:20

    Overview

    This API uses Structured Query Language (SQL) statements to extract content from a specified object (in CSV or JSON format). In the request, you specify the content delimiters and an appropriate SQL function. COS Select then returns the matching results in the output format you specified.

    For more information on COS Select, see COS SELECT Overview. For more information on SQL expressions in COS Select, see SELECT Command in the Developer Guide.

    Note:

    The Select Object Content API currently supports only virtual-hosted-style requests. Path-style requests are not supported.

    Permission restrictions

    To use COS Select, you must have the cos:GetObject permission.

    • If you are using a root account, you have this permission by default.
    • If you are using a sub-account, ask the root account owner to grant you this permission. For more information on permission settings, see Granting Sub-accounts Access to COS.

    Object formats

    COS Select supports extracting data from objects in the following formats:

    • CSV: an object is stored in CSV format, with its data records separated by a specific delimiter.
    • JSON: an object is stored in JSON format, which can be either a JSON file or a JSON list.
    • Parquet: an object is stored in Parquet format, which can contain nested structures.

    Note:

    • To use COS Select, the object must be UTF-8 encoded.
    • COS Select supports extracting data from CSV- and JSON-formatted objects compressed using GZIP or BZIP2, and Parquet-formatted objects compressed using GZIP or Snappy.
    • COS Select supports extracting data from objects encrypted with SSE-COS.
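
    The format and compression combinations listed above can be checked on the client before a request is sent. The following is a minimal Python sketch of such a check. The names SUPPORTED_COMPRESSION and is_supported are hypothetical, and treating a missing compression value as an uncompressed (and accepted) object is an assumption, not something stated on this page.

```python
from typing import Optional

# Compression types listed in the note above for each input format.
SUPPORTED_COMPRESSION = {
    "CSV": {"GZIP", "BZIP2"},
    "JSON": {"GZIP", "BZIP2"},
    "PARQUET": {"GZIP", "SNAPPY"},
}


def is_supported(input_format: str, compression: Optional[str] = None) -> bool:
    """Return True if the format/compression pair appears in the lists above."""
    allowed = SUPPORTED_COMPRESSION.get(input_format.upper())
    if allowed is None:
        # Not one of CSV, JSON, or Parquet.
        return False
    # Assumption: compression=None means the object is not compressed.
    return compression is None or compression.upper() in allowed
```

    For example, is_supported("CSV", "GZIP") is true, while is_supported("CSV", "Snappy") is false because Snappy is listed only for Parquet.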

    Request

    Sample request

    POST /<ObjectKey>?select&select-type=2 HTTP/1.1
    Host: <BucketName-APPID>.cos.<Region>.myqcloud.com
    Date: date
    Authorization: Auth String
    
    Request body

    Note:

    • Authorization: Auth String (see Request Signature for details).
    • The request parameters select and select-type=2 are both required: select marks the request as a Select request, and select-type=2 specifies the API version.
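
    Because only virtual-hosted requests are accepted, the bucket goes in the host name and the two required parameters go in the query string. The following Python sketch assembles such a URL from the sample request line. The helper name build_select_url, the https scheme, the percent-encoding of the object key, and the example bucket and region values are assumptions for illustration. Request signing is not shown.

```python
from urllib.parse import quote


def build_select_url(bucket: str, region: str, object_key: str) -> str:
    """Build a virtual-hosted-style URL for a Select Object Content request."""
    # The bucket (in <BucketName-APPID> form) is part of the host, not the path.
    host = f"{bucket}.cos.{region}.myqcloud.com"
    # Assumption: the object key is percent-encoded, keeping "/" as a separator.
    path = "/" + quote(object_key.lstrip("/"), safe="/")
    # Both query parameters are required by this API.
    return f"https://{host}{path}?select&select-type=2"


url = build_select_url("examplebucket-1250000000", "ap-guangzhou", "logs/data 1.csv")
print(url)
# https://examplebucket-1250000000.cos.ap-guangzhou.myqcloud.com/logs/data%201.csv?select&select-type=2
```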

    Request headers

    This API only uses common request headers. For more information, see Common Request Headers.

    Request body

    The following sample shows how to initiate a COS Select request to extract all the content from a CSV-formatted object and save the result as a CSV-formatted object.

    <?xml version="1.0" encoding="UTF-8"?>
    <SelectRequest>
        <Expression>Select * from COSObject</Expression>
        <ExpressionType>SQL</ExpressionType>
        <InputSerialization>
            <CompressionType>GZIP</CompressionType>
            <CSV>
                <FileHeaderInfo>IGNORE</FileHeaderInfo>
                <RecordDelimiter>\n</RecordDelimiter>
                <FieldDelimiter>,</FieldDelimiter>
                <QuoteCharacter>"</QuoteCharacter>
                <QuoteEscapeCharacter>"</QuoteEscapeCharacter>
                <Comments>#</Comments>
                <AllowQuotedRecordDelimiter>FALSE</AllowQuotedRecordDelimiter>
            </CSV>
        </InputSerialization>
        <OutputSerialization>
            <CSV>
                <QuoteFields>ASNEEDED</QuoteFields>
                <RecordDelimiter>\n</RecordDelimiter>
                <FieldDelimiter>,</FieldDelimiter>
                <QuoteCharacter>"</QuoteCharacter>
                <QuoteEscapeCharacter>"</QuoteEscapeCharacter>
            </CSV>
        </OutputSerialization>
        <RequestProgress>
            <Enabled>FALSE</Enabled>
        </RequestProgress>
    </SelectRequest> 

    The following sample shows how to initiate a COS Select request to extract all the content from a JSON-formatted object and save the result as a JSON-formatted object.

    <?xml version="1.0" encoding="UTF-8"?>
    <SelectRequest>
        <Expression>Select * from COSObject</Expression>
        <ExpressionType>SQL</ExpressionType>
        <InputSerialization>
            <CompressionType>GZIP</CompressionType>
            <JSON>
                <Type>DOCUMENT</Type>
            </JSON>
        </InputSerialization>
        <OutputSerialization>
            <JSON>
                <RecordDelimiter>\n</RecordDelimiter>
            </JSON>                                  
        </OutputSerialization>
        <RequestProgress>
            <Enabled>FALSE</Enabled>
        </RequestProgress>                                  
    </SelectRequest> 

    The following sample shows how to initiate a COS Select request to extract all the content from a Parquet-formatted object and save the result as a JSON-formatted object.

    <?xml version="1.0" encoding="UTF-8"?>
    <SelectRequest>
        <Expression>Select * from COSObject</Expression>
        <ExpressionType>SQL</ExpressionType>
        <InputSerialization>
            <CompressionType>GZIP</CompressionType>
            <Parquet>
            </Parquet>
        </InputSerialization>
        <OutputSerialization>
            <JSON>
                <RecordDelimiter>\n</RecordDelimiter>
            </JSON>                                  
        </OutputSerialization>
        <RequestProgress>
            <Enabled>FALSE</Enabled>
        </RequestProgress>                                  
    </SelectRequest> 

    Note:

    • InputSerialization is a required element that specifies the format for the object to be extracted. It can be set to CSV, JSON or Parquet.
    • OutputSerialization specifies the format in which extraction results are to be saved. It can be set only to CSV or JSON.
    • The two formats above do not need to be the same. For example, you may extract data from an object in JSON format and save the extraction result in CSV format.
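
    To illustrate mixing formats, the following Python sketch builds a request body that reads a JSON-formatted object and writes CSV results, using only element names that appear in the samples above. The function name build_select_body is hypothetical. The record delimiter is copied as the two characters \n shown in the samples; whether the service expects that sequence or a literal newline is not stated here, so treat it as an assumption.

```python
import xml.etree.ElementTree as ET


def build_select_body(expression: str, compression: str = "GZIP") -> bytes:
    """Build a SelectRequest body with JSON input and CSV output."""
    root = ET.Element("SelectRequest")
    ET.SubElement(root, "Expression").text = expression
    ET.SubElement(root, "ExpressionType").text = "SQL"

    # Input side: a compressed JSON object, as in the JSON sample above.
    input_ser = ET.SubElement(root, "InputSerialization")
    ET.SubElement(input_ser, "CompressionType").text = compression
    json_in = ET.SubElement(input_ser, "JSON")
    ET.SubElement(json_in, "Type").text = "DOCUMENT"

    # Output side: CSV settings mirrored from the CSV sample above.
    output_ser = ET.SubElement(root, "OutputSerialization")
    csv_out = ET.SubElement(output_ser, "CSV")
    ET.SubElement(csv_out, "QuoteFields").text = "ASNEEDED"
    # Assumption: backslash + "n" is sent literally, as printed in the samples.
    ET.SubElement(csv_out, "RecordDelimiter").text = "\\n"
    ET.SubElement(csv_out, "FieldDelimiter").text = ","
    ET.SubElement(csv_out, "QuoteCharacter").text = '"'
    ET.SubElement(csv_out, "QuoteEscapeCharacter").text = '"'

    progress = ET.SubElement(root, "RequestProgress")
    ET.SubElement(progress, "Enabled").text = "FALSE"

    # Serialize with an XML declaration, matching the samples.
    return ET.tostring(root, encoding="UTF-8", xml_declaration=True)


body = build_select_body("Select * from COSObject")
```

    Building the body with an XML library rather than string concatenation means characters such as the double quote or an ampersand in the expression are escaped correctly.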

    The following table shows all elements in a request body:

    Node Name | Parent Node | Description | Type | Required
    Expression | SelectRequest | An SQL expression that represents the extraction operation to initiate, such as SELECT s._1 FROM COSObject s, which extracts the first column of data from a CSV-formatted object. For more information on SQL expressions, see SELECT Command. | String | Yes