Hive Catalogs

Last updated: 2023-11-07 16:26:48

    Overview

    You can configure and use Hive catalogs and view the Hive metadata of a SQL job in the Stream Compute Service console. Once metadata is stored in a Hive metastore, you no longer need to declare tables with DDL statements in each job; you can reference them directly in the three-segment format catalog_name.database_name.table_name.

    Hive support

    Flink Version    Description
    1.11             Not supported
    1.13             Hive v2.2.0, v2.3.2, v2.3.5, and v3.1.1 supported
    1.14             Not supported

    Prerequisites

    You have started the Hive metastore service. The related commands are as follows:
    hive --service metastore: Start the Hive metastore service.
    ps -ef | grep metastore: Check whether the service started successfully.
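
    The two commands above can be combined into a small helper that starts the metastore only when it is not already running. This is a minimal sketch, not an official script: the function names and the METASTORE_LOG variable (with its /tmp default) are hypothetical, and it assumes the hive binary is on the PATH.

```shell
# Returns 0 if a Hive metastore process appears in the process list.
metastore_running() {
    # The [m] bracket keeps grep from matching its own command line.
    ps -ef | grep -q '[m]etastore'
}

# Starts the metastore in the background unless one is already running.
start_metastore() {
    if metastore_running; then
        echo "metastore already running"
        return 0
    fi
    # METASTORE_LOG is a hypothetical variable; the default path is an assumption.
    nohup hive --service metastore > "${METASTORE_LOG:-/tmp/hive-metastore.log}" 2>&1 &
    echo "metastore start requested"
}
```

    Call start_metastore, wait a few seconds, and then call metastore_running to confirm that the process is up.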

    Directions

    Creating a Hive catalog

    Switch to the _dc directory and click Create Hive catalog. Upload the configuration files hive-site.xml (with the metastore URLs added to it), hdfs-site.xml, hivemetastore-site.xml, and hiveserver2-site.xml (download them here).
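
    Before uploading, it can help to confirm that hive-site.xml actually contains the metastore address. The sketch below assumes that the "urls" mentioned above refers to the standard Hive property hive.metastore.uris; the function name is hypothetical.

```shell
# Returns 0 if the given hive-site.xml defines the hive.metastore.uris property.
# Assumption: this is the property meant by "adding urls" in the step above.
has_metastore_uris() {
    # -F treats the pattern as a fixed string, so the dots are matched literally.
    grep -qF '<name>hive.metastore.uris</name>' "$1"
}
```

    For example, has_metastore_uris hive-site.xml || echo "hive.metastore.uris is missing" prints a warning when the property is absent.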

    Creating a database

    You can create databases in a SQL job. The database reference uses a two-segment format of catalog_name.database_name.
    CREATE DATABASE IF NOT EXISTS `hiveCatalogName`.`databaseName`;

    Creating a table

    You can create tables in a SQL job. The table reference uses a three-segment format of catalog_name.database_name.table_name.
    CREATE TABLE IF NOT EXISTS `hiveCatalogName`.`databaseName`.`tableName` (
        user_id INT,
        item_id INT,
        category_id INT,
        -- ts AS localtimestamp,
        -- WATERMARK FOR ts AS ts,
        behavior VARCHAR
    ) WITH (
        'connector' = 'datagen',
        'rows-per-second' = '1', -- The number of records generated per second
        'fields.user_id.kind' = 'sequence', -- Bounded sequence generator (the output stops automatically once the sequence ends)
        'fields.user_id.start' = '1', -- The start value of the sequence
        'fields.user_id.end' = '10000', -- The end value of the sequence
        'fields.item_id.kind' = 'random', -- Unbounded random generator
        'fields.item_id.min' = '1', -- The minimum random value
        'fields.item_id.max' = '1000', -- The maximum random value
        'fields.category_id.kind' = 'random', -- Unbounded random generator
        'fields.category_id.min' = '1', -- The minimum random value
        'fields.category_id.max' = '1000', -- The maximum random value
        'fields.behavior.length' = '5' -- The length of the random string
    );

    Referencing a Hive catalog table in a SQL job

    In the SQL job, place the cursor where the metadata table is to be inserted, find the target table on the left sidebar, and click Operation > Reference.
    INSERT INTO
    `hiveCatalogName`.`databaseName`.`sink_tableName`
    SELECT
    *
    FROM
    `hiveCatalogName`.`databaseName`.`source_tableName`;
    Note
    You can reference only one Hive catalog in a job.
    The DROP operation is unavailable on a Hive catalog.

    Deleting a Hive catalog

    On the left sidebar, click Delete for the Hive catalog you want to delete.

    Granting permissions

    A job that uses Hive catalogs in Stream Compute Service accesses HDFS files while it runs, so the Flink user must be granted access permissions. The details are as follows:
    Execute the following commands on all Hive master nodes.
    useradd flink
    groupadd supergroup
    usermod -a -G supergroup flink
    hdfs dfsadmin -refreshUserToGroupsMappings
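
    The commands above fail if the user or group already exists, so a rerunnable variant can be convenient. The following is a minimal sketch under the assumption of a Linux host with getent available and root privileges; both function names are hypothetical.

```shell
# Creates the group and user only if they are missing, then repeats the
# membership and refresh steps from the commands above. Requires root.
ensure_flink_user() {
    getent group supergroup > /dev/null || groupadd supergroup
    id -u flink > /dev/null 2>&1 || useradd flink
    usermod -a -G supergroup flink
    hdfs dfsadmin -refreshUserToGroupsMappings
}

# Returns 0 if user $1 belongs to group $2 according to `id -Gn`.
user_in_group() {
    id -Gn "$1" 2>/dev/null | tr ' ' '\n' | grep -qxF "$2"
}
```

    After running ensure_flink_user on each node, user_in_group flink supergroup should succeed. Running hdfs groups flink additionally shows the group mapping as seen by HDFS.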
    We recommend you grant the permissions in Hive, and add the following configuration options to hive-site.xml:
    <property>
      <name>hive.metastore.authorization.storage.checks</name>
      <value>true</value>
      <description>Should the metastore do authorization checks against
      the underlying storage for operations like drop-partition (disallow
      the drop-partition if the user in question doesn't have permissions
      to delete the corresponding directory on the storage).</description>
    </property>
    