tencent cloud

Feedback

Data Dictionary

Last updated: 2024-01-19 16:45:30

    Feature Overview

    ClickHouse offers internal dictionaries and external dictionaries. The former predefines the content and the latter offers various definitions from external data sources. You can use special functions along with a dictionary for query, which is simpler and more efficient than the combination of JOIN and a reference table. This document describes how to use external dictionaries in the console.

    Notes

    1. A data dictionary maintained in the console will take effect on all ClickHouse nodes.
    2. A data dictionary added in the console can only be maintained (modified and deleted) in the console.
    3. A data dictionary added using SQL commands or other backend methods cannot be maintained (modified and deleted) in the console.
    4. A dictionary table created in the console will be added to the Cloud Data Warehouse global database cdwch_{instanceId}_dictionary_database by default.
    Note:
    Do not maintain a dictionary via both the console and other means (such as SQL commands and other backend means). Otherwise, content may be overwritten or lost (such as due to accidental deletion).

    Directions in Console

    1. Log in to the Cloud Data Warehouse console, select the target cluster in Cluster List, and click the Data Dictionaries tab on the cluster details page.
    2. View the list of existing external dictionaries on the current page.
    
    3. Click Create Dictionary in the top-left corner, set the fields as prompted, and click OK to create a dictionary.
    Parameter
    Description
    Reference
    Dictionary Name
    Enter a custom external dictionary name, which can contain 2–16 lowercase letters, digits, and underscores and must start with a letter and end with a letter or digit.
    -
    Data Source
    Select the type of data source for the external dictionary. Currently, the Cloud Data Warehouse console supports the MySQLClickHouse data source.
    Data Source Connection
    Configure data source information to verify the connectivity of the configured external source. Configuration items include:
    
    HOST: IP address or domain name. Currently, only VPC connections are supported.
    TCP PORT: TCP port
    USER: external source account
    PASSWORD: password of the external source account
    -
    Source Table Information
    Select the databases and tables of the dependent external source and enter `WHERE` (table filter condition) and `INVALIDATEQUERY` (for querying/checking the dictionary status, with only the updated data extracted).
    Data Structure
    Set the primary key and general fields of the external dictionary, including:
    
    PRIMARY_KEY: single or composite primary key.
    COLUME_NAME: field type. Currently, the console supports the following types: UInt8, UInt16, UInt32, UInt64, Int8, Int16, Int32, Int64, Float32, Float64, Date, DateTime, UUID, and String.
    DEFAULT_VALUE: default value for empty fields
    EXPRESSION: expression to describe fields (if applicable)
    IS_HIERARCHICAL: indicates the support for hierarchy. The default value is `false`.
    S_INJECTIVE: indicates the inline mapping "id -> attribute". The default value is `false`.
    Storage Format
    Select the type of memory layout for the external dictionary, including:
    
    FLAT: stores the entire dictionary in memory as a flat array, which is suitable for a single primary key.
    HASHED: stores the entire dictionary in memory as a hash table, which is suitable for a single primary key.
    RANGE_HASHED: stores the entire dictionary in memory as a hash table. It comes with an ordered array of ranges and corresponding values, which is suitable for a single primary key. You need to set the fields to represent the ranges.
    CACHE: stores the entire dictionary in a cache with a certain number of cells, which is suitable for a single primary key. You need to set the cache size.
    COMPLEX_KEY_HASHED: similar to HASHED and suitable for composite primary keys.
    COMPLEX_KEY_CACHE: similar to CACHE and suitable for composite primary keys. You need to set the cache size.
    Update Interval
    Set the frequency of updating data in the dictionary. The unit is s.
    3. After the external dictionary is created, you can view, modify, and delete it in the list.

    Managing External Dictionary via SQL

    1. Create a dictionary.
    CREATE DICTIONARY [IF NOT EXISTS] [db.]dictionary_name [ON CLUSTER cluster]
    (
    key1 type1 [DEFAULT|EXPRESSION expr1] [IS_OBJECT_ID],
    key2 type2 [DEFAULT|EXPRESSION expr2],
    attr1 type2 [DEFAULT|EXPRESSION expr3] [HIERARCHICAL|INJECTIVE],
    attr2 type2 [DEFAULT|EXPRESSION expr4] [HIERARCHICAL|INJECTIVE]
    )
    PRIMARY KEY key1, key2
    SOURCE(SOURCE_NAME([param1 value1 ... paramN valueN]))
    LAYOUT(LAYOUT_NAME([param_name param_value])``)
    LIFETIME({MIN min_val MAX max_val | max_val})
    2. View the dictionary.
    SELECT * FROM system.dictionaries
    3. Delete the dictionary.
    DROP DICTIONARY <database name>.<dictionary name>
    For more information on dictionary SQL, see CREATE DICTIONARY.

    Using External Dictionary

    You can query some types of dictionaries using general SELECT:
    SELECT * FROM <database name>.<dictionary name>
    You can also query them using dictionary functions:
    dictGet('dict_name', attr_names, id_expr)
    dictGetOrDefault('dict_name', attr_names, id_expr, default_value_expr)
    dictGetOrNull('dict_name', attr_name, id_expr)
    For more information on dictionary functions, see Functions for Working with External Dictionaries.
    Contact Us

    Contact our sales team or business advisors to help your business.

    Technical Support

    Open a ticket if you're looking for further assistance. Our Ticket is 7x24 avaliable.

    7x24 Phone Support