MergeTree

Last updated: 2024-01-19 16:45:30
The MergeTree table engine is used to analyze a very large amount of data. It supports data partitioning, primary key indexing, sparse indexing, data TTL, and other features.
Columnar storage: Only reads the required columns, reducing I/O and CPU usage.
Data partitioning: Splits data into partitions by date or other conditions to simplify data management and queries.
Primary key indexing: Sorts data by primary key or sorting key to speed up range queries.
Secondary data-skipping indexes: Skips data that does not meet the conditions based on statistical information such as the minimum and maximum values of a column to further improve query efficiency.
Data merge: Periodically merges small data parts in the background to reduce data redundancy and fragmentation.
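The features above can be sketched in a single table definition. This is a minimal illustration rather than a recommended schema: the table name `events`, its columns, the index name `idx_duration`, and the 90-day retention period are all assumptions made for the example.

```sql
-- Hypothetical table; every name and value here is illustrative.
CREATE TABLE events
(
    event_date Date,
    event_time DateTime,
    user_id    UInt64,
    url        String,
    duration   UInt32,
    -- Secondary data-skipping index: stores the min/max of `duration`
    -- for every block of 4 index granules.
    INDEX idx_duration duration TYPE minmax GRANULARITY 4
)
ENGINE = MergeTree
PARTITION BY toYYYYMM(event_date)     -- data partitioning: one partition per month
ORDER BY (user_id, event_time)        -- sorting key; also the primary key unless PRIMARY KEY is given
TTL event_date + INTERVAL 90 DAY      -- data TTL: expired rows are removed during merges
SETTINGS index_granularity = 8192;    -- rows per sparse-index mark (the default value)

-- Only the user_id, event_date, and duration columns are read (columnar storage).
-- The event_date filter allows partition pruning, and the user_id filter
-- can use the primary key index.
SELECT count(), avg(duration)
FROM events
WHERE user_id = 42
  AND event_date >= '2024-01-01'
  AND event_date <  '2024-02-01';
```

Inserted data is written as new parts, and the background merge process combines them later, so no extra statement is needed for merging in normal operation.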
The MergeTree engine also has variants, such as ReplicatedMergeTree, AggregatingMergeTree, and SummingMergeTree, which add data replication, data aggregation, and data summation, respectively, to the basic MergeTree features. The MergeTree engine supports all ClickHouse SQL syntax, but some features behave differently from standard SQL.
The following table describes the usage of MergeTree and its variants:
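| Family | Table Engine | Description |
| --- | --- | --- |
| MergeTree | MergeTree | Used to insert a very large amount of data into a table. The data is quickly written to the table part by part, and the parts are merged in the background according to rules. |
| MergeTree | ReplacingMergeTree | Used to remove duplicate entries with the same sorting key (the primary key by default). Deduplication happens during merges, so duplicates can remain until a merge runs. |
| MergeTree | CollapsingMergeTree | Used where ReplacingMergeTree is too limited. During merges, it collapses (deletes) pairs of rows that have the same sorting key and opposite values (1 and -1) in the Sign column. This reduces the storage volume and makes SELECT queries more efficient. |
| MergeTree | VersionedCollapsingMergeTree | Serves the same purpose as CollapsingMergeTree but adds a Version column, so rows collapse correctly even when they are inserted out of order. |
| MergeTree | SummingMergeTree | Used to sum the numeric columns of rows with the same sorting key into a single row. |
| MergeTree | AggregatingMergeTree | Used to aggregate rows with the same sorting key into a single row that stores combined aggregate function states. |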
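As a rough sketch of how two of these variants behave, the statements below create a ReplacingMergeTree table and a SummingMergeTree table. The table names, columns, and sample rows are hypothetical and chosen only for illustration.

```sql
-- ReplacingMergeTree: keeps one row per sorting key.
-- `updated_at` is passed as the version column, so the row with the
-- largest updated_at is the one that is kept.
CREATE TABLE user_profiles
(
    user_id    UInt64,
    name       String,
    updated_at DateTime
)
ENGINE = ReplacingMergeTree(updated_at)
ORDER BY user_id;

INSERT INTO user_profiles VALUES (1, 'alice',    '2024-01-01 00:00:00');
INSERT INTO user_profiles VALUES (1, 'alice_v2', '2024-01-02 00:00:00');

-- Duplicates are removed only when parts are merged in the background,
-- so FINAL is used here to apply the deduplication at query time.
SELECT * FROM user_profiles FINAL;

-- SummingMergeTree: rows with the same sorting key are merged into one
-- row in which the `views` column is summed.
CREATE TABLE page_views_daily
(
    day   Date,
    page  String,
    views UInt64
)
ENGINE = SummingMergeTree(views)
ORDER BY (day, page);

INSERT INTO page_views_daily VALUES ('2024-01-01', '/home', 10);
INSERT INTO page_views_daily VALUES ('2024-01-01', '/home', 5);

-- Merges are not guaranteed to have run yet, so queries should still
-- aggregate explicitly.
SELECT day, page, sum(views) AS views
FROM page_views_daily
GROUP BY day, page;
```

In both cases the engine only reduces the data during merges. Queries that need exact results should therefore use FINAL or an explicit GROUP BY, as shown above.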
Note
In a production environment, add the Replicated prefix to the table engine name to create a table with multiple replicas. For more information, see Data Replication. The replicated engines include the following:
ReplicatedMergeTree
ReplicatedSummingMergeTree
ReplicatedReplacingMergeTree
ReplicatedAggregatingMergeTree
ReplicatedCollapsingMergeTree
ReplicatedVersionedCollapsingMergeTree
ReplicatedGraphiteMergeTree
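A replicated table can be sketched as follows. The table name, the columns, and the coordination path `/clickhouse/tables/{shard}/events_replicated` are assumptions for this example. The `{shard}` and `{replica}` placeholders are macros that are assumed to be defined in each server's configuration; if they are not, literal values would need to be substituted.

```sql
-- Hypothetical replicated table. The first engine argument is the table's
-- path in ZooKeeper or ClickHouse Keeper (shared by all replicas of a shard);
-- the second is the replica name (unique per replica).
CREATE TABLE events_replicated
(
    event_date Date,
    user_id    UInt64,
    duration   UInt32
)
ENGINE = ReplicatedMergeTree('/clickhouse/tables/{shard}/events_replicated', '{replica}')
PARTITION BY toYYYYMM(event_date)
ORDER BY (user_id, event_date);
```

The same statement is run on each replica. Data inserted on one replica is then fetched by the others, and the remaining clauses work as they do for a non-replicated MergeTree table.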

