tencent cloud

Tencent Cloud TCHouse-C

DokumentasiTencent Cloud TCHouse-C

Using Tencent Cloud TCHouse-C for Real-Time Data Updates

Unduh
Mode fokus
Ukuran font
Terakhir diperbarui: 2026-06-18 17:33:07

Feature Overview

Tencent Cloud TCHouse-C supports the data real-time update feature, which provides full UPSERT semantics. When you use the UPSERT feature, data updates and deletions take effect synchronously.
Note:
This is a new feature of Tencent Cloud TCHouse-C. If needed, contact online customer service for consultation and usage.

Applicable Scenarios

The data real-time update of Tencent Cloud TCHouse-C takes effect synchronously. Its applicable scenarios are as follows:
For rows that meet certain conditions, modify the values of one or more fields (sort keys, partition keys, and unique keys cannot be modified).
Click Update or Delete, and use the WHERE clause to restrict the update or deletion to a small number of rows.
The UPSERT feature is supported only in MergeTree/ReplicatedMergeTree table engines defined with a UNIQUE KEY. Other table engines, such as replacingMergeTree or VersionedCollapsingMergeTree, do not support backup.

Performance

The performance of UPSERT is related to the number of rows to be updated, the cross-partition distribution of the data to be updated, and the data filtering conditions. Specifically:
In a single update operation, the smaller the number of rows to be updated, the lower the update cost.
In a single update operation, the fewer partitions the data to be updated belongs to (ideally, all data belongs to the same partition), the lower the update cost.
In a single update operation, the lower the execution cost of data filtering (the WHERE clause), the lower the update cost. (It is generally recommended that the WHERE clause includes sort key fields).

Must-Knows

When the real-time update feature is compared with open-source ClickHouse, you need to consider its particularities when using the real-time update feature.

Avoiding Concurrent Updates to the Same UNIQUE KEY

Tencent Cloud TCHouse-C uses Merge-On-Write to implement row-level data updates. A single update operation consists of two steps: data reading and data writing. If concurrent update operations are performed on the same UNIQUE KEY, result indeterminacy may occur.
To address this issue, a user-defined version field will be introduced in a future version to resolve data conflicts at the business level.

Using More Memory Resources

To ensure data uniqueness, Tencent Cloud TCHouse-C uses HASH indexes, which require a certain amount of memory. For 2 billion rows of data, using a UInt64 field as the key consumes approximately 70 GB of memory. Use a high-memory instance type whenever possible. Otherwise, if the memory is exhausted due to excessive data, there is a risk that the system may fail to start.

Avoiding Batch Updates

Tencent Cloud TCHouse-C implements data updates through Merge-On-Write. Large-batch updates amplify the latency of data writes to nodes. At the business level, use small-batch updates, such as point updates, whenever possible.

UNIQUE KEY Data Uniqueness Constraint Is Only Valid Within a Node

Tencent Cloud TCHouse-C maintains indexes within nodes to ensure the uniqueness of UNIQUE KEYs. If identical data is distributed across different nodes, the cluster cannot guarantee its uniqueness. At the business level, you need to write the same KEY to the same SHARD. Exercise caution when horizontal scaling is performed, as it will change the node-key mapping relationship.

Avoiding Multi-Partition Updates in a Single SQL Statement

To avoid excessive small files in the system and reduce update costs, it is recommended that a single SQL statement be used to update data within only one partition.

Operation Steps

Creating Tables

Below is an example:
CREATE TABLE IF NOT EXISTS realtime_insert(
id UInt32,
value1 UInt32,
value2 UInt32
) ENGINE = MergeTree()
UNIQUE KEY id
ORDER BY id;
A UNIQUE KEY definition indicates that the table will be created as an upsert MergeTree type table. The `id` field serves as the unique key, which is used to generate a unique index for the table data and to deduplicate data. If you do not specify a UNIQUE KEY definition, the table will not have the upsert feature and will be no different from a regular MergeTree table.
Note:
When a UNIQUE KEY is selected, it is recommended to use only one field. Using multiple fields as the unique key consumes more memory.
The unique key selected for the UNIQUE KEY must be the first column in the column set specified by ORDER BY or PRIMARY KEY, or a prefix set of that column set.
The column types supported for a UNIQUE KEY are limited to Int8/16/32/64/128, UInt8/16/32/64/128, UUID, IPV4, and String. Columns of other types cannot be used as a UNIQUE KEY. Additionally, the column type cannot be Nullable or an expression.
The projection feature is not supported on tables that have a UNIQUE KEY specified.
For tables with a specified UNIQUE KEY, only the MergeTree and ReplicatedMergeTree(beta) engines are supported. Other Engine types are not supported.
ReplicatedMergeTree tables that have a UNIQUE KEY specified do not support the ttl feature, nor do they support drop part or drop partition operations.
Column operations, such as adding new columns, deleting old columns, changing column types, or dropping columns, cannot be performed on tables that have a UNIQUE KEY specified.

INSERT INTO Real-Time Updates

Tables with a specified UNIQUE KEY support real-time deduplication during write operations:
insert into realtime_insert values (1,2,2);
insert into realtime_insert values (1,3,3);

9.0.16.14 :) select * from realtime_insert;
SELECT *
FROM realtime_insert

Query id: e19c1105-9b13-4d4f-810d-596bc9cfb62b
┌─id─┬─value1─┬─value2─┐
133
└────┴────────┴────────┘
If the values of the UNIQUE KEY fields for two write operations are identical, the data from the later write will overwrite the data from the earlier write, ensuring final data deduplication.

UPSERT Real-Time Updates

Tables with a specified UNIQUE KEY support the update statement for updating data within the table. Syntax:
update [db.]table set column1 = expr1 [, ...] WHERE filter_expr
Below is an example:
insert into realtime_insert values (1,3,3);
update realtime_insert set value1 = 4 where id = 1;
9.0.16.14 :) select * from realtime_insert;

SELECT *
FROM realtime_insert

Query id: 3972121e-6474-44ee-bd48-5168d2e54c74

┌─id─┬─value1─┬─value2─┐
143
└────┴────────┴────────┘
Note:
The fields selected by the UNIQUE KEY or PRIMARY KEY cannot be updated using the update syntax.
Updating a large volume of data consumes significant performance resources and results in a longer execution time.

DELETE Real-Time Deletion

Tables with a specified UNIQUE KEY support the DELETE statement for deleting data. Syntax:
delete [db.]table WHERE filter_expr
Below is an example:
insert into realtime_insert values (1,3,3);
insert into realtime_insert values (2,3,3);
delete from realtime_insert where id = 1;

9.0.16.14 :) select * from realtime_insert;

SELECT *
FROM realtime_insert

Query id: 107c7db9-072f-4295-a63d-2a8912cde1e1

┌─id─┬─value1─┬─value2─┐
233
└────┴────────┴────────┘
Note:
Deleting a large volume of data consumes significant performance resources and results in a longer execution time.


Bantuan dan Dukungan

Apakah halaman ini membantu?

masukan