Creating Data Consistency Check Task

Last updated: 2023-03-02 11:41:32

Overview

During data consistency check, DTS compares the table data between the source and target databases and outputs the comparison result and inconsistency details for you to quickly process the inconsistent data. A data consistency check task is independent and does not affect the normal business in the source database or other DTS tasks.
Notes
The following sync links currently support data consistency check:
MySQL/MariaDB/Percona/TDSQL-C for MySQL > MySQL
MySQL/MariaDB/Percona > MariaDB
MySQL/MariaDB/TDSQL-C for MySQL > TDSQL-C for MySQL

Notes

During data consistency check, only the database/table objects selected in the source database are compared with those synced to the target database. Consistency is not checked for data written during sync, for views, or for other advanced objects (such as procedures and functions).
A data consistency check task may increase the load on the source database instance, so run such tasks during off-peak hours.
A data consistency check task can be executed repeatedly, but one DTS instance can initiate only one such task at any time.
A table to be checked must have a primary key or unique key; otherwise, it will be skipped by DTS during the check.
If you choose to stop a sync task before a data consistency check task is completed, the check task will fail.
Data consistency check requires creating a new database __tencentdb__ in the source database and writing the checksum table to it. If the source database is read-only, data consistency check will be skipped.

Restrictions

Currently, check tasks are not aware of DDL operations. If you perform DDL operations in the source database during sync, the check result may not match the actual data, and you need to initiate another check task to get an accurate comparison result.
Data check is supported only for one-way and two-way syncs. It is not supported for syncs with complicated topologies such as many-to-one, one-to-many, ring, and star sync.
If only certain DML and DDL items are selected in the sync task configuration, or a WHERE condition is used for filtering, data in the source and target databases will become inconsistent, and consistency check is not supported. To perform a consistency check, select all DML and DDL items.
Note that when a data check task is created, the check result may be inconsistent with the actual data if the sync task is configured as follows.
Full data initialization is not selected for Initialization Type. In this case, the data in the source and target databases may be inconsistent, causing the check result to be Inconsistent.
Ignore is selected for Primary Key Conflict Resolution. In this case, the data in the source and target databases may become inconsistent upon a conflict, causing the check result to be Inconsistent.
For existing tasks created before the consistency check feature was released on January 12, 2023, check tasks cannot be created directly because the DTS version is too old. To create a check task, submit a ticket to request an upgrade.

How it works

DTS consistency check on MySQL databases relies on row mode (binlog_format=row), which replicates data from the source to the replica correctly and thus keeps the data safe.
1. Create the checksum table __tencentdb__.Checksums in the source database to store the data comparison information during the sync task.
2. Select the non-null unique key of the target table as the fixed check field.
3. Calculate the checksum crc1 and row count count1 for the source database and write them into __tencentdb__.Checksums in the source database.
As in a chunk-based check, the CRC calculation selects a fixed range based on the fixed check field (for example, the rows of table A with primary keys from 1 to 1000), concatenates the data row by row, and calculates a CRC value (crc) for each chunk. It then calculates crc1 for all the data in the source database.
4. DTS parses the binlog data in row mode, restores the SQL statement for writing the checksum into the source database, and replays the SQL statement in the target database.
In the target database, use the same variables as those in the source database to calculate the checksum and row count to get crc2 and count2.
5. Compare the checksum and row count values of the source and target databases and display the comparison result.
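The chunk-based comparison in the steps above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the DTS implementation: in-memory lists of rows stand in for the source and target tables, the keys are positive integers, the row serialization format is invented, and the helper names chunk_checksums and inconsistent_chunks are hypothetical.

```python
import zlib
from typing import Dict, List, Tuple

Row = Tuple[int, str, int]  # (primary key, name, amount): hypothetical table layout


def chunk_checksums(rows: List[Row], chunk_size: int) -> Dict[int, Tuple[int, int]]:
    """Map the first key of each fixed key range to (crc, row_count)."""
    chunks: Dict[int, Tuple[int, int]] = {}
    for row in sorted(rows, key=lambda r: r[0]):
        # Keys 1..chunk_size fall in the chunk starting at 1, and so on.
        first_key = ((row[0] - 1) // chunk_size) * chunk_size + 1
        crc, count = chunks.get(first_key, (0, 0))
        # Concatenate the row's columns and fold them into the chunk's running CRC.
        line = ("#".join(str(col) for col in row) + "\n").encode("utf-8")
        chunks[first_key] = (zlib.crc32(line, crc), count + 1)
    return chunks


def inconsistent_chunks(source: List[Row], target: List[Row],
                        chunk_size: int = 1000) -> List[int]:
    """Return the first key of every chunk whose CRC or row count differs."""
    src = chunk_checksums(source, chunk_size)
    dst = chunk_checksums(target, chunk_size)
    return sorted(k for k in src.keys() | dst.keys() if src.get(k) != dst.get(k))


source = [(1, "a", 10), (2, "b", 20), (1001, "c", 30)]
target = [(1, "a", 10), (2, "b", 21), (1001, "c", 30)]
print(inconsistent_chunks(source, target))  # [1]
```

Because chunks are defined by key range rather than by row position, a row missing on one side changes only the checksum and row count of its own chunk, which keeps the reported inconsistency local to one index range.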

Creating a data consistency check task

1. Log in to the DTS console.
2. On the Data Sync page, select the target sync task, select Operation > More, and click Create Data Consistency Check Task.
3. Click Create Data Consistency Check Task.
Notes
A data consistency check task can be created only when the Source-Target Database Data Gap is smaller than 100 MB. If the button is grayed out, the sync task status does not meet the requirements. Possible reasons: only certain DML or DDL items are selected in the task configuration, a WHERE condition is set for filtering, the task has failed, the source-target database data gap is greater than 100 MB, or the sync topology is complicated.
4. In the pop-up window, click OK.
5. After configuring data consistency check parameters, click Create and Start Consistency Check Task.
Parameter: Description
Check Object:
    All sync objects: Check all the objects selected in the sync task.
    Custom: Check the selected sync objects.
Comparison Type:
    Full comparison: Check all the data for the selected objects.
    Sampling: Check the data of a certain proportion (10%, 20%, 30%, ..., 90%) for the selected objects.
    Row count comparison: Only compare the data row count for the selected check objects.
Thread Count:
    The value ranges from 1 to 5 and can be set as needed. Increasing the number of threads can speed up the consistency check, but will also increase the load of the source and target databases.
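If you keep check settings in your own tooling before entering them in the console, the documented value ranges can be validated up front. The sketch below is an assumption throughout: CheckTaskConfig, its field names, and the comparison type identifiers are hypothetical and do not correspond to any DTS API. Only the ranges (sampling of 10% to 90% in steps of 10, and 1 to 5 threads) come from the parameter descriptions above.

```python
from dataclasses import dataclass
from typing import List

# Hypothetical identifiers for the three comparison types listed above.
COMPARISON_TYPES = {"full", "sampling", "row_count"}


@dataclass
class CheckTaskConfig:
    comparison_type: str = "full"
    sampling_percent: int = 10  # only meaningful when comparison_type == "sampling"
    thread_count: int = 1


def validate(config: CheckTaskConfig) -> List[str]:
    """Return a list of problems; an empty list means the values are in range."""
    problems = []
    if config.comparison_type not in COMPARISON_TYPES:
        problems.append("comparison_type must be one of full, sampling, row_count")
    if (config.comparison_type == "sampling"
            and config.sampling_percent not in range(10, 100, 10)):
        problems.append("sampling_percent must be 10, 20, ..., 90")
    if not 1 <= config.thread_count <= 5:
        problems.append("thread_count must be between 1 and 5")
    return problems


print(validate(CheckTaskConfig(comparison_type="sampling",
                               sampling_percent=95, thread_count=6)))
```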

Viewing the data consistency check result

1. On the Data Sync page, select the target sync task, select Operation > More, and click Create Data Consistency Check Task.
2. In the Operation column, click View.
If the data is consistent, the result is as follows:
You can view the numbers of estimated tables, checked tables, inconsistent tables, and inconsistent chunks. The number of estimated tables is an estimate that may differ from the actual value, because providing an accurate value would compromise the overall check performance.
Possible causes for not checking a table: There is no primary key or non-null unique key, the table is empty, the engine type is not supported, or the table does not exist.
For inconsistent data, you need to manually confirm the corresponding data content of the source and target databases. Specifically, compare the values based on Database, Data Table, Index Name, Last Index Key, and First Index Key parameters displayed on the page.
The steps are as follows:
1. Log in to the source database and query the prompted index range.
SELECT * FROM table_name WHERE col_index >= 1 AND col_index <= 5;
2. Log in to the target database and query the prompted index range.
3. Compare the data of the source and target databases.
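Once both queries have returned, the row-by-row comparison in step 3 can be done with a small script. The following is a minimal sketch, assuming the query results have already been fetched as lists of tuples whose first element is the index column value. The function name diff_rows and the sample rows are hypothetical.

```python
from typing import Dict, List, Tuple


def diff_rows(source_rows: List[Tuple], target_rows: List[Tuple]) -> Dict[str, list]:
    """Compare two result sets keyed by their first column (the index column)."""
    src = {row[0]: row for row in source_rows}
    dst = {row[0]: row for row in target_rows}
    return {
        # Keys present in the source result but absent from the target result.
        "missing_in_target": sorted(src.keys() - dst.keys()),
        # Keys present in the target result but absent from the source result.
        "missing_in_source": sorted(dst.keys() - src.keys()),
        # Keys present on both sides whose rows differ in at least one column.
        "different": sorted(k for k in src.keys() & dst.keys() if src[k] != dst[k]),
    }


source_rows = [(1, "a"), (2, "b"), (3, "c")]
target_rows = [(1, "a"), (2, "x"), (4, "d")]
print(diff_rows(source_rows, target_rows))
```

The three lists tell you whether to insert, delete, or update rows in the target database to bring the reported index range back in line with the source.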
