This document details the implementation principles of distributed transactions in TDSQL Boundless and explains how data affinity scheduling optimizes transaction performance.
Distributed Transaction Principles
Overview of Transaction Models
Distributed transaction design aims to ensure both correctness and efficiency. TDSQL Boundless comprehensively optimizes both the read/write path and the commit path of transactions.
Read/Write Phase
Participant Management Mechanism
Replication Group as the Unit: Manages transaction participants with the Replication Group as the fundamental unit.
Automatic Context Creation: Automatically creates corresponding participant contexts when a transaction accesses data in a specific Replication Group.
Distributed Caching: The SQLEngine layer does not cache transaction data; instead, TDStore implements distributed caching.
Transaction Execution Flow
1. Obtain Read Timestamp: When the transaction executes the first statement, it obtains a timestamp from MC as the read snapshot.
2. Read/Write Operations: Access the Leader replica of the Replication Group corresponding to the data and create a participant context on that replica.
3. Transaction Commit: Select one participant to act as the coordinator and send a commit request to the coordinator.
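The three steps above can be sketched in code. This is an illustrative model only, not the TDSQL Boundless API: the class and method names (`MC`, `Transaction`, `access`) are hypothetical stand-ins for the timestamp service, the per-Replication-Group participant contexts, and the coordinator selection at commit.

```python
class MC:
    """Hypothetical stand-in for the MC timestamp service."""
    def __init__(self):
        self._ts = 0

    def get_timestamp(self):
        self._ts += 1
        return self._ts


class Transaction:
    def __init__(self, mc):
        self.mc = mc
        self.read_ts = None     # obtained lazily, on the first statement
        self.participants = {}  # replication group id -> participant context

    def access(self, rg_id, key, value=None):
        # Step 1: the first statement fetches a read snapshot from MC.
        if self.read_ts is None:
            self.read_ts = self.mc.get_timestamp()
        # Step 2: first access to a replication group creates its
        # participant context (on the group's Leader replica in reality).
        ctx = self.participants.setdefault(rg_id, {"writes": {}})
        if value is not None:
            ctx["writes"][key] = value  # buffered until commit
        return ctx

    def commit(self):
        # Step 3: pick one participant as coordinator and send it the
        # commit request with the full participant list.
        coordinator = next(iter(self.participants))
        return coordinator, list(self.participants)
```

Note how the read timestamp and the participant contexts are both created lazily: a transaction that never touches a second Replication Group never pays any cross-group coordination cost.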
Data Caching Mechanism

| Mechanism | Description |
| --- | --- |
| Pre-commit caching | Transaction data is fully cached in memory before commit. |
| No SQLEngine caching | The SQLEngine layer does not cache transaction data. |
| Distributed caching | TDStore implements distributed caching of transaction data. |
| Persistence on commit | Data is persisted only during the transaction commit phase. |
Commit Phase
2PC Offloading
TDSQL Boundless completely offloads the implementation of 2PC to the TDStore layer, and the SQLEngine is unaware of the transaction commit process.
Commit Process:
1. When a distributed transaction commits, the SQLEngine selects one of the participants to act as coordinator.
2. The SQLEngine sends a commit request to the selected participant node; the request contains a list of all participants involved in the transaction.
3. The receiving node creates a coordinator context within its replication group.
4. The coordinator then drives the complete 2PC process, sending prepare and commit requests to the other participants.
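The two phases the coordinator context drives can be sketched as follows. This is a minimal illustration of the protocol shape, not TDStore's implementation: `prepare` and `commit` are hypothetical callables standing in for RPCs to the participants' replication-group leaders.

```python
def two_phase_commit(participants, prepare, commit):
    """Drive 2PC from the coordinator context.

    prepare/commit are per-participant callables standing in for RPCs.
    """
    # Phase 1: prepare. If any participant cannot prepare, the
    # transaction is aborted (a real implementation would also send
    # abort messages to participants that had already prepared).
    if not all(prepare(p) for p in participants):
        return "ABORTED"
    # Phase 2: once every participant has prepared, the outcome is
    # decided; commit is sent to all participants.
    for p in participants:
        commit(p)
    return "COMMITTED"
```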
Data Persistence
TDStore directly uses Raft log as the WAL; writing data to the LSM-Tree does not require an additional WAL:
1. On node restart, the Raft log is replayed from the last recorded position.
2. After data is flushed to disk, the replay position is advanced, reducing the amount of log that must be replayed after a crash.
3. A single log thus provides both data synchronization to standby replicas and crash recovery.
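A toy model of this recovery scheme, under the stated assumption that the Raft log doubles as the WAL (the `Store` class and its fields are illustrative, not TDStore internals): flushing to the LSM-Tree advances the replay position, so a restart only replays log entries written after the last flush.

```python
class Store:
    def __init__(self, log):
        self.log = log          # shared Raft log: list of (key, value)
        self.memtable = {}      # in-memory, lost on crash
        self.flushed = {}       # stands in for the on-disk LSM-Tree
        self.flush_point = 0    # log index recorded with the last flush

    def apply_all(self):
        # Replay Raft log entries from the recorded flush point onward.
        for key, value in self.log[self.flush_point:]:
            self.memtable[key] = value

    def flush(self):
        # Data reaches disk; advancing the flush point shrinks the log
        # span that must be replayed after a crash.
        self.flushed.update(self.memtable)
        self.flush_point = len(self.log)
        self.memtable = {}
```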
Log-Based vs. Collaborative 2PC
TDSQL Boundless adopts collaborative (negotiation-based) 2PC, which offers significant advantages over traditional log-based 2PC:

| Aspect | Log-based 2PC | Collaborative 2PC |
| --- | --- | --- |
| RPC rounds | 2 (prepare, commit) | 3 (prepare, commit, asynchronous clear) |
| Log synchronizations | Participants 2 + coordinator 3 = 5 | Participants 3 (including 1 asynchronous) |
| Crash recovery | After a crash restart, both participants and the coordinator can independently determine the transaction's final state by replaying their local logs. | After a crash restart, participants determine the state by replaying local logs; the coordinator determines the state from messages received from participants. |
Core Advantages:
Collaborative 2PC avoids the overhead of coordinator log synchronization.
Atomicity is guaranteed for transactions spanning Replication Groups.
The number of log synchronizations is reduced from 5 to 3.
Data Affinity Scheduling
Affinity Definition
Data Affinity: There exists correlation between data items, for example:
Primary keys and secondary indexes within a table
Join keys between different tables
Scheduling related data to the same replication group enables transaction optimization.
Typical Case: Affinity within a Table
Scenario: The user creates a table containing ID, name, and age columns, and creates a secondary index on the age column.
Problems with Traditional Solutions:
The key ranges of primary key indexes and secondary indexes are far apart.
Distributed transactions must be completed via 2PC.
Transaction execution efficiency is significantly lower than that of single-machine transactions.
TDSQL Boundless Optimization Solution:
Place related data (primary keys and secondary indexes) in the same replication group.
Read and write queries only need to send requests to the target replication group.
Transaction commits only need to interact with a single replication group.
The 1PC protocol significantly improves transaction processing efficiency.
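The commit-path decision this optimization enables can be sketched as a simple dispatch. This is an illustrative sketch, not the actual TDStore commit logic; `one_phase_commit` and `two_phase_commit` are hypothetical callables.

```python
def commit_transaction(participants, one_phase_commit, two_phase_commit):
    """Choose the commit protocol based on how many replication
    groups the transaction touched."""
    if len(participants) == 1:
        # Affinity scheduling co-located all the data (e.g. primary key
        # and secondary index) in one replication group: 1PC suffices.
        return one_phase_commit(participants[0])
    # Cross-group transaction: fall back to the full 2PC protocol.
    return two_phase_commit(participants)
```

The benefit of affinity scheduling is that it pushes the common case into the first branch: the more related data is co-located, the more transactions commit without any cross-group coordination.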
Data-Aware Scheduling
Typical Case Studies:
1. The user creates a table containing ID, Name, and Age columns (Table ID: 100).
2. A secondary index is created on the Age column (index table ID: 200).
3. The primary key index key prefix is 00000064, and the secondary index key prefix is 000000C8; their key ranges are far apart.
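The prefixes above are simply the table IDs encoded as fixed-width hexadecimal: 100 is 0x64 and 200 is 0xC8, so the two index key ranges are far apart in the key space. A minimal sketch of this encoding (the function name is illustrative, and the real key format may include additional fields):

```python
def key_prefix(table_id):
    # 4-byte table ID rendered as 8 uppercase hex digits
    return format(table_id, "08X")

print(key_prefix(100))  # 00000064 (primary key index)
print(key_prefix(200))  # 000000C8 (secondary index)
```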
Typical Case: Cross-Table Affinity
Scenario: A merchant balance table with a small data volume coexists with a merchant transaction table with a large data volume.
Data Growth Processing Flow:
1. Initial State: The two tables coexist on the same node.
2. Data Growth: The tables' Regions begin to split automatically (at different rates).
3. Maintaining Correlation: When Table 2's Regions split automatically, the replication groups are split accordingly.
4. Data Range Alignment: RG1 manages data range 0-5 and RG2 manages data range 5-9, with the boundaries aligned across both tables.
5. Load Increase: RG2 is migrated to other nodes, scaling out the data while preserving affinity.
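The aligned-split idea in steps 3-4 can be sketched as follows. This is a simplified model under the assumption that a replication group maps each co-located table to a key range and that both tables split at the same boundary (the function and field names are hypothetical):

```python
def split_group(group, boundary):
    """Split a replication group at one key boundary.

    group: {table_name: (lo, hi)} half-open key ranges.
    Returns two groups whose ranges are split at the same boundary for
    every table, so rows with matching keys stay co-located.
    """
    left, right = {}, {}
    for table, (lo, hi) in group.items():
        left[table] = (lo, min(boundary, hi))
        right[table] = (max(boundary, lo), hi)
    return left, right
```

Because both tables split at the same boundary, either resulting group (e.g. RG2 under load) can later migrate to another node without breaking cross-table affinity.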
Affinity Benefits Summary
| Benefit | Description |
| --- | --- |
| Data localization | Data frequently accessed together is stored on the same node. |
| Fewer RPCs | Read and write operations entail less data movement and fewer RPC calls. |
| No distributed transactions | Operations within a single replication group require no 2PC. |
| Significant performance gains | Latency and network overhead are significantly reduced. |