This document introduces the scheduling rule system of TDSQL Boundless, including the task orchestration framework, the policy API with K8s semantics, and implicit and explicit affinity configuration.
Orchestrated Task Execution Framework
TDSQL Boundless provides an orchestratable task framework based on execution plans, in which a scheduling goal is carried out as an ordered sequence of atomic operations:
Scenario Example: Schedule table1 and table2 to the same node.
1. Execute splitting action: Issue a "Split RG2" atomic operation to TDStore.
2. Execute migration: Issue "Migrate Region3 from Node2 to Node1".
3. Primary switch: Issue "switch the primary replica of the relevant RG".
4. Merge: Issue "Merge the specified RG".
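The four steps above can be read as an ordered plan of atomic operations. The following is a minimal Python sketch of that idea, not the TDSQL Boundless implementation: the class names `AtomicOp` and `ExecutionPlan`, the `issue` callback, and the stop-on-first-failure behavior are all assumptions made for illustration.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class AtomicOp:
    """One atomic operation issued to the storage layer (hypothetical shape)."""
    kind: str         # "split", "migrate", "switch_primary", or "merge"
    target: str       # the RG or Region the operation acts on
    detail: str = ""  # free-form extra information


@dataclass
class ExecutionPlan:
    goal: str
    steps: list = field(default_factory=list)

    def run(self, issue):
        """Issue steps strictly in order and stop at the first failure."""
        done = []
        for step in self.steps:
            if not issue(step):
                break
            done.append(step)
        return done


# The scenario from the text: schedule table1 and table2 to the same node.
plan = ExecutionPlan(
    goal="co-locate table1 and table2",
    steps=[
        AtomicOp("split", "RG2"),
        AtomicOp("migrate", "Region3", "Node2 -> Node1"),
        AtomicOp("switch_primary", "relevant RG"),
        AtomicOp("merge", "specified RG"),
    ],
)

issued = []


def fake_issue(op):
    """Stand-in for the real call to TDStore; records the op and succeeds."""
    issued.append(op.kind)
    return True


completed = plan.run(fake_issue)
```

Keeping each step atomic means a plan that fails midway leaves a well-defined prefix of completed operations, which is what makes the plan resumable or reversible in principle.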
Typical Business Scenarios
Business Scenario | Pain Point | Solution
Transaction Records, Bills | Historical data occupies expensive storage. | Built-in policy for automatic hot/cold tiering
Financial Multi-Legal-Entity | Regulatory requirements mandate physical isolation of data. | Fine-grained topology-aware data isolation policy
Advertising Search | Cross-table JOIN queries require co-locating the data. | Topology-aware automatic scheduling of data
Policy API System with K8s Semantics
TDSQL Boundless implements an extensible policy API system with K8s-style semantics, built on data topology awareness:
Policy Example: Bind the cross-3-idc policy when creating a table.
CREATE TABLE table_1 USING DISTRIBUTION_POLICY cross-3-idc
Definition:
The table has three replicas: one primary and two secondary.
Replicas are distributed across the availability zones (AZs) gz_1 through gz_5.
The leader should avoid gz_3 and gz_5.
Zone-level disaster recovery capability is ensured first.
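To make the definition above concrete, here is a minimal Python sketch of how such a policy could be expressed as key-op-values requirements in the style of K8s label selectors. The `Requirement` class, the `In`/`NotIn` operator names, the `zone` label key, and the greedy `place` function are assumptions for illustration; the actual policy format of TDSQL Boundless is not shown in this document.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Requirement:
    key: str       # label key, e.g. "zone"
    op: str        # "In" or "NotIn"
    values: tuple  # label values the operator is checked against

    def matches(self, labels):
        value = labels.get(self.key)
        if self.op == "In":
            return value in self.values
        if self.op == "NotIn":
            return value not in self.values
        raise ValueError(f"unsupported operator: {self.op}")


# Hypothetical encoding of the cross-3-idc definition.
REPLICA_COUNT = 3
replica_rule = Requirement("zone", "In", ("gz_1", "gz_2", "gz_3", "gz_4", "gz_5"))
leader_rule = Requirement("zone", "NotIn", ("gz_3", "gz_5"))


def place(nodes):
    """Pick at most one node per zone for replicas, then a leader among them."""
    chosen, seen_zones = [], set()
    for node in nodes:
        zone = node["zone"]
        if replica_rule.matches(node) and zone not in seen_zones:
            chosen.append(node)
            seen_zones.add(zone)
        if len(chosen) == REPLICA_COUNT:
            break
    leaders = [n for n in chosen if leader_rule.matches(n)]
    # The leader rule is treated as a preference: fall back to any replica.
    leader = leaders[0] if leaders else (chosen[0] if chosen else None)
    return chosen, leader


nodes = [
    {"name": "n1", "zone": "gz_3"},
    {"name": "n2", "zone": "gz_3"},  # same zone as n1, skipped for zone-level DR
    {"name": "n3", "zone": "gz_5"},
    {"name": "n4", "zone": "gz_1"},
    {"name": "n5", "zone": "gz_9"},  # outside the allowed zones
]
replicas, leader = place(nodes)
```

Spreading replicas one per zone is how this sketch models zone-level disaster recovery; a real scheduler would also weigh load and capacity.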
Hot and Cold Data Tiering
CREATE TABLE table_1(id INT, trans_date DATETIME)
PARTITION BY RANGE (TO_DAYS(trans_date)) (
PARTITION p1 VALUES LESS THAN (TO_DAYS('2022-11-01')),
PARTITION p2 VALUES LESS THAN (TO_DAYS('2022-12-01')),
PARTITION p3 VALUES LESS THAN (TO_DAYS('2023-02-01')),
PARTITION p4 VALUES LESS THAN (TO_DAYS('2023-03-01'))
);
ALTER TABLE table_1
USING DISTRIBUTION_POLICY partition-cool-down;
Effect: Partitions older than 3 months are automatically migrated to cold storage nodes and kept with fewer replicas, with no manual intervention.
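As an illustration of the cool-down rule, the following Python sketch decides which partitions of the example table would count as cold on a given day. It assumes that a partition is cold once its upper bound is at or before the first day of the month three months back; the function names and this exact cutoff rule are assumptions, since the document does not specify how the age is measured.

```python
from datetime import date

# Upper bounds copied from the VALUES LESS THAN clauses of table_1.
PARTITION_BOUNDS = {
    "p1": date(2022, 11, 1),
    "p2": date(2022, 12, 1),
    "p3": date(2023, 2, 1),
    "p4": date(2023, 3, 1),
}


def months_back(day, months):
    """Return the first day of the month `months` months before `day`."""
    index = day.year * 12 + (day.month - 1) - months
    return date(index // 12, index % 12 + 1, 1)


def cold_partitions(today, cool_down_months=3):
    """A partition is cold when every row in it is older than the cutoff."""
    cutoff = months_back(today, cool_down_months)
    return [name for name, upper in PARTITION_BOUNDS.items() if upper <= cutoff]


# On 2023-03-15 the cutoff is 2022-12-01, so p1 and p2 qualify.
cold_now = cold_partitions(date(2023, 3, 15))
```

Comparing against the partition's upper bound rather than individual rows is what lets the scheduler move whole partitions without scanning data.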
Affinity Scheduling
Implicit Affinity
For tables with identical partitioning rules, TDSQL Boundless provides a partition-level affinity scheduling mechanism:
CREATE TABLE t1(id INT PRIMARY KEY, f1 INT, f2 VARCHAR)
PARTITION BY HASH (f1) PARTITIONS 3;
CREATE TABLE t2(id INT PRIMARY KEY, f1 INT, f3 VARCHAR)
PARTITION BY HASH (f1) PARTITIONS 3;
Effect:
Pins associated partitions (such as t1.p1 and t2.p1) to the same node.
Prevents data dispersion caused by scaling out or load balancing.
Ensures stable query performance.
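The grouping behind implicit affinity can be sketched in Python as follows. This is an illustrative model only: the catalog layout, the 0-based partition names (`p0`, `p1`, ...), the extra table `t3`, and the round-robin node assignment are assumptions, not the scheduler's actual algorithm.

```python
from collections import defaultdict

# Hypothetical catalog: table -> (partition method, key column type, partition count)
TABLES = {
    "t1": ("HASH", "INT", 3),
    "t2": ("HASH", "INT", 3),
    "t3": ("HASH", "INT", 4),  # different partition count, so not affine with t1/t2
}


def affinity_groups(tables):
    """Group same-numbered partitions of tables that share a partitioning rule."""
    groups = defaultdict(list)
    for table, rule in sorted(tables.items()):
        _, _, count = rule
        for index in range(count):
            groups[(rule, index)].append(f"{table}.p{index}")
    return groups


def assign_nodes(groups, nodes):
    """Place every member of a group on the same node, round-robin over groups."""
    placement = {}
    for slot, key in enumerate(sorted(groups)):
        for partition in groups[key]:
            placement[partition] = nodes[slot % len(nodes)]
    return placement


groups = affinity_groups(TABLES)
placement = assign_nodes(groups, ["node1", "node2", "node3"])
```

Because the group, not the individual partition, is the unit of placement, rebalancing moves `t1.p1` and `t2.p1` together, so a join on `f1` can stay local to one node.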
Explicit Affinity
Explicit affinity lets applications declare inter-table relationships through SQL syntax:
Scenario 1: Close Association of Small Tables
CREATE PARTITION POLICY IF NOT EXISTS pp1
Scheduling Effect: Assigns two tables to the same replication group (RG).
Scenario 2: Partition-Level Affinity for Large Tables
CREATE PARTITION POLICY IF NOT EXISTS pp2
PARTITION BY HASH(INT) PARTITIONS
Scheduling Effect: Schedules partitions with identical partitioning rules from the two tables to the same RG.
Core Value of the Policy System
Value | Description
General Standard | Follows de facto standards; the key-op-values specification is user-friendly.
Semantically Extensible | More PolicyOptions can be added to meet different business requirements.
Fully Decoupled Data Link | The scheduling engine is managed independently, and compute and storage are fully decoupled.
Business Value Cases
Tencent Billing Platform Invoice Transactions:
Data Scale: 30 TB (three replicas), with more than 8 billion rows in a single table
Query Frequency: 300-400 transactions/s on average
Original Pain Point: Scaling out needed to be convenient and fast.
Result: Online horizontal scaling time dropped from months to minutes, transparently to the business.