TDSQL Boundless

V21.6.x

Last updated: 2026-06-01 09:59:23

V21.6.1.0

Version Release Notes

Database Management

Added the hot data monitoring capability.
Added the information_schema.META_CLUSTER_HOT_DATA_OBJECTS_HISTORY system table to record hotness statistics for data objects, including byte rate, key rate, query rate, and hotness level. In addition, the metric statistics and reporting mechanism of the TDStore module has been optimized to improve the collection and monitoring of key performance indicators such as bytes_read, bytes_write, keys_read, and keys_write. This helps the Metadata Controller (MC) quickly identify and optimize hot data, making database Ops monitoring more comprehensive and accurate.
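As a minimal sketch, the new table could be inspected as follows. Only the table name comes from these notes; its column list is not given here, so the example deliberately avoids naming columns.

```sql
-- Inspect the table definition first; these notes do not list its columns.
DESCRIBE information_schema.META_CLUSTER_HOT_DATA_OBJECTS_HISTORY;

-- Sample a few rows. SELECT * is used because no column names are assumed.
SELECT *
FROM information_schema.META_CLUSTER_HOT_DATA_OBJECTS_HISTORY
LIMIT 20;
```

Once the actual column names are known, the same query can be filtered or ordered by hotness level.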
Optimized the RG scheduling policy during online DDL operations
During fast online DDL operations, if no new RG is created, the system restricts scheduling operations on the RG where the Region resides, ensuring the stability and reliability of the DDL operations.
Optimized Slow Query Log Monitoring Granularity
Enhances slow query log monitoring in TDSQL with fine-grained parameter control, so that RPC performance issues can be identified and optimized more precisely.
Enhanced Performance Schema's monitoring capability for DDL rollback operations
Fixed an issue where the MDL held by a DDL statement's worker thread during rollback operations could not be observed in the performance_schema.metadata_locks system table. This fix improves the transparency and troubleshooting capability of DDL operations.
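A rough sketch of how the fix could be observed from a second session while a DDL statement is rolling back. The column names follow the MySQL 8.0 definition of performance_schema.metadata_locks, which is assumed to apply here; the schema and table names are hypothetical.

```sql
-- Run from a second session while the DDL rollback is in progress.
-- Columns are those of the MySQL 8.0 metadata_locks table (assumed compatible).
SELECT OBJECT_TYPE, OBJECT_SCHEMA, OBJECT_NAME,
       LOCK_TYPE, LOCK_DURATION, LOCK_STATUS, OWNER_THREAD_ID
FROM performance_schema.metadata_locks
WHERE OBJECT_SCHEMA = 'app_db'   -- hypothetical schema name
  AND OBJECT_NAME   = 'orders';  -- hypothetical table name
```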
Supports specification modification and memory release during scale-in scenarios
Added a scale-in scenario API that supports pre-modifying instance specifications (memory and CPU cores) and proactively releasing idle cached memory before scale-in, thereby improving scale-in success rate and resource utilization.
Supports automatic detection of instance specification changes and adjustment of table distribution
When an instance undergoes a specification change, the system automatically identifies and recalculates the distribution policy for partition tables and non-partition tables. This achieves even data distribution across resources, improving resource utilization efficiency and query performance.
Supports cross-node join queries for system tables and global configuration for memory statistics tables
For in-memory tables such as information_schema.processlist, a shard table attribute is added. A parallel framework is used to generate parallel plans and obtain data from all nodes. This enables aggregate operations like GROUP BY and ORDER BY, as well as JOIN operations on such tables, to be performed at a global level. A unified aggregate result across all nodes is returned, instead of independent result sets from each node, meeting the requirements for cross-node statistical analysis.
The tables performance_schema.data_locks, performance_schema.data_lock_waits, information_schema.processlist, information_schema.tdstore_part_ctx, information_schema.tdstore_compaction_history, information_schema.tdstore_active_compaction_stats, information_schema.tdstore_sst_props, and information_schema.tdstore_cf_options are in shard mode by default, querying data from all nodes. If a query on these tables fails to generate a parallel plan, an error is reported to the client. Users can disable tdsql_enable_shard_table to obtain data from a single node.
Shard tables and non-shard tables cannot be mixed in a query. Specifically, JOIN operations between shard tables and non-shard tables are not supported.
Added the hint /*+ shard_table */ to optimize all tables in the statement in shard mode.
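The shard-mode behavior described above can be sketched as follows. The table, hint, and variable names come from these notes; the processlist columns are the standard MySQL ones, and setting the variable at session scope is an assumption.

```sql
-- Count connections per user. With shard mode on by default, the aggregate
-- is computed over all nodes instead of being returned per node.
SELECT USER, COUNT(*) AS connections
FROM information_schema.processlist
GROUP BY USER
ORDER BY connections DESC;

-- Request shard mode explicitly for every table in the statement.
SELECT /*+ shard_table */ USER, COUNT(*) AS connections
FROM information_schema.processlist
GROUP BY USER;

-- Fall back to single-node results (session scope is an assumption).
SET SESSION tdsql_enable_shard_table = OFF;
SELECT COUNT(*) FROM information_schema.processlist;
```

Because shard tables and non-shard tables cannot be mixed, a statement like the first one should reference only tables that are in shard mode.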
Enhanced audit log query capability, supporting cross-node aggregate queries of audit logs from all Hyper Nodes via SQL statements
The database audit feature is extended with a unified SQL query API. This API allows users to obtain audit log information distributed across all Hyper Nodes with a single query statement. This simplifies security auditing, behavior analytics, and troubleshooting in a distributed architecture, making Ops and management more convenient.
Enhanced the EXPLAIN ANALYZE output by adding a VERBOSE option that displays detailed performance data for each Worker thread in PQ.
The display of PQ plans in EXPLAIN ANALYZE is optimized. The output now includes detailed information such as the actual execution time and the number of rows processed for each parallel Worker thread. It also supplements key nodes like sorting, hash joins, and aggregation with customized performance metrics, including memory usage and temporary table size. This feature provides a powerful tool for in-depth analysis of load balancing and resource consumption at each stage of PQ.
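A hedged sketch of how the option might be used. These notes name the VERBOSE option but not its exact position in the statement, so the placement below is an assumption; the table and columns are hypothetical.

```sql
-- Assumed placement of VERBOSE; confirm the exact grammar in the EXPLAIN reference.
EXPLAIN ANALYZE VERBOSE
SELECT customer_id, SUM(amount) AS total   -- hypothetical table and columns
FROM orders
GROUP BY customer_id
ORDER BY total DESC;
```

A query with aggregation and sorting is used so that the output would include the nodes for which the additional memory and temporary table metrics are reported.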

Scalability and Performance

Optimized binlog Conversion Latency Performance
Significantly reduces binlog conversion latency to the millisecond level and improves data synchronization efficiency by shortening the binlog flush interval, optimizing RG wait relationships, and accelerating MC Job progression.
Improved Replay Speed for Disaster Recovery Instances
Optimized the default value of the log_receiver_get_raft_log_base_size parameter to accelerate Raft log retrieval, thereby improving the replay performance of disaster recovery instances.
Introduced Hypergraph Optimizer Support
Added baseline code for the Hypergraph optimizer to provide more advanced underlying architectural support for query optimization, enhancing optimization capabilities and processing efficiency in complex query scenarios.
Enables the ingest-behind backfill mode by default for ADD INDEX operations on partition tables to improve performance.
Optimizes the performance of index creation on partition tables by default using the ingest-behind backfill method, accelerating the index building process for large tables.
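As an illustration, adding an index to a partition table needs no extra clause to benefit from this default. The table below is hypothetical, and the partitioning syntax is standard MySQL, which is assumed to be accepted here.

```sql
-- Hypothetical partitioned table.
CREATE TABLE orders (
  id          BIGINT NOT NULL,
  customer_id BIGINT NOT NULL,
  created_at  DATE   NOT NULL,
  PRIMARY KEY (id, created_at)
)
PARTITION BY RANGE (YEAR(created_at)) (
  PARTITION p2024 VALUES LESS THAN (2025),
  PARTITION pmax  VALUES LESS THAN MAXVALUE
);

-- Plain ADD INDEX; per the note above, backfill uses ingest-behind by default.
ALTER TABLE orders ADD INDEX idx_customer (customer_id);
```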
Optimized brpc stream Transmission Reliability
Adds a retry mechanism for brpc stream transmission to automatically retry failed requests under network exception scenarios, improving data transmission success rates and system stability.
Optimized the Accuracy of TDStore Heartbeat Reporting Metrics
Improves the statistical method for metrics such as bytes/keys in heartbeat reporting by switching to counting the actual scanned data volume, making the metrics more accurately reflect the real pressure on nodes.
Enhanced Features for Serverless Scenarios
Added support for the Serverless instance type, enabling dynamic adjustment of computing resources based on actual loads to optimize resource utilization efficiency. Additionally, the mcIdentifier field has been added to backup filenames, providing a more precise backup identifier to facilitate user management and traceability of backup files.
Added MC One-to-Many Management Capability
This update implements a one-to-many management capability for MC (Metadata Controller) clusters to reduce resource costs for small-scale instances, improve MC cluster resource utilization, and lay the foundation for future multi-tenant solutions. This capability allows a single MC cluster to manage multiple HyperNode (data computing) clusters. Key changes include: 1) MC architecture adjustments to support managing metadata and scheduling for multiple sub-clusters (Instances) under a single RaftCluster; 2) Provision of APIs for sub-cluster registration, independent configuration management, freezing, and uninstallation; 3) Optimization of the initialization process to support dynamic onboarding of new clusters. This feature is a foundational capability in the roadmap, with future enhancements planned for advanced features such as resource isolation and migration.
Supports Route Query Merging and Multi-Key-Value Enhancement
Implements a route query merging mechanism that consolidates multiple similar routing requests for processing in high-concurrency scenarios, effectively reducing concurrent pressure on the MC and enhancing overall system performance and stability. This feature also supports both key list and key range list parameters, improving the flexibility and performance of distributed queries and enabling support for more complex multi-key-value routing scenarios.
Supports Grouped Execution of Parallel Query Worker Threads
SQLEngine now supports assigning parallel query tasks to dedicated groups of worker threads for execution, effectively improving the processing performance of complex queries and avoiding resource contention between different types of queries.
Supports brpc Group Observability
Adds monitoring and observability capabilities for brpc groups, providing more granular runtime status information to aid in troubleshooting and performance analysis.
Limits the Number of Block Range Statistics per Index Table
Limits the number of Block Range Stats kept for each index table to reduce memory usage and improve query performance.
Supports Row-Store Large Transaction Writes
Overcomes the original transaction size limitation. Optimized storage and logging mechanisms significantly increase the data volume that a single transaction can process, meeting the demands of large-scale data writes and high-complexity business scenarios.
Region Metadata Architecture Optimization
Moves Region metadata management from the upper layer down to the TDStore storage layer, achieving persistent storage and primary/secondary consistency for Range metadata. This effectively reduces interaction overhead between the upper and storage layers, decreases routing table size, improves Region split/merge performance, and optimizes overall system data consistency and operational efficiency.
Migrate the persistent data of CDC nodes from data_db to the independent binlog_info_db
To enhance the standardization and compatibility of the storage architecture, the column family (CF) used by CDC (Change Data Capture) nodes to record information such as Binlog synchronization positions is migrated from the shared data_db to the newly created, independent binlog_info_db. The upgrade process includes automatic data migration.
Optimizes the performance of data preparation during snapshot installation (Install Snapshot) to avoid large-scale, forced L0->L1 compaction.
Optimizes the snapshot data preparation process to address the issue of prolonged blocking and write throttling caused by forced manual compaction from L0 to L1 when snapshots are installed (such as for migration and splitting) on nodes with large volumes of data at the L0/L1 layers. The new solution avoids large-scale forced compaction, thereby improving snapshot preparation efficiency and system stability.
Implemented a Series of Performance Optimizations for the TDStore KV Separation Storage Engine
A series of in-depth optimizations have been implemented to address the issue of degraded read/write performance in KV separation architectures after prolonged operation, which can be caused by an excessive number of Blob files and high Compaction pressure. These optimizations include improving the timing and efficiency of Blob file merging and garbage collection (GC), supporting different compression algorithms for different layers, and reducing lock holding times on critical paths. This has significantly improved stability and throughput under high loads.

Syntax and Features

Optimizes the Execution Policy for flush table Operations in DDL Broadcasts
In DDL broadcast scenarios, the flush table operation now uses the DDL-first policy by default, improving the execution efficiency of DDL operations and system stability.
Deprecates the parameter tdsql_enable_ddl_first_strategy. Whether to enable the DDL-first policy is now controlled by the parameters tdsql_ddl_block_mode and tdsql_ddl_recovery_block_mode.
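To check how the two controlling parameters are currently set, a pattern match on the variable names is enough. Their accepted values are not listed in these notes, so the sketch only reads them.

```sql
-- Matches both tdsql_ddl_block_mode and tdsql_ddl_recovery_block_mode.
SHOW VARIABLES LIKE 'tdsql_ddl%block_mode';
```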
Table Scan Row Count Statistics
The SQL engine provides a feature for counting scanned rows of base tables. It supports accurately recording the number of rows scanned during query execution, helping users analyze query performance and execution plan efficiency.
Fixes PQ Result Inconsistency Issues
Fixed the issue where PQ auto mode produced results inconsistent with serial queries, ensuring the correctness and reliability of the PQ feature.
Adds the Parameter tdsql_bulk_load_allow_sk to myloader
When myloader is used to import data, a new parameter tdsql_bulk_load_allow_sk (default false) is added to control whether secondary indexes are allowed to exist during the import process.
Introduces the PQ_DISTRIBUTE optimizer Hint, providing the ability to customize execution plans for complex PQ.
Adds the PQ_DISTRIBUTE Hint syntax, which allows experienced users or DBAs to customize data distribution and computation policies (such as Gather, Repartition, and Broadcast) for specific operations like JOIN, GROUP BY, ORDER BY, or window functions during parallel execution. This provides advanced control for optimizing complex PQ plans and handling special data distribution scenarios.
Provides a mechanism that allows users to specify stored procedure functions (SP Functions) for pushdown execution to parallel workers.
To enhance the flexibility of PQ, a new method is added to declare the pushdown policy for functions in parallel plans via function comments (COMMENT). Users can add a string such as TDSQL_PROPERTY: {"parallel_safe": "safe"} in the comment to mark a custom stored procedure function, which is not pushable by default, as safely pushable. This enables the function to participate in parallel computation, optimizing query performance.
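A minimal sketch of such a declaration, using the standard MySQL COMMENT characteristic of CREATE FUNCTION. The property string is taken from the note above; the function, table, and column names are hypothetical.

```sql
-- Hypothetical function marked as safe to push down to parallel workers.
CREATE FUNCTION net_price(gross DECIMAL(10,2), tax_rate DECIMAL(5,4))
RETURNS DECIMAL(10,2)
DETERMINISTIC
COMMENT 'TDSQL_PROPERTY: {"parallel_safe": "safe"}'
RETURN gross / (1 + tax_rate);

-- Hypothetical query in which the function could now run inside parallel workers.
SELECT id, net_price(gross, 0.0825) AS net
FROM invoices;
```

A function should only be marked this way if it has no side effects and its result depends only on its arguments; otherwise parallel execution may produce different results from serial execution.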
Extends PQ Capability to Support the Execution of Non-Scalar Uncorrelated Subqueries
The PQ engine's capabilities have been enhanced to support the parallel execution of complex SQL statements containing non-scalar (multi-row, multi-column, and using the materialization execution strategy in the plan) and uncorrelated subqueries. By introducing new parameters (PQ_PRE_EVALUATION and PQ_INLINE_EVALUATION) for the SUBQUERY Hint and leveraging internal temporary tables to share subquery results, the processing efficiency of such queries in distributed environments has been improved.

Stability

Fixed the Issue Where SET PERSIST Failed to Synchronously Update Some binlog_mysql-Related Configuration Variables
Fixed the issue where not all related binlog_mysql configuration items were synchronously updated when persistently modifying database parameters using the SET PERSIST command, ensuring the completeness and consistency of configuration changes.
Optimization of LogService Replay Logic and Performance Enhancement of Subscription Tasks
Optimizes the LogService replay logic switching mechanism, supporting dynamic switching between the old and new replay logic via the dbms_admin.switch_logservice_replay_logic function to ensure primary/secondary data consistency and enhance replay stability. Additionally, performance optimization is conducted for subscription tasks in scenarios with a large number of DDLs. By marking DML transactions generated by online DDL during Raft log writes and removing unnecessary waiting mechanisms, binlog dump latency is effectively reduced.
Fixed the crash issue caused by thread initialization exceptions.
Fixed the crash issue caused by global system variables being null during thread initialization, ensuring threads initialize normally.
Supports Independent Configuration of Transaction Commit RPC Timeout Parameters
A new parameter is introduced to support setting an independent timeout for transaction commit RPCs. This effectively avoids commit timeout issues caused by TDStore's delayed writes, thereby enhancing transaction stability.
Optimized Connection Flapping Issues During Vertical Scaling
This update provides targeted optimization for the issue where the number of connection interruptions during vertical scaling exceeds expectations (for example, 3 interruptions occur in a 3-node scenario instead of the expected 1). The goal is to reduce or smooth the impact on client connections during the scaling process, thereby enhancing service continuity and user experience during resource elasticity adjustments.
Extended the Memory Statistics and Limitation Scope of LogReceiver
This update extends the memory statistics and limitation scope of the LogReceiver component for the Raft Log retrieval process. The previous mechanism only covered the link for retrieving logs from COS. Now, the links for retrieving logs from other TDStore nodes are also incorporated into the unified memory statistics and limitation process. This optimization aims to more comprehensively control memory usage during log recovery, avoid memory pressure caused by log retrieval, and enhance cluster stability in scenarios such as node recovery and replica migration.

Backup and Restore

Optimized the Data Distribution Balance Across Nodes for Physical Backup Restoration
Fixed the load balancing issue in the node selection algorithm during physical backup restoration. This prevents data from being concentrated on a few nodes and significantly improves the overall speed of large-scale data recovery.
Optimized Parameter Consistency for Restoration Instances
Restoration instances now automatically inherit the parameter configuration from the source instance. This ensures that the restored database instance maintains parameter consistency with the source instance, thereby improving the accuracy and consistency of data recovery.
Supports Incremental Backup Data Integrity Verification
MC-Agent provides a data integrity verification API. You can use this API to verify the integrity of MC incremental backup data, ensuring the backup data is accurate and reliable, thereby enhancing the security and trustworthiness of backup and recovery.

Security Enhancements

Supports SSL Connection Encryption
This update introduces a database connection SSL encryption feature. It supports encrypted protection for data transmission, thereby enhancing connection security.
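One way to confirm that the current session is encrypted, assuming the standard MySQL session status variable Ssl_cipher is exposed here as well:

```sql
-- A non-empty Ssl_cipher value indicates that this connection uses SSL/TLS.
SHOW SESSION STATUS LIKE 'Ssl_cipher';
```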

Operations and Maintenance

Added DBBrain Deadlock Visualization Diagnostic Capability
This update adds a "deadlock" diagnostic item to DBBrain's exception diagnosis. This feature is based on the system table information_schema.TDSTORE_PESSIMISTIC_DEADLOCK_DETAIL_INFO (kernel version ≥ 21.2.0) and can query and display detailed information about deadlock cycles. This information includes the rollback transaction ID, the requesting/blocking transaction ID, the involved lock scope, and the corresponding SQL statements. This helps users quickly identify the business SQL and resource contention points that cause the deadlock, thereby improving troubleshooting efficiency.
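The underlying system table can also be queried directly. The sketch below assumes the table name information_schema.TDSTORE_PESSIMISTIC_DEADLOCK_DETAIL_INFO and does not assume any column names, since they are not listed in these notes.

```sql
-- Inspect the columns first, then sample recent deadlock records.
DESCRIBE information_schema.TDSTORE_PESSIMISTIC_DEADLOCK_DETAIL_INFO;

SELECT *
FROM information_schema.TDSTORE_PESSIMISTIC_DEADLOCK_DETAIL_INFO
LIMIT 20;
```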
Introduces full-link diagnostic and tracing (Trace) capabilities to optimize problem localization efficiency.
A structured full-link tracing framework is introduced into the database. It supports automatic recording of complete call chain information for slow queries (such as those exceeding 1 second). Visual diagnostic reports can be generated via system tables or tools, helping Ops and development personnel quickly locate performance bottlenecks and exceptions during SQL execution, RPC communication, and internal processing.
Set timeout alarms and rollback mechanisms for Shark disaster recovery switchover/disconnection tasks
To address the risk of the primary instance remaining in a read-only state for an extended period due to tasks getting stuck during disaster recovery switchover or disconnection processes, a task timeout alarm mechanism has been established, and an automatic rollback process for failure or timeout scenarios has been introduced. This optimization ensures that the primary instance's write capability can be promptly restored under abnormal conditions, thereby enhancing the reliability of disaster recovery operations.
Added MC-aware specification change and automatic table distribution adjustment feature
This update introduces a new feature: the Metadata Controller (MC) can detect vertical scaling (specification change) events of a cluster and automatically trigger adjustments to the data distribution of partition tables and non-partition tables. This feature aims to reduce manual intervention after specification changes and to further automate cluster resource utilization and load balancing.
Enhanced MC Registered Node Specification Management
Optimizes the MC node registration process and adds an RPC version verification mechanism, enhancing the security and compatibility of node registration.
Added monitoring metrics and alarm capabilities for RG log file size
A new monitoring metric, rep_group_log_size_bytes, is added. The MC Agent reports the local log file size of each Raft Node. This metric can be used to monitor the growth trend of log files. When a file continues to increase in size (which typically indicates log cleanup failure), an alarm can be triggered. This helps Ops personnel promptly identify log accumulation issues caused by backup, CDC, or Follower exceptions.
Optimize the default storage path for slow logs
Changes the default storage path for slow logs to the log disk. This avoids scheduling misjudgments caused by sharing storage with the data layer and enhances system stability and log management efficiency.

Bug Fixes

Fixes the node crash issue that occurs when Raft log storage is converted from multi-raft-db to segment mode, caused by the RG ID exceeding the int32 range.
Fixes the issue where the INFORMATION_SCHEMA.LOGSERVICE_MYSQL_UNSUPPORTED_TABLE view incorrectly includes tables of the view type.
Fixes the system crash issue caused by routing changes during batch key range construction.
Fixes the issue of hotspot scheduling failure for instances during restoration, ensuring balanced data distribution after restoration.
Fixes the issue where the tdsql_lock_wait_timeout parameter for row-level lock timeout does not take effect. The actual timeout is limited by the RPC timeout, not the parameter value.
Fixes potential deadlock issues that may occur when LogService synchronizes unique keys (UK).
Fixes the issue of inaccurate display of candidate values for some system variables. This includes the character set client variable displaying too many invalid character sets, the utf8mb4 collation variable displaying collations not belonging to that character set, and the truncation of formatted double-precision boundary values.
Fixes the deadlock risk present during TDRlogBackuper destruction, ensuring that all worker sockets are correctly interrupted when the process exits.
Fixes the crash issue that occurs when a process exits. The crash is caused by accessing already released resources due to TDRlogBackuper not being closed in advance.
Fixes the issue of CLS synchronization latency caused by a conflict between CDC log-receiver relocation triggered by automatic load balancing and DDL table creation constraints in sporadic scenarios.
Fixes the issue where a disaster recovery instance fails to establish a new primary/secondary relationship after a switchover/failover and disconnection from the original primary instance, due to gaps in the MC log (Rlog). The solution includes providing a data integrity verification API and triggering a full backup when data is incomplete.
Fixes the issue where the SQLEngine process fails to start normally in containerized deployment environments. This occurs when a node's log disk becomes completely full, causing log initialization to fail during the startup process.
Fixes the issue where partition table Leaders fail to automatically redistribute after a user switches from "Single RG Mode" (all Leaders concentrated on one node) to "Leader Scatter Mode". This issue occurs because the mode switch operation does not trigger the internal Leader rebalancing check logic. As a result, Leaders remain concentrated on the original node, which prevents all node resources from being used and affects load balancing and performance.
Fixes the issue where a Replica Group (RG) fails to correctly release placeholders after a bulk load task with the Split feature enabled is completed. This issue causes the RG's resources to remain occupied, preventing the execution of subsequent tasks such as migration and load balancing.
Fixes the issue where creating a data object fails due to the excessively long time taken to persist the disk leader in the MC component.
Fixes the issue where executing the DROP DATABASE operation on a multi-node instance may cause residual data dictionary table data to remain on other nodes.
Fixes version compatibility issues in the internal table upgrade logic, ensuring that system tables are updated at the correct version stage.
Fixes the issue where bulk load split tasks are incorrectly marked as historical tasks while still in an incomplete state.
Fixes the issue where role information is lost when the Raft log crosses a snapshot during promote/demote operations.
Fixes the issue in the query optimizer where the force switch of parallel_query_switch fails to enforce a parallel plan due to an underestimated number of scan rows (cost rows).
Fixes the issue where incorrect handling of NULL values in row value constructor subqueries leads to erroneous query results.
Fixes the issue in the Stop EngineAgent script where the function to obtain the process ID returns an incorrect value due to residual processes.
Fixes the issue where the switch is not guaranteed to remain disabled throughout the cluster cloning process.
Fixes the issue where a merge operation fails due to a lack of replica role consistency check when empty resource groups are merged.
Fixes the defect in the node resource group cache cleanup mechanism where node records are not correctly cleaned up after the passive abort phase ends.
Fixes the issue where a leader switch fails due to the MC continuously issuing identical leader switch tasks in a short period during DP rule changes.
Fixes the issue where hotspot scheduling, by default, only references bytes as the metric, resulting in an insufficiently comprehensive scheduling policy.
Fixes the issue where the data balancing feature migrates excessive data from the source node during the migration process.
Fixes the issue where a broadcast resource group gets stuck during a configuration change when a broadcast synchronization table is used in an n+1 architecture and the quorum value exceeds the number of nodes n.
Fixes the issue where an incorrect timeout configuration is used for RPC calls between TDStores.

Syntax Changes

Change Type: Addition
Syntax: SELECT * FROM tdsql_force_dist_shard(information_schema.processlist);
Description: Adds the tdsql_force_dist_shard keyword, which applies to a table and indicates that a parallel plan is generated in shard table mode to obtain global data. It is typically used for in-memory tables such as processlist.

Data Dictionary Change

Change Type
System Tables and System Views
Description
Modification
Changed from NULL to the name of the database corresponding to the table for the OBJECT_SCHEMA column.
Changed from NULL to the table name for the OBJECT_NAME column.
Addition
After the worker thread grouping feature is introduced, the VIEW_RESOURCE_TAG view is used to record the worker thread grouping information of each business type configured in the system, including the group identifier, number of threads, and configuration status, helping users understand the system resource allocation.
Addition
After the worker thread grouping feature is introduced, the VIEW_RESOURCE_TAG_THREAD_INFO view is used to monitor the task queue status at the worker thread level in real time. This view displays the task backlog of each worker thread, helping users gain an in-depth understanding of the internal task scheduling and load distribution of the system.
Modification
Renamed the timestamp field to occurred_at and changed its type from varchar to timestamp.
Modification
Renamed the timestamp field to occurred_at and changed its type from varchar to timestamp.
