Kernel Release Notes

Last updated: 2022-06-28 09:36:21

The Tencent Cloud Elasticsearch Service (ES) team has been continuously optimizing the ES kernel based on its extensive practical experience with large-scale applications, while remaining fully compatible with the open-source Elasticsearch kernel, in an effort to improve cluster performance and stability and reduce costs. The team also keeps up with the latest updates in the community. This document describes the major kernel optimizations of ES.

Major optimizations in April 2022:

| Optimization Category | Optimization Policy | Supported Versions |
| --- | --- | --- |
| Performance | Time series index query clipping is optimized, shifting from large-scale traversal to fixed-point boundary clipping and increasing high-dimensional time series search performance by over ten times. | 7.14.2 |
| Performance | DSL query results can be returned in columns, which greatly reduces duplicate-key redundancy, lowers network bandwidth usage by 35%, and increases performance by 20%. | 7.14.2 |
| Performance | The serialization of transparent data transfer between nodes is optimized, reducing redundant serialization costs and increasing query performance by 30%. | 7.14.2 |
| Performance | X-Pack authentication performance is optimized. CPU hotspots are eliminated through special permission processing, caching, and delayed loading, improving query performance by over 30%. | 7.10.1, 7.14.2 |
| Performance | Query performance is optimized with fine-grained block-level sampling, increasing the estimated query performance of operators such as topk, avg, min, max, and histogram by over ten times. | 7.14.2 |
| Feature | The query preference parameters are optimized. `_shards` and a `custom_string` can be used in combination to pin queries to fixed primary or replica shards, which ensures stable query results in scoring scenarios (see the example after this table). | 7.14.2 |
| Feature | The truncation of super-long keyword field content is optimized, so that such content can be written without a truncation exception being reported. | 7.14.2 |
| Feature | The underlying fine-grained control of query timeouts is optimized to prevent a large number of canceled or timed-out queries from continuing to occupy cluster resources (the queries should carry the `timeout` parameter, as shown in the example after this table). | 7.10.1, 7.14.2 |
| Stability | The memory leak in specific memory usage throttling scenarios during queries is fixed, and the memory usage throttling policy is further optimized to avoid OOM errors in aggregation scenarios and enhance cluster stability. | 7.14.2 |
| Stability | The issue where nodes that leave the cluster repeatedly rejoin and are removed again is fixed, increasing cluster stability. | 7.10.1, 7.14.2 |
| Stability | Node-level and index-level shard balancing policies are optimized to improve shard balancing and eliminate load hotspots. | 7.10.1, 7.14.2 |
| Stability | Shard relocation and balancing policies are optimized for multi-disk scenarios to improve shard relocation performance. | 6.8.2, 7.10.1, 7.14.2 |
| Stability | The priorities of shard-started and failed-shard tasks are optimized to avoid prolonged index unavailability. | 6.8.2, 7.10.1, 7.14.2 |
| Stability | Cluster scalability is optimized: the supported number of shards and nodes is greatly increased, metadata change performance is improved many times over, and cluster restart performance is multiplied. | 7.14.2 |
| Security | The Log4j vulnerability is fixed. | All versions |
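
Both `preference` pinning and the `timeout` parameter referenced above are standard Elasticsearch request options. The sketch below is a hedged illustration, not an official Tencent Cloud sample; the endpoint, index, query, and custom string are placeholders. It shows how a client can combine `_shards` with a custom preference string and attach a query `timeout`:

```python
# Minimal sketch: pin a query to specific shards with a stable custom
# preference string, and set a server-side query timeout.
# Endpoint, index, credentials, and values are placeholders.
import requests

ES_ENDPOINT = "http://localhost:9200"   # placeholder cluster address
INDEX = "my-index"                       # placeholder index name

body = {
    "timeout": "2s",                               # server-side query timeout
    "query": {"match": {"message": "error"}},
}

resp = requests.post(
    f"{ES_ENDPOINT}/{INDEX}/_search",
    # _shards restricts the query to the listed shard IDs; the custom string
    # after the pipe keeps repeated requests on the same shard copies, which
    # is what keeps scores stable between primary and replica.
    params={"preference": "_shards:0,1|session-42"},
    json=body,
    timeout=10,                                    # client-side HTTP timeout
)
print(resp.json())
```

Reusing the same custom string across requests routes them to the same shard copies, which is why scoring results stay consistent from query to query.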

Major optimizations in February 2021:

| Optimization Dimension | Optimization Category | Optimization Policy | Supported Versions |
| --- | --- | --- | --- |
| Performance | Write performance | Shard-targeted routing is optimized, solving the long-tail shard issue during writes in single-index, multi-shard scenarios. This also increases write throughput by over 10% and reduces CPU usage by over 20%. | 6.8.2, 7.5.1, 7.10.1 |
| Performance | Query performance | Query performance is improved by over 10% by cropping the query results instead of using `filter_path` (see the example after this table). | 6.8.2, 7.5.1, 7.10.1 |
| Stability | Memory | Node crashes and cluster avalanches caused by highly concurrent writes and large queries are significantly reduced, and overall availability is increased to 99.99%: high-concurrency write traffic is throttled at the Netty network layer based on memory resources; memory consumed by failed query and write requests is quickly reclaimed to avoid memory leaks; the proprietary single-request circuit breaker prevents a large query from occupying excessive resources; based on GC management, nodes whose memory is completely exhausted are automatically restarted in time; and the Lucene file memory mapping model can be configured to improve memory usage in different business scenarios. | 6.8.2, 7.5.1, 7.10.1 |
| Stability | JDK, GC | Tencent's proprietary Kona JDK 11 is adopted and known JDK bugs are fixed, improving serial full GC capabilities. You can switch to the G1 collector to improve GC efficiency and reduce latency glitches caused by old-generation GC. | 6.8.2, 7.5.1, 7.10.1 |
| Stability | Metadata performance | The priority of mapping update tasks is optimized, solving the issue where nodes cannot work properly because a high number of concurrent mapping update tasks floods the queue with requests. Metadata is stored asynchronously and metadata synchronization performance is improved to avoid frequent timeouts of index creation and mapping updates. | 6.8.2, 7.5.1, 7.10.1 |
| Costs | Storage | The zstd compression algorithm is adopted, increasing the compression ratio by 30% to 50% and compression performance by 30%. | 6.8.2, 7.5.1, 7.10.1 |
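
For reference, the `filter_path` parameter mentioned in the query performance row is the standard open-source way to trim a search response on the coordinating node. The snippet below is a minimal sketch of that standard mechanism only (it does not show the proprietary kernel-level cropping); the endpoint and index name are placeholders:

```python
# Minimal sketch of standard Elasticsearch `filter_path` response filtering
# (the open-source approach referenced in the query performance row above).
# Endpoint and index name are placeholders.
import requests

resp = requests.post(
    "http://localhost:9200/my-index/_search",
    # Only the whitelisted JSON paths are kept in the response, which shrinks
    # the payload returned to the client.
    params={"filter_path": "hits.total,hits.hits._id,hits.hits._score"},
    json={"query": {"match_all": {}}},
    timeout=10,
)
print(resp.json())
```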

Major optimizations made since the ES team restarted its kernel research and development, as of July 2020:

| Optimization Dimension | Optimization Category | Optimization Policy | Supported Versions |
| --- | --- | --- | --- |
| Performance | Write performance | The translog lock mechanism is optimized, increasing overall write performance by 20%. Write deduplication and segment file cropping are optimized, increasing the performance of writes with primary keys by over 50%. | 7.5.1, 7.10.1 |
| Performance | Query performance | Aggregation performance is optimized, making query pruning more efficient and improving composite aggregation performance by 3 to 7 times in sorting scenarios. The query cache is optimized by no longer caching data with high overheads and low hit rates, reducing query glitches from 750 ms to 50 ms in actual use cases. Merge policies are optimized: proprietary merge policies based on time series and size similarity, plus an automatic warm-shard merge policy, improve query performance by over 40% in search scenarios. Sequence capture in the query fetch phase is optimized, increasing the cache hit rate and improving performance by over 10% in scenarios with large result sets. | 6.4.3, 6.8.2, 7.5.1, 7.10.1 |
| Stability | Availability | Traffic can be throttled smoothly at the access layer. The coordinating node performs memory bloat estimation after receiving results returned by data nodes to check whether the estimated memory will exceed the limit. Result sets of large aggregation queries are checked in a streaming manner, and requests are canceled once the memory used reaches the threshold. The proprietary single-request circuit breaker prevents a large query from occupying excessive resources and affecting other queries. As a result, node crashes and cluster avalanches caused by highly concurrent writes and large queries are significantly reduced, and overall availability is increased to 99.99%. | 6.4.3, 6.8.2, 7.5.1, 7.10.1 |
| Stability | Balancing policy | Balancing policies based on index and node distribution are introduced, alleviating the serious uneven allocation of shards when new nodes are added to the cluster. The uneven allocation of shards across multiple disks (multiple data directories) is also alleviated, and the balance of shards of newly created indices in cluster scale-out and multi-disk scenarios is improved, reducing Ops costs. | 5.6.4, 6.4.3, 6.8.2, 7.5.1, 7.10.1 |
| Stability | Rolling restart speed | The logic of reusing local data for shards when a node restarts is optimized, and the restoration of shard replicas within a scheduled delay period can be precisely controlled (see the example after this table). The time to restart a single node in a large cluster is reduced from over 10 minutes to 1 minute. | 6.4.3, 6.8.2, 7.5.1, 7.10.1 |
| Stability | Online master switch | The proprietary online master switch feature allows you to switch the master online in seconds by specifying the preferred master through APIs. Typical use cases: during manual Ops, you can switch online from the current heavily loaded master to a node with a higher specification and a lower load; during a rolling restart, you can restart the master node last and quickly switch the master role to another node before the restart, reducing the service interruption from minutes to seconds. | 6.4.3, 6.8.2, 7.5.1, 7.10.1 |
| Costs | Memory | The proprietary off-heap cache enables FST off-heap optimization with a controllable FST reclaim policy, a precise eviction policy that improves the cache hit rate, and zero-copy, multi-level caches that guarantee high access performance. Heap memory overheads are significantly reduced, GC time is decreased by over 10%, and the disk capacity of a single node can reach 50 TB, with read/write performance largely unaffected. | 6.8.2, 7.5.1, 7.10.1 |
| Costs | Storage | The proprietary ID field-based row storage cropping algorithm reduces storage overheads by over 20% in time series scenarios. | 5.6.4, 6.4.3, 6.8.2, 7.5.1, 7.10.1 |
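
The rolling restart row mentions precisely controlling the restoration of shard replicas within a scheduled delay period. In open-source Elasticsearch, the closest standard building block is the `index.unassigned.node_left.delayed_timeout` setting; the sketch below shows that standard setting only, not the proprietary control, and the endpoint and index name are placeholders:

```python
# Minimal sketch of the standard delayed-allocation setting related to the
# rolling restart row above: delaying replica re-allocation after a node
# leaves gives the restarting node time to rejoin and reuse its local shard
# data instead of triggering a full recovery from other nodes.
# Endpoint and index name are placeholders.
import requests

resp = requests.put(
    "http://localhost:9200/my-index/_settings",
    json={"index.unassigned.node_left.delayed_timeout": "5m"},  # wait 5 minutes
    timeout=10,
)
print(resp.json())   # expect {"acknowledged": true} on success
```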