
FAQs About Write Rejection (Bulk Reject)

Last updated: 2019-11-15 11:34:37


Issues

In some cases, the bulk rejection rate of a cluster increases, and an error message such as the following appears when bulk writes are performed:

[2019-03-01 10:09:58][ERROR]rspItemError: {"reason":"rejected execution of org.elasticsearch.transport.TransportService$7@5436e129 on EsThreadPoolExecutor[bulk, queue capacity = 1024, org.elasticsearch.common.util.concurrent.EsThreadPoolExecutor@6bd77359[Running, pool size = 12, active threads = 12, queued tasks = 2390, completed tasks = 20018208656]]","type":"es_rejected_execution_exception"}
  • You can see that the bulk rejection rate has increased in Cloud Monitor.
  • You can also view the number of bulk writes that are being rejected or have been rejected by running the following command in the Kibana Console.
    GET _cat/thread_pool/bulk?s=queue:desc&v
    The default queue capacity is generally 1024. If the queue column for a node shows 1024, the queue is full and bulk requests on that node are being rejected.
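  • Another way to view the cumulative rejected count per node is the node stats API (in Elasticsearch 6.3 and later the bulk thread pool is renamed write, so adjust the pool name accordingly), for example:
    GET _nodes/stats/thread_pool?filter_path=nodes.*.name,nodes.*.thread_pool.bulk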

Cause identification

Bulk rejections are typically caused by oversized shards or by uneven shard allocation across nodes. You can identify the specific cause by following the steps below.

1. Check whether the shard size is too large

An oversized shard may result in bulk rejections; therefore, it is recommended to keep each shard between 20 GB and 50 GB. You can view the size of each shard in an index by running the following command in the Kibana console.

GET _cat/shards?index=index_name&v
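
For a quick overview of which indices are the largest, you can also list all indices sorted by store size together with their primary and replica shard counts, for example:

GET _cat/indices?v&h=index,pri,rep,store.size&s=store.size:desc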

2. Check whether the shards are unevenly distributed

Sometimes, shards are unevenly distributed across the nodes in the cluster: some nodes are allocated too many shards, while others are allocated too few.

  • You can check this in Cluster Monitoring > Node Status on the cluster details page in the ES Console. For more information, see Viewing Monitoring Metrics.
  • You can also view the number of shards allocated to each node in the cluster using the curl client (replace $ip and $port with the address and port of any node in the cluster):
    curl "$ip:$port/_cat/shards?index={index_name}&s=node,store:desc" | awk '{print $8}' | sort | uniq -c | sort
    The first column of the output shows the number of shards and the second shows the node name. An uneven distribution shows up as, for example, some nodes holding only one shard while others hold eight.
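  • Alternatively, the _cat/allocation API shows the number of shards and the disk usage per node directly, for example:
    GET _cat/allocation?v&s=shards:desc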

Solution

1. Set the shard size

The number of shards (and therefore the shard size) can be configured using the number_of_shards parameter in an index template. After the template is created, it takes effect only for newly created indices; existing indices are not adjusted.
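
After the template is in place, you can confirm that a newly created index picked up the expected value by querying its settings ({index_name} below is a placeholder for the new index), for example:

GET {index_name}/_settings?filter_path=*.settings.index.number_of_shards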

2. Fix the uneven distribution of shards

  • Temporary solution:
    If you find that shards are allocated unevenly, you can dynamically adjust a specific index by setting the index.routing.allocation.total_shards_per_node parameter. For more information, see the Elasticsearch documentation on total_shards_per_node.

    Reserve a certain buffer when setting total_shards_per_node so that a single machine failure does not make it impossible to allocate all shards. For example, if there are 10 machines and an index has 20 shards, set total_shards_per_node above 2, such as 3.

    Reference command:
    PUT {index_name}/_settings
    {
      "settings": {
        "index": {
          "routing": {
            "allocation": {
              "total_shards_per_node": "3"
            }
          }
        }
      }
    }
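    Since this is a temporary measure, you can remove the limit later by resetting the setting to its default (setting an index-level setting to null clears it), for example:
    PUT {index_name}/_settings
    {
      "index.routing.allocation.total_shards_per_node": null
    }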
  • Long-term solution (configure before an index goes into production):
    Set the number of shards and the number of shards per node through the index template.
    PUT _template/{template_name}
    {
      "order": 0,
      "template": "{index_prefix@}*",  // Prefix of the index to be adjusted
      "settings": {
        "index": {
          "number_of_shards": "30",   // Specify the number of shards allocated to the index based on a shard size of about 30 GB
          "routing.allocation.total_shards_per_node":3  // Specify the maximum number of shards that a node can accommodate
        }
      },
      "aliases": {}
    }
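    After creating the template, you can check its content and verify that newly created matching indices pick up both settings ({template_name} and {new_index_name} are placeholders), for example:
    GET _template/{template_name}
    GET {new_index_name}/_settings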