This article describes why Pods remain in the Pending status and how to troubleshoot each cause.
A Pending Pod has not been scheduled to a node. Use kubectl describe pod <pod-name> to look up event information, which can be used to analyze the cause.
$ kubectl describe pod tikv-0
...
Events:
  Type     Reason            Age                 From               Message
  ----     ------            ----                ----               -------
  Warning  FailedScheduling  3m (x106 over 33m)  default-scheduler  0/4 nodes are available: 1 node(s) had no available volume zone, 2 Insufficient cpu, 3 Insufficient memory.
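A Pod stays Pending when no node has enough unallocated resources for it, most commonly CPU or memory, as the Insufficient cpu and Insufficient memory messages in the events above indicate.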
Run the following command to query resource allocation information for further analysis:
kubectl describe node <node-name>
Focus on the following items in the output to judge whether the node has sufficient resources:
Allocatable: the total amount of resources on the node that Pods can request.
Allocated resources: the resources that have already been allocated, that is, the sum of the Requests of all Pods on the node.
The remaining resources of a node equal Allocatable minus Allocated resources. If the remainder is less than the Request of the Pod, the node cannot accommodate the Pod, so the scheduler filters out the node in the Predicates stage and the Pod is not scheduled to it.
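The comparison described above can be sketched in a few lines of Python. This is only an illustration of the arithmetic, not the scheduler's actual code. The Resources type, the units (millicores and MiB), and the sample figures are all assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass
class Resources:
    """CPU in millicores and memory in MiB, so the arithmetic stays in integers."""
    cpu_millicores: int
    memory_mib: int


def remaining(allocatable: Resources, allocated: Resources) -> Resources:
    """Remaining = Allocatable minus Allocated resources (the sum of Pod Requests)."""
    return Resources(
        cpu_millicores=allocatable.cpu_millicores - allocated.cpu_millicores,
        memory_mib=allocatable.memory_mib - allocated.memory_mib,
    )


def fits(pod_request: Resources, free: Resources) -> bool:
    """A Pod fits only if every requested resource is within what remains."""
    return (pod_request.cpu_millicores <= free.cpu_millicores
            and pod_request.memory_mib <= free.memory_mib)


# Hypothetical figures, not taken from a real node.
allocatable = Resources(cpu_millicores=3900, memory_mib=7000)
allocated = Resources(cpu_millicores=3500, memory_mib=4000)
free = remaining(allocatable, allocated)

print(free)
print(fits(Resources(500, 1024), free))  # CPU request exceeds the remainder
print(fits(Resources(250, 1024), free))  # both requests are within the remainder
```

Note that a single resource falling short is enough to rule out the node, which is why the event message reports CPU and memory separately.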
If the nodeSelector of a Pod specifies a label, the scheduler will only schedule the Pod to a node with that label. If no such node exists, the Pod will not be scheduled. For more information, refer to the official Kubernetes website.
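As a rough illustration of the nodeSelector rule, the following Python sketch checks whether a node carries every label the selector asks for. The function name and the sample labels are hypothetical. The real scheduler applies this check together with its other filters.

```python
from typing import Dict


def matches_node_selector(node_selector: Dict[str, str],
                          node_labels: Dict[str, str]) -> bool:
    """Every key/value pair in the selector must be present on the node."""
    return all(node_labels.get(key) == value
               for key, value in node_selector.items())


# Hypothetical labels for illustration only.
selector = {"disktype": "ssd"}
node_with_label = {"kubernetes.io/hostname": "host1", "disktype": "ssd"}
node_without_label = {"kubernetes.io/hostname": "host2"}

print(matches_node_selector(selector, node_with_label))
print(matches_node_selector(selector, node_without_label))
```

If no node in the cluster passes this check, the Pod remains Pending.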
If the Pod has affinity configured and the scheduler cannot find a node that satisfies the affinity conditions, the Pod is not scheduled. Affinity has the following types:
nodeAffinity: affinity to nodes. You can think of this as an enhanced version of nodeSelector. It limits the Pod to the nodes that meet certain conditions.
podAffinity: affinity to Pods. This schedules related Pods to the same node or nodes in the same availability zone.
podAntiAffinity: anti-affinity to Pods. This prevents Pods of the same type from being scheduled to the same place, which avoids a single point of failure. For example, you can schedule the Pods that provide the cluster DNS service to different nodes, so that the failure of a single node does not bring down DNS and interrupt business.
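To make the effect of podAntiAffinity more concrete, here is a simplified Python sketch of a required anti-affinity rule with a topology key. It is an assumption-laden model rather than the scheduler's implementation: the selector is a plain label map, namespaces are ignored, and a node that lacks the topology label is treated as unconstrained. The function names and sample data are hypothetical.

```python
from typing import Dict, List, Tuple

Labels = Dict[str, str]


def selector_matches(selector: Labels, labels: Labels) -> bool:
    """True if the labels contain every key/value pair of the selector."""
    return all(labels.get(k) == v for k, v in selector.items())


def allowed_nodes(nodes: Dict[str, Labels],
                  running_pods: List[Tuple[str, Labels]],
                  selector: Labels,
                  topology_key: str) -> List[str]:
    """Return nodes whose topology domain hosts no Pod matching the selector."""
    occupied = set()
    for node_name, pod_labels in running_pods:
        if selector_matches(selector, pod_labels):
            domain = nodes[node_name].get(topology_key)
            if domain is not None:
                occupied.add(domain)
    return [name for name, labels in nodes.items()
            if labels.get(topology_key) not in occupied]


# Hypothetical cluster: two nodes in zone-1, one node in zone-2.
nodes = {
    "node-a": {"kubernetes.io/hostname": "node-a", "topology.kubernetes.io/zone": "zone-1"},
    "node-b": {"kubernetes.io/hostname": "node-b", "topology.kubernetes.io/zone": "zone-1"},
    "node-c": {"kubernetes.io/hostname": "node-c", "topology.kubernetes.io/zone": "zone-2"},
}
running_pods = [("node-a", {"app": "dns"})]

per_node = allowed_nodes(nodes, running_pods, {"app": "dns"}, "kubernetes.io/hostname")
per_zone = allowed_nodes(nodes, running_pods, {"app": "dns"}, "topology.kubernetes.io/zone")
print(per_node)
print(per_zone)
```

The wider the topology domain, the fewer nodes remain eligible. When the list is empty, the Pod cannot be scheduled and stays Pending.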
If a node has taints for which the Pod has no corresponding tolerations, the Pod will not be scheduled to that node. You can use kubectl describe node <node-name> to query existing node taints, as shown below:
$ kubectl describe nodes host1
...
Taints:             special=true:NoSchedule
...
You can add taints automatically or manually. For more information, refer to Adding Taints.
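This article provides the following solutions. Solution 2 is the most often used.
Solution 1: remove the taint from the node. The following command removes the taint with the key special from host1:
kubectl taint nodes host1 special-
Solution 2: add a toleration for the taint to the Pod. The following uses a Deployment named nginx as an example to describe how to add tolerations:
Run the following command to edit the Deployment:
kubectl edit deployment nginx
Add tolerations to the Pod spec in the template section. The following adds a toleration for the existing taint special. The result is shown below:
tolerations:
- key: "special"
  operator: "Equal"
  value: "true"
  effect: "NoSchedule"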
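The way a toleration is matched against a taint can be sketched in Python as follows. This is a simplified model of the matching rules (key, operator, value, effect), not code from Kubernetes. The Taint and Toleration classes and the schedulable helper are hypothetical names introduced for illustration.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Taint:
    key: str
    value: str
    effect: str


@dataclass(frozen=True)
class Toleration:
    key: str = ""            # an empty key matches any taint key
    operator: str = "Equal"  # "Equal" or "Exists"
    value: str = ""
    effect: str = ""         # an empty effect matches any effect


def tolerates(toleration: Toleration, taint: Taint) -> bool:
    """True if the toleration matches the taint."""
    if toleration.effect and toleration.effect != taint.effect:
        return False
    if toleration.key and toleration.key != taint.key:
        return False
    if toleration.operator == "Exists":
        return True
    return toleration.operator == "Equal" and toleration.value == taint.value


def schedulable(taints: List[Taint], tolerations: List[Toleration]) -> bool:
    """Every NoSchedule or NoExecute taint must be tolerated by some toleration."""
    blocking = [t for t in taints if t.effect in ("NoSchedule", "NoExecute")]
    return all(any(tolerates(tol, t) for tol in tolerations) for t in blocking)


node_taints = [Taint(key="special", value="true", effect="NoSchedule")]
pod_toleration = Toleration(key="special", operator="Equal",
                            value="true", effect="NoSchedule")

print(schedulable(node_taints, []))                # no toleration: node is excluded
print(schedulable(node_taints, [pod_toleration]))  # toleration from the example above
```

With the toleration from the example, the taint special=true:NoSchedule no longer excludes host1 for this Pod.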
There is a bug in earlier versions of kube-scheduler that causes Pods to remain in the Pending status. You can solve the issue by upgrading kube-scheduler.
Check whether kube-scheduler on the Master is running properly. If not, restarting it may solve the problem.
If a node fails after a service is deployed, the Pod is evicted and a new Pod is created and scheduled to another node. Pods with disks mounted are usually rescheduled to nodes in the same availability zone as the drained node and the disks. However, if the cluster has no node in that availability zone that meets the scheduling requirements, these Pods are not scheduled, even if nodes in other availability zones meet the requirements.
The reason that Pods with disks mounted cannot be scheduled to nodes in other availability zones is as follows:
Cloud disks can be dynamically mounted to different machines in the same IDC. However, to avoid severe I/O degradation caused by network latency, they cannot be mounted to machines in other IDCs.
Use the following command to add taints manually:
$ kubectl taint node host1 special=true:NoSchedule
node "host1" tainted
In some cases, you may not want Pods to be scheduled to a new node before certain configurations are finished. In this case, you can add a taint called node.kubernetes.io/unschedulable to the node.
In Kubernetes v1.12, the TaintNodesByCondition feature is in beta. With this feature, the controller manager checks the conditions of a node, and if a condition indicating that the node is not running properly is met, the corresponding taint is added automatically.
For example, if the condition OutOfDisk=true is met, a taint called node.kubernetes.io/out-of-disk is added to the node.
Conditions and corresponding taints:
Condition           Value    Taint
---------           -----    -----
OutOfDisk           True     node.kubernetes.io/out-of-disk
Ready               False    node.kubernetes.io/not-ready
Ready               Unknown  node.kubernetes.io/unreachable
MemoryPressure      True     node.kubernetes.io/memory-pressure
PIDPressure         True     node.kubernetes.io/pid-pressure
DiskPressure        True     node.kubernetes.io/disk-pressure
NetworkUnavailable  True     node.kubernetes.io/network-unavailable
The following are descriptions of the conditions:
If OutOfDisk is True, the node is out of storage space.
If Ready is False, the node is unhealthy.
If Ready is Unknown, the node is unreachable. If a node does not report to controller-manager within the time defined by node-monitor-grace-period (40s by default), it is marked as Unknown.
If MemoryPressure is True, the node has little available memory.
If PIDPressure is True, the node has too many processes running and is running out of PIDs.
If DiskPressure is True, the node has little available storage space.
If NetworkUnavailable is True, the node cannot communicate with other Pods because the network is not properly configured.
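The mapping from conditions to taints listed in the table above can be expressed as a simple lookup. The following Python sketch only mirrors that table. The taints_for function and the sample node conditions are hypothetical, and the real controller manager logic is more involved.

```python
from typing import Dict, List

# (condition, value) -> taint, copied from the table above.
CONDITION_TAINTS = {
    ("OutOfDisk", "True"): "node.kubernetes.io/out-of-disk",
    ("Ready", "False"): "node.kubernetes.io/not-ready",
    ("Ready", "Unknown"): "node.kubernetes.io/unreachable",
    ("MemoryPressure", "True"): "node.kubernetes.io/memory-pressure",
    ("PIDPressure", "True"): "node.kubernetes.io/pid-pressure",
    ("DiskPressure", "True"): "node.kubernetes.io/disk-pressure",
    ("NetworkUnavailable", "True"): "node.kubernetes.io/network-unavailable",
}


def taints_for(conditions: Dict[str, str]) -> List[str]:
    """Return the taints that correspond to the given node conditions."""
    return [CONDITION_TAINTS[(name, value)]
            for name, value in conditions.items()
            if (name, value) in CONDITION_TAINTS]


# Hypothetical node state for illustration.
conditions = {"Ready": "Unknown", "MemoryPressure": "False", "DiskPressure": "True"}
print(taints_for(conditions))
```

A healthy node (Ready is True and all pressure conditions are False) maps to no taints in this model.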
Taints are added if the above conditions are met. TKE also provides a process for automatically adding taints for uninitialized nodes:
A taint called node.cloudprovider.kubernetes.io/uninitialized is automatically added to a new node and removed once the node is successfully initialized. This prevents Pods from being scheduled to an uninitialized node.