Pod Remains in Terminating

Last updated: 2024-12-13 14:48:39

This article describes the possible causes of a Pod remaining in the Terminating status and how to troubleshoot each of them.
Possible Causes
Insufficient disk space
Files with the i attribute exist
A bug in Docker version 17
Finalizers exist
A bug in earlier versions of kubelet list-watch
dockerd status and containerd status are not in sync
A bug in DaemonSet Controller
Troubleshooting
Checking if disk space is sufficient
If the disk where the Docker data directory resides is full, Docker cannot function properly: it cannot create or delete containers, so it cannot respond to kubelet's calls to delete containers. Run kubectl describe pod <pod-name> to view the Pod events. Messages similar to the following are displayed:
Normal  Killing  39s (x735 over 15h)  kubelet, 10.179.80.31  Killing container with id docker://apigateway:Need to kill Pod
For solutions and more information, see Disk Full.
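As a quick first check, the following sketch reports how full the filesystem holding the Docker data directory is. The function names disk_usage_pct and check_docker_disk and the 90% default threshold are illustrative assumptions, not part of TKE.

```shell
# Print the usage percentage (0-100) of the filesystem that holds a directory.
disk_usage_pct() {
    # df -P prints one POSIX-format line per filesystem; column 5 is "Capacity" (e.g. 42%).
    df -P "$1" | awk 'NR == 2 { sub(/%/, "", $5); print $5 }'
}

# Warn when the filesystem holding a directory is at or above a threshold.
# The default of 90 is an assumed value; choose one that suits your nodes.
check_docker_disk() {
    dir=$1
    threshold=${2:-90}
    pct=$(disk_usage_pct "$dir")
    if [ "$pct" -ge "$threshold" ]; then
        echo "WARNING: $dir is ${pct}% full"
        return 1
    fi
    echo "OK: $dir is ${pct}% full"
}

# Usage on a node (requires a Docker daemon that still answers "docker info"):
#   check_docker_disk "$(docker info --format '{{.DockerRootDir}}')"
```

If Docker no longer responds, pass the data directory path directly instead, for example the /data/docker path that appears in the kubelet logs later in this article.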
Checking to see if files with the i attribute exist
Error description
Use man chattr to display a description of the i attribute, as shown below:
       A file with the 'i' attribute cannot be modified: it cannot be deleted or renamed, no link can be created to this file and no data can be written to the file.  Only the superuser or a process possessing the CAP_LINUX_IMMUTABLE capability can set or clear this attribute.
Note: 
If the container image file itself or files stored in the container have the i attribute, they cannot be modified or deleted. 
When Pods are deleted, container directories are cleaned. If the directories have files that cannot be deleted, the directories cannot be deleted, which causes the Pods to remain in the Terminating status. In this case, kubelet displays the following error message:
Sep 27 14:37:21 VM_0_7_centos kubelet[14109]: E0927 14:37:21.922965   14109 remote_runtime.go:250] RemoveContainer "19d837c77a3c294052a99ff9347c520bc8acb7b8b9a9dc9fab281fc09df38257" from runtime service failed: rpc error: code = Unknown desc = failed to remove container "19d837c77a3c294052a99ff9347c520bc8acb7b8b9a9dc9fab281fc09df38257": Error response from daemon: container 19d837c77a3c294052a99ff9347c520bc8acb7b8b9a9dc9fab281fc09df38257: driver "overlay2" failed to remove root filesystem: remove /data/docker/overlay2/b1aea29c590aa9abda79f7cf3976422073fb3652757f0391db88534027546868/diff/usr/bin/bash: operation not permitted
Sep 27 14:37:21 VM_0_7_centos kubelet[14109]: E0927 14:37:21.923027   14109 kuberuntime_gc.go:126] Failed to remove container "19d837c77a3c294052a99ff9347c520bc8acb7b8b9a9dc9fab281fc09df38257": rpc error: code = Unknown desc = failed to remove container "19d837c77a3c294052a99ff9347c520bc8acb7b8b9a9dc9fab281fc09df38257": Error response from daemon: container 19d837c77a3c294052a99ff9347c520bc8acb7b8b9a9dc9fab281fc09df38257: driver "overlay2" failed to remove root filesystem: remove /data/docker/overlay2/b1aea29c590aa9abda79f7cf3976422073fb3652757f0391db88534027546868/diff/usr/bin/bash: operation not permitted
Solution
Permanent solution: do not include files with the i attribute in container images, and do not set the i attribute on files in a running container.
Temporary solution:
1. Take the file path from the kubelet log and run chattr -i <file> on it, as shown below:
chattr -i /data/docker/overlay2/b1aea29c590aa9abda79f7cf3976422073fb3652757f0391db88534027546868/diff/usr/bin/bash
2. Wait for kubelet to retry the removal. The Pod can then be deleted.
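When several files are affected, it can help to list them all before clearing the attribute. The following is a minimal sketch that filters the output of lsattr -R; the function names filter_immutable and find_immutable are hypothetical, and <overlay2-dir> stands for the directory shown in your own kubelet log.

```shell
# Filter "lsattr -R" output (read from stdin) down to the paths that carry the i attribute.
filter_immutable() {
    # Entry lines look like "----i---------e------- /path/to/file".
    # Directory headers ("/path/to/dir:") and blank lines fail the first pattern.
    awk '$1 ~ /^[-A-Za-z]+$/ && $1 ~ /i/ { sub(/^[^ ]+ +/, ""); print }'
}

# List immutable files under a directory, such as a container's overlay2 diff directory.
find_immutable() {
    lsattr -R "$1" 2>/dev/null | filter_immutable
}

# Usage (as root on the node):
#   find_immutable <overlay2-dir>/diff
#   find_immutable <overlay2-dir>/diff | while IFS= read -r f; do chattr -i "$f"; done
```

Review the list before piping it into chattr -i, because clearing the attribute changes files on the node.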
Checking for the bug in Docker Version 17
Error description
Docker hangs without any response. Running kubectl describe pod <pod-name> returns the following results:
Warning FailedSync 3m (x408 over 1h) kubelet, 10.179.80.31 error determining status: rpc error: code = DeadlineExceeded desc = context deadline exceeded
The likely cause is a bug in Docker version 17. You can run kubectl -n cn-staging delete pod apigateway-6dc48bf8b6-clcwk --force --grace-period=0 to force delete the Pod, but the container still appears in the output of docker ps.
Solution
Upgrade Docker to version 18, which ships a newer dockerd with many bug fixes.
If the problem persists, submit a ticket for further assistance. We do not recommend that you force delete the pod as this may impact your business.
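To confirm that a node matches this case, you can check whether dockerd answers within a time limit and which major version is installed. The sketch below is only an illustration: the function names and the 10-second limit are assumptions, and it reads the client version from docker --version (which does not contact the daemon) on the assumption that the client and dockerd come from the same package.

```shell
# Extract the major version from a Docker version string such as "17.12.1-ce".
docker_major() {
    echo "${1%%.*}"
}

# Run a command and fail if it does not finish within the given number of seconds.
responds_within() {
    secs=$1
    shift
    timeout "$secs" "$@" >/dev/null 2>&1
}

# Usage on a node:
#   responds_within 10 docker ps || echo "dockerd did not respond within 10s"
#   ver=$(docker --version | awk '{ print $3 }')    # e.g. "17.12.1-ce,"
#   [ "$(docker_major "$ver")" = "17" ] && echo "Docker 17 detected"
```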
Checking for Finalizers
Error description
If a Kubernetes resource has finalizers in its metadata, the resource was usually created by an application, and the finalizers field contains an identifier of that application. For example, resources created by Rancher carry finalizers identifiers.
Such a resource can be deleted only after the responsible application has cleaned it up and removed its finalizers identifiers.
Solution
Use kubectl edit to manually remove the finalizers from the resource, and then delete the resource.
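As a non-interactive alternative to kubectl edit, the following sketch inspects and clears the finalizers of a Pod with kubectl get and kubectl patch. The function names show_finalizers and clear_finalizers are hypothetical. Clearing finalizers skips whatever cleanup the owning application intended, so treat it as a last resort.

```shell
# Print the finalizers set on a Pod (empty output means none).
# Arguments: <namespace> <pod-name>
show_finalizers() {
    kubectl -n "$1" get pod "$2" -o jsonpath='{.metadata.finalizers}'
}

# Remove all finalizers from a Pod with a merge patch.
# Only do this after confirming that the owning application no longer needs to clean up.
clear_finalizers() {
    kubectl -n "$1" patch pod "$2" --type=merge -p '{"metadata":{"finalizers":null}}'
}

# Usage:
#   show_finalizers <namespace> <pod-name>
#   clear_finalizers <namespace> <pod-name>
```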
Checking for the bug in earlier versions of kubelet list-watch
In Kubernetes v1.8.13, kubelet list-watch has a bug that prevents kubelet from receiving the event after a Pod is deleted, so the Pod is never actually removed and remains in the Terminating status.
Refer to Updating Clusters for instructions on how to update Kubernetes.
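To see whether any node runs the affected kubelet version, you can list the kubelet version of each node and filter the result. The sketch below is an illustration only: nodes_with_kubelet is a hypothetical helper, and it uses a prefix match on the assumption that a build may append a vendor suffix to the version string.

```shell
# Read "NAME VERSION" lines on stdin and print the nodes whose kubelet version
# starts with the given string.
nodes_with_kubelet() {
    awk -v want="$1" 'index($2, want) == 1 { print $1 }'
}

# Usage:
#   kubectl get nodes --no-headers \
#       -o custom-columns=NAME:.metadata.name,KUBELET:.status.nodeInfo.kubeletVersion \
#       | nodes_with_kubelet v1.8.13
```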
Checking if dockerd and containerd are synchronized
Error description
If you use the AUFS storage driver and the disk is full, the kernel may panic and output the following error message:
aufs au_opts_verify:1597:dockerd[5347]: dirperm1 breaks the protection by the permission bits on the lower branch
If this happens, it may lead to status synchronization issues, and dockerd logs may contain records similar to the following:
Sep 18 10:19:49 VM-1-33-ubuntu dockerd[4822]: time="2019-09-18T10:19:49.903943652+08:00" level=error msg="Failed to log msg \"\" for logger json-file: write /opt/docker/containers/54922ec8b1863bcc504f6dac41e40139047f7a84ff09175d2800100aaccbad1f/54922ec8b1863bcc504f6dac41e40139047f7a84ff09175d2800100aaccbad1f-json.log: no space left on device"
Analysis
Use the following methods to find out whether dockerd and containerd are in sync.
Use kubectl describe pod to obtain the ID of the container. Then, use docker ps to query the status of the container as reported by dockerd.
Use docker-container-ctr to query the container status in containerd, as shown below:
$ docker-container-ctr --namespace moby --address /var/run/docker/containerd/docker-containerd.sock task ls |grep a9a1785b81343c3ad2093ad973f4f8e52dbf54823b8bb089886c8356d4036fe0
a9a1785b81343c3ad2093ad973f4f8e52dbf54823b8bb089886c8356d4036fe0    30639    STOPPED
If the container status in containerd is STOPPED or empty while dockerd reports the container as running, the container status is not synchronized between dockerd and containerd.
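The comparison rule above can be written as a small helper. This is a sketch under assumptions: status_mismatch is a hypothetical name, the dockerd status is taken from docker inspect, and the containerd status is the third column of the matching task ls line (empty when no line matches).

```shell
# Return 0 (mismatch) when dockerd reports the container as running
# while containerd reports it as STOPPED or does not list it at all.
status_mismatch() {
    dockerd_status=$1      # e.g. from: docker inspect --format '{{.State.Status}}' <container-id>
    containerd_status=$2   # third column of the "task ls" line, or empty if nothing matched
    [ "$dockerd_status" = "running" ] || return 1
    case "$containerd_status" in
        ""|STOPPED) return 0 ;;
        *) return 1 ;;
    esac
}

# Usage:
#   status_mismatch running STOPPED && echo "dockerd and containerd are out of sync"
```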
Solution
Temporary solution: run docker container prune or restart dockerd.
Permanent solution: use containerd alone as the container runtime, instead of dockerd together with containerd, to work around the bug in dockerd.
Checking for the DaemonSet Controller bug
Kubernetes 1.10 and 1.11 have a bug that causes DaemonSet Pods to remain in the Terminating status. DaemonSet Controller reuses the predicates logic of the scheduler, which sorts the nodeSelector array from nodeAffinity in place because the array is passed by pointer. As a result, the spec differs from the one stored by apiserver. At the same time, DaemonSet Controller uses the spec to calculate the hash of the DaemonSet for version control purposes.
This difference in values causes the Pods to get stuck in a loop of launching and stopping.
Solution
Temporary solution: make sure that DaemonSets using the RollingUpdate strategy use nodeSelector rather than nodeAffinity.
Permanent solution: refer to Updating Clusters for instructions on how to update Kubernetes to 1.12.
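To find DaemonSets that the temporary solution applies to, you can list each DaemonSet's update strategy and nodeAffinity and filter the result. The following is a minimal sketch; rolling_ds_with_affinity is a hypothetical helper, and it assumes that kubectl prints an empty field when a DaemonSet has no nodeAffinity.

```shell
# Read tab-separated "namespace/name<TAB>strategy<TAB>nodeAffinity" lines on stdin and
# print the DaemonSets that use RollingUpdate together with a non-empty nodeAffinity.
rolling_ds_with_affinity() {
    awk -F '\t' '$2 == "RollingUpdate" && $3 != "" { print $1 }'
}

# Usage:
#   kubectl get daemonset --all-namespaces -o jsonpath='{range .items[*]}{.metadata.namespace}/{.metadata.name}{"\t"}{.spec.updateStrategy.type}{"\t"}{.spec.template.spec.affinity.nodeAffinity}{"\n"}{end}' \
#       | rolling_ds_with_affinity
```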
