List of Monitoring and Alarm Metrics

Last updated: 2019-10-18 18:52:43

    Monitoring

    TKE currently provides monitoring metrics for the following dimensions. Each metric is reported as the average value over the statistical granularity.
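    As an illustration of what "average within the granularity" means, the sketch below groups raw samples into fixed windows and averages each window. The sample format, the function name average_by_granularity, and the 60-second granularity are assumptions made for this example; they are not part of the TKE API.

```python
from collections import defaultdict
from statistics import mean


def average_by_granularity(samples, granularity_s):
    """Group (timestamp_s, value) samples into fixed windows and average each window.

    Hypothetical helper for illustration only; not a TKE API.
    """
    buckets = defaultdict(list)
    for ts, value in samples:
        # Align each timestamp to the start of the window it falls in.
        window_start = ts - (ts % granularity_s)
        buckets[window_start].append(value)
    return {start: mean(values) for start, values in sorted(buckets.items())}


# Hypothetical CPU utilization samples (%), one every 20 seconds.
samples = [(0, 10.0), (20, 20.0), (40, 30.0), (60, 50.0), (80, 70.0), (100, 90.0)]
averaged = average_by_granularity(samples, granularity_s=60)
print(averaged)  # {0: 20.0, 60: 70.0}
```

    With a 60-second granularity, six raw samples collapse into two reported points, one per window.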

    Monitoring Metrics for Clusters

    Monitoring Metric Unit Description
    CPU Utilization % CPU utilization rate of the entire cluster
    MEM Utilization % Memory utilization rate of the entire cluster

    Monitoring Metrics for Master & Etcd and Ordinary Nodes

    Monitoring Metric Unit Description
    Re-startup of Pods restarts Total number of restarts of all pods on the node
    Exception - Node status: normal or exceptional
    CPU Utilization % Ratio of the CPU usage of all pods on the node to the total CPU of the node
    MEM Utilization % Ratio of the memory usage of all pods on the node to the total memory of the node
    Private bandwidth in bps Total private network inbound bandwidth of all pods on the node
    Private bandwidth out bps Total private network outbound bandwidth of all pods on the node
    Public bandwidth in bps Total public network inbound bandwidth of all pods on the node
    Public bandwidth out bps Total public network outbound bandwidth of all pods on the node
    TCP Connections Count connections Number of TCP connections maintained on the node

    For more information on monitoring metrics for cluster nodes, please see Get Monitoring Statistics.

    For more information on monitoring metrics for cluster node data disks, please see Monitoring Cloud Disks.

    Monitoring Metrics for Workloads

    Monitoring Metric Unit Description
    Re-startup of Pods restarts Total number of restarts of all pods in the workload
    CPU Usage cores CPU usage of all pods in the workload
    CPU Utilization (% cluster) % Ratio of the CPU usage of all pods in the workload to the total CPU of the cluster
    MEM Usage B Memory usage of all pods in the workload
    MEM Utilization (% cluster) % Ratio of the memory usage of all pods in the workload to the total memory of the cluster
    Network Inbound Bandwidth bps Total inbound bandwidth of all pods in the workload
    Network Outbound Bandwidth bps Total outbound bandwidth of all pods in the workload
    Network Inbound Traffic B Total inbound traffic of all pods in the workload
    Network Outbound Traffic B Total outbound traffic of all pods in the workload
    Network Inbound Packets packets/sec Total inbound packets of all pods in the workload
    Network Outbound Packets packets/sec Total outbound packets of all pods in the workload

    If the workload provides services outside the cluster, please see Obtaining Monitoring Data for more information on network monitoring metrics for bound services.

    Monitoring Metrics for Pods

    Monitoring Metric Unit Description
    Exception - Pod status: normal or exceptional
    CPU Usage cores CPU usage of the pod
    CPU Utilization (% node) % Ratio of the CPU usage of the pod to the total CPU of the node
    CPU Utilization (% Request) % Ratio of the CPU usage of the pod to the Request value
    CPU Utilization (% of Limit) % Ratio of the CPU usage of the pod to the Limit value
    MEM Usage B Memory usage of the pod, including cache
    MEM Usage (exclude cache) B Actual memory usage (not including cache) of all containers in the pod
    MEM Utilization (% node) % Ratio of the memory usage of the pod to the total memory of the node
    MEM Utilization (% node, exclude cache) % Ratio of the actual memory usage (excluding cache) of all containers in the pod to the total memory of the node
    MEM Utilization (% Request) % Ratio of the memory usage of the pod to the Request value
    MEM Utilization (% Request, exclude cache) % Ratio of the actual memory usage (excluding cache) of all containers in the pod to the Request value
    MEM Utilization (% of Limit) % Ratio of the memory usage of the pod to the Limit value
    MEM Utilization (% limit, exclude cache) % Ratio of the actual memory usage (excluding cache) of all containers in the pod to the Limit value
    Network Inbound Bandwidth bps Total inbound bandwidth of the pod
    Network Outbound Bandwidth bps Total outbound bandwidth of the pod
    Network Inbound Traffic B Total inbound traffic of the pod
    Network Outbound Traffic B Total outbound traffic of the pod
    Network Inbound Packets packets/sec Total inbound packets of the pod
    Network Outbound Packets packets/sec Total outbound packets of the pod
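
    The pod utilization metrics above are all ratios of the same usage figures to different denominators (node capacity, Request, Limit). The sketch below shows one way these percentages could be derived. The PodSample fields and function names are hypothetical, and the assumption that "exclude cache" means usage minus cache is ours rather than a statement about how TKE computes the value.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class PodSample:
    """Hypothetical container for one pod's usage and configured resources."""
    cpu_usage_cores: float
    mem_usage_bytes: int            # memory usage including cache
    mem_cache_bytes: int            # cache portion of mem_usage_bytes
    cpu_request_cores: Optional[float] = None
    cpu_limit_cores: Optional[float] = None
    mem_request_bytes: Optional[int] = None
    mem_limit_bytes: Optional[int] = None


def percent(used, total):
    """Return used/total as a percentage, or None when total is unset or zero."""
    if not total:
        return None
    return 100.0 * used / total


def pod_utilization(pod, node_cpu_cores, node_mem_bytes):
    # Assumption: "exclude cache" is total usage minus cache.
    mem_no_cache = pod.mem_usage_bytes - pod.mem_cache_bytes
    return {
        "cpu_pct_node": percent(pod.cpu_usage_cores, node_cpu_cores),
        "cpu_pct_request": percent(pod.cpu_usage_cores, pod.cpu_request_cores),
        "cpu_pct_limit": percent(pod.cpu_usage_cores, pod.cpu_limit_cores),
        "mem_pct_node": percent(pod.mem_usage_bytes, node_mem_bytes),
        "mem_pct_node_no_cache": percent(mem_no_cache, node_mem_bytes),
        "mem_pct_request": percent(pod.mem_usage_bytes, pod.mem_request_bytes),
        "mem_pct_limit": percent(pod.mem_usage_bytes, pod.mem_limit_bytes),
        "mem_pct_limit_no_cache": percent(mem_no_cache, pod.mem_limit_bytes),
    }


MIB = 1024 * 1024
pod = PodSample(
    cpu_usage_cores=0.5,
    mem_usage_bytes=512 * MIB,
    mem_cache_bytes=128 * MIB,
    cpu_request_cores=0.25,
    cpu_limit_cores=1.0,
    mem_limit_bytes=1024 * MIB,     # no memory Request is set in this example
)
result = pod_utilization(pod, node_cpu_cores=4, node_mem_bytes=8192 * MIB)
print(result["cpu_pct_node"], result["cpu_pct_request"], result["mem_pct_limit_no_cache"])
```

    Note that utilization relative to the Request can exceed 100% (200% for CPU in this example), and that a ratio is undefined when the corresponding Request or Limit is not set.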

    Monitoring Metrics for Containers

    Monitoring Metric Unit Description
    CPU Usage cores CPU usage of the container
    CPU Utilization (% node) % Ratio of the CPU usage of the container to the total CPU of the node
    CPU Utilization (% Request) % Ratio of the CPU usage of the container to the Request value
    CPU Utilization (% Limit) % Ratio of the CPU usage of the container to the Limit value
    MEM Usage B Memory usage of the container, including cache
    MEM Usage (exclude cache) B Actual memory usage of the container (not including cache)
    MEM Utilization (% node) % Ratio of the memory usage of the container to the total memory of the node
    MEM Utilization (% node, exclude cache) % Ratio of the actual memory usage (excluding cache) of the container to the total memory of the node
    MEM Utilization (% request) % Ratio of the memory usage of the container to the Request value
    MEM Utilization (% Request, excl. cache) % Ratio of the actual memory usage (excluding cache) of the container to the Request value
    MEM Utilization (% of Limit) % Ratio of the memory usage of the container to the Limit value
    MEM Utilization (% limit, exclude cache) % Ratio of the actual memory usage (excluding cache) of the container to the Limit value
    Block device read bandwidth B/sec Throughput at which the container reads data from disk
    Block device write bandwidth B/sec Throughput at which the container writes data to disk
    Read IOPS of Block Device operations/sec Number of disk read operations the container performs per second
    Write IOPS of Block Device operations/sec Number of disk write operations the container performs per second

    Alarms

    TKE currently provides alarm metrics for the following dimensions. Each metric is evaluated as the average value over the statistical period.

    Alarm Metrics for Clusters

    Monitoring Metric Unit Description
    CPU Utilization % CPU utilization rate of the entire cluster
    MEM Utilization % Memory utilization rate of the entire cluster
    CPU Allocation % Ratio of the sum of the CPU Requests set for all containers in the cluster to the cluster's total allocable CPU resources
    MEM Allocation % Ratio of the sum of the memory Requests set for all containers in the cluster to the cluster's total allocable memory resources
    Apiserver Normal - Apiserver status. By default, an alarm is triggered when the status value is False. Only self-deployed clusters support this metric.
    Etcd Normal - Etcd status. By default, an alarm is triggered when the status value is False. Only self-deployed clusters support this metric.
    Scheduler Normal - Scheduler status. By default, an alarm is triggered when the status value is False. Only self-deployed clusters support this metric.
    Controller Manager Normal - Controller Manager status. By default, an alarm is triggered when the status value is False. Only self-deployed clusters support this metric.
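
    The CPU Allocation and MEM Allocation metrics compare what containers have requested with what the cluster can allocate, independent of actual usage. A minimal sketch of that ratio follows; the function name allocation_percent and the input lists are hypothetical, and containers without a Request are assumed to contribute zero.

```python
def allocation_percent(container_requests, node_allocatable):
    """Return the sum of container Requests as a percentage of total allocable resources.

    Hypothetical helper for illustration; both arguments are lists in the same
    unit (for example CPU cores, or bytes of memory).
    """
    total_allocatable = sum(node_allocatable)
    if total_allocatable == 0:
        raise ValueError("cluster has no allocable resources")
    return 100.0 * sum(container_requests) / total_allocatable


# Hypothetical cluster: two nodes with 4 allocable cores each, four containers.
cpu_requests_cores = [0.5, 0.25, 1.0, 0.25]
node_allocatable_cores = [4.0, 4.0]
cpu_allocation = allocation_percent(cpu_requests_cores, node_allocatable_cores)
print(cpu_allocation)  # 25.0
```

    A high allocation percentage can coexist with low utilization: the former reflects reserved capacity, the latter reflects what is actually consumed.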

    Alarm Metrics for Nodes

    Monitoring Metric Unit Description
    CPU Utilization % Ratio of the CPU usage of all pods on the node to the total CPU of the node
    MEM Utilization % Ratio of the memory usage of all pods on the node to the total memory of the node
    Re-startup of Pods on This Node Times Total number of restarts of all pods on the node
    Node Ready - Node status. By default, an alarm is triggered when the status value is False.

    For more information on alarm metrics for cluster nodes, please see Get Monitoring Statistics and Create Alarm.

    For more information on alarm metrics for cluster node data disks, please see Monitoring Cloud Disks and Create Alarm.

    Alarm Metrics for Pods

    Monitoring Metric Unit Description
    CPU Utilization (% node) % Ratio of the CPU usage of the pod to the total CPU of the node
    MEM Utilization (% node) % Ratio of the memory usage of the pod to the total memory of the node
    Actual MEM Utilization (% node, exclude cache) % Ratio of the actual memory usage (excluding cache) of all containers in the pod to the total memory of the node
    CPU Utilization (% limit) % Ratio of the CPU usage of the pod to the Limit value
    MEM Utilization (% of Limit) % Ratio of the memory usage of the pod to the Limit value
    Actual MEM Utilization (% of Limit, exclude cache) % Ratio of the actual memory usage (excluding cache) of the pod to the Limit value
    Re-startup of Pods restarts Number of pod restarts
    Pod Ready - Pod status. By default, an alarm is triggered when the status value is False.
    CPU Usage cores CPU usage of the pod
    MEM Usage MB Memory usage of the pod, including cache
    Actual MEM Usage (exclude cache) MB Actual memory usage of all containers in the pod, excluding cache
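
    The two kinds of alarm metrics described in this section can be sketched as follows: numeric metrics are averaged over the statistical period and compared with a threshold, while status metrics such as Pod Ready trigger an alarm when the value is False. The function names and the "greater than" comparison are assumptions for illustration; actual alarm policies are configured in the console and may use other operators.

```python
from statistics import mean


def threshold_alarm(samples, threshold):
    """Return True when the period average of a numeric metric exceeds the threshold.

    Hypothetical rule: the '>' comparison is an assumption for this sketch.
    """
    if not samples:
        return False  # no data in the period, so nothing to evaluate
    return mean(samples) > threshold


def status_alarm(status):
    """Return True when a status metric (for example Pod Ready) is False."""
    return status is False


# Hypothetical CPU utilization samples (%) within one statistical period.
cpu_samples = [70.0, 85.0, 95.0, 90.0]
cpu_alarm = threshold_alarm(cpu_samples, threshold=80.0)   # average is 85.0
ready_alarm = status_alarm(False)
print(cpu_alarm, ready_alarm)  # True True
```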
