Monitoring Metric List

Last updated: 2023-11-07 18:09:05
    The monitoring metric list provides the meanings of all metrics, helping you use the monitoring feature in Stream Compute Service.

    Monitoring metric list

    Note
    You can view the following metrics in the TCOP console > Stream Compute Service and configure alarms for them there.
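    Besides viewing them in the console, metrics reported to TCOP can also be pulled programmatically through the Cloud Monitor GetMonitorData API. The sketch below uses the TencentCloud Python SDK; the namespace, metric name, and dimension key shown are assumptions for illustration only, so confirm the exact values for Stream Compute Service in the Cloud Monitor documentation.

```python
# Minimal sketch: pull one Stream Compute Service metric via Cloud Monitor's GetMonitorData.
# The namespace "QCE/OCEANUS", the metric name, and the "JobId" dimension are assumptions;
# check the Cloud Monitor docs for the values that apply to your account and region.
from tencentcloud.common import credential
from tencentcloud.monitor.v20180724 import monitor_client, models

cred = credential.Credential("YOUR_SECRET_ID", "YOUR_SECRET_KEY")
client = monitor_client.MonitorClient(cred, "ap-guangzhou")

dim = models.Dimension()
dim.Name = "JobId"                        # assumed dimension key for a Stream Compute job
dim.Value = "cql-xxxxxxxx"                # placeholder job ID

instance = models.Instance()
instance.Dimensions = [dim]

req = models.GetMonitorDataRequest()
req.Namespace = "QCE/OCEANUS"             # assumed namespace for Stream Compute Service
req.MetricName = "JobRecordsInPerSecond"  # assumed Cloud Monitor name for job_records_in_per_second
req.Period = 60                           # 1-minute data points
req.Instances = [instance]

resp = client.GetMonitorData(req)
print(resp.to_json_string())              # DataPoints carry the timestamps and values
```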
| Metric | Description | Example Value |
| --- | --- | --- |
| job_records_in_per_second | The total number of records the job receives from all sources per second. | 22478.14 Records/s |
| job_records_out_per_second | The total number of records the job emits to all sinks per second. | 12017.09 Records/s |
| job_bytes_in_per_second | The total number of bytes the job receives from all sources (Kafka sources only) per second. | 786576 Byte/s |
| job_bytes_out_per_second | The total number of bytes the job emits to all sinks (Kafka sinks only) per second. | 156872 Byte/s |
| job latency | The total latency for data to flow through all operators. Sampling errors may exist, so the value is for reference only. | 275 ms |
| job_service_delay | The difference between the current timestamp and the watermark at the sink (if there are multiple sinks, the maximum difference is used). | 5432 ms |
| job_cpu_load | The average CPU utilization of all TaskManagers of the job. | 23.85% |
| taskmanager_status_jvm_memory_heap_used_percentage | The average heap memory utilization of all TaskManagers of the job. | 57.12% |
| taskmanager_status_jvm_memory_heap_used | The total heap memory used by all TaskManagers of the job. | 830897056.00 Bytes |
| taskmanager_memory_heap_committed | The total heap memory committed by all TaskManagers of the job. | 4937220096.00 Bytes |
| taskmanager_memory_heap_max | The total max heap memory of all TaskManagers of the job. | 4937220096.00 Bytes |
| taskmanager_status_jvm_memory_nonheap_used | The total non-heap memory (JVM metaspace and code cache) used by all TaskManagers of the job. | 296651064.00 Bytes |
| taskmanager_memory_nonheap_committed | The total non-heap memory (JVM metaspace and code cache) committed by all TaskManagers of the job. | 103219200.00 Bytes |
| taskmanager_status_jvm_memory_nonheap_max | The total max non-heap memory (JVM metaspace and code cache) of all TaskManagers of the job. | 780140544.00 Bytes |
| taskmanager_status_jvm_memory_process_memoryused | The max JVM memory used (RSS) of all TaskManagers of the job, including heap, non-heap, native, and other areas. This metric is used to give an early warning of OOM Killed events in a Pod. | 3597035110.00 Bytes |
| taskmanager_memory_direct_count | The total number of buffers in the direct buffer pools of all TaskManagers of the job. | 10993.00 Items |
| taskmanager_memory_direct_used | The total direct buffer pool memory used by all TaskManagers of the job. | 360328431.00 Bytes |
| taskmanager_memory_direct_max | The total max direct buffer pool memory of all TaskManagers of the job. | 360328431.00 Bytes |
| taskmanager_memory_mapped_count | The total number of buffers in the mapped buffer pools of all TaskManagers of the job. | 4 Items |
| taskmanager_memory_mapped_used | The total mapped buffer pool memory used by all TaskManagers of the job. | 33554432.00 Bytes |
| taskmanager_memory_mapped_max | The total max mapped buffer pool memory of all TaskManagers of the job. | 33554432.00 Bytes |
| jobmanager_jvm_old_gc_count | The old GC count of the JobManager of the job. | 3.00 Times |
| jobmanager_jvm_old_gc_time | The old GC time of the JobManager of the job. | 701.00 ms |
| jobmanager_jvm_young_gc_count | The young GC count of the JobManager of the job. | 53.00 Times |
| jobmanager_jvm_young_gc_time | The young GC time of the JobManager of the job. | 4094.00 ms |
| job_lastcheckpointduration | The time taken to make the last checkpoint of the job. | 723.00 ms |
| job_lastcheckpointsize | The size of the last checkpoint of the job. | 751321.00 Bytes |
| taskmanager_jvm_old_gc_count | The sum of old GC counts of all TaskManagers of the job. | 9.00 Times |
| taskmanager_jvm_old_gc_time | The sum of old GC time of all TaskManagers of the job. | 2014.00 ms |
| taskmanager_jvm_young_gc_count | The sum of young GC counts of all TaskManagers of the job. | 889.00 Times |
| taskmanager_jvm_young_gc_time | The sum of young GC time of all TaskManagers of the job. | 15051.00 ms |
| job_numberofcompletedcheckpoints | The number of successful checkpoints of the job. | 11.00 Times |
| job_numberoffailedcheckpoints | The number of failed checkpoints of the job. | 1.00 Time |
| job_numberofinprogresscheckpoints | The number of checkpoints in progress (not completed) of the job. | 1.00 Time |
| job_totalnumberofcheckpoints | The total number of checkpoints (in progress, completed, and failed) of the job. | 13.00 Times |
| job_numrecordsinbutfailed | The number of records that failed in the operator (for example, by raising exceptions). If the value is greater than 1, exactly-once semantics are affected. This is a testing parameter for reference only. | 0.00 Times |
| jobmanager_job_numrestarts | The number of job restarts due to crashes recorded by the JobManager (excluding restarts of the job after the JobManager exits). | 10.00 Times |
| jobmanager_status_jvm_memory_heap_used_percentage | The heap memory utilization of the JobManager of the job. | 31.34% |
| jobmanager_memory_heap_used | The heap memory used by the JobManager of the job. | 1040001560.00 Bytes |
| jobmanager_memory_heap_committed | The heap memory committed by the JobManager of the job. | 3318218752.00 Bytes |
| jobmanager_memory_heap_max | The max heap memory of the JobManager of the job. | 3318218752.00 Bytes |
| jobmanager_status_jvm_memory_nonheap_used | The non-heap memory (JVM metaspace and code cache) used by the JobManager of the job. | 117362656.00 Bytes |
| jobmanager_memory_nonheap_committed | The non-heap memory (JVM metaspace and code cache) committed by the JobManager of the job. | 122183680.00 Bytes |
| jobmanager_status_jvm_memory_nonheap_max | The max non-heap memory (JVM metaspace and code cache) of the JobManager of the job. | 780140544.00 Bytes |
| jobmanager_status_jvm_memory_used | The JVM memory used (RSS) by the JobManager of the job, including heap, non-heap, native, and other areas. This metric is used to give an early warning of OOM Killed events in a Pod. | 3597035110.00 Bytes |
| jobmanager_cpu_load | The CPU utilization of the JobManager of the job. | 7.12% |
| jobmanager_cpu_time | The CPU service time (ms) of the JobManager of the job. | 834490.00 ms |
| jobmanager_downtime | For a non-running (failed or recovering) job, the duration of the current downtime; for a running job, the value is 0. | 1088466.00 ms |
| job_uptime | For a running job, the duration for which the job has been running continuously without interruption. | 202305.00 ms |
| job_restartingtime | The time taken for the last restart of the job. | 197181.00 ms |
| jobmanager_lastcheckpointrestoretimestamp | The Unix timestamp (in ms) of the last job recovery from a checkpoint; the value is -1 if no recovery has been performed. | 1621934344137.00 ms |
| jobmanager_memory_mapped_count | The number of buffers in the mapped buffer pool of the JobManager of the job. | 4.00 Items |
| jobmanager_memory_mapped_memoryused | The mapped buffer pool memory used by the JobManager of the job. | 33554432.00 Bytes |
| jobmanager_memory_mapped_totalcapacity | The max mapped buffer pool memory of the JobManager of the job. | 33554432.00 Bytes |
| jobmanager_memory_direct_count | The number of buffers in the direct buffer pool of the JobManager of the job. | 22.00 Items |
| jobmanager_memory_direct_memoryused | The direct buffer pool memory used by the JobManager of the job. | 575767.00 Bytes |
| jobmanager_memory_direct_totalcapacity | The max direct buffer pool memory of the JobManager of the job. | 577814.00 Bytes |
| jobmanager_numregisteredtaskmanagers | The number of registered TaskManagers of the job, which is generally equal to the max operator parallelism. A decline in the number of TaskManagers indicates that some TaskManagers are disconnected, and the job may crash and try to recover. | 3.00 TaskManagers |
| jobmanager_numrunningjobs | The number of running jobs. A value of 1 means the job is running properly; 0 means the job has crashed. | 1.00 Job |
| jobmanager_taskslotsavailable | The number of available task slots. A value of 0 means the job is running properly; a non-zero value means the job may not be running for a short period of time. | 0.00 Slots |
| jobmanager_taskslotstotal | The total number of task slots. In Stream Compute Service, a TaskManager has only one task slot, so this value is equal to the number of registered TaskManagers. | 3.00 Slots |
| jobmanager_threads_count | The number of active threads in the JobManager of the job, including daemon and non-daemon threads. | 77.00 Threads |
| taskmanager_cpu_time | The CPU service time (ms) of all TaskManagers of the job. | 2029230.00 ms |
| taskmanager_network_availablememorysegments | The sum of memory segments available in all TaskManagers of the job. | 32890.00 Items |
| taskmanager_network_totalmemorysegments | The sum of total memory segments assigned to all TaskManagers of the job. | 32931.00 Items |
| taskmanager_threads_count | The total number of active threads in all TaskManagers of the job, including daemon and non-daemon threads. | 207.00 Threads |
| job_lastcheckpointsize | The size of the last checkpoint. | 1,024 Bytes |
| job_lastcheckpointduration | The time taken to make the last checkpoint. | 100 ms |
| job_numberoffailedcheckpoints | The number of failed checkpoints. | 50 Times |
| JM CPU Load | The JVM CPU utilization of the JobManager. | 12% |
| JM Heap Memory | The heap memory usage of the JobManager. | 50 Bytes |
| JM GC Count | Status.JVM.GarbageCollector.<GarbageCollector>.Count of the JobManager, representing the GC count of the JobManager. | 5 Times |
| JM GC Time | Status.JVM.GarbageCollector.<GarbageCollector>.Time of the JobManager, representing the GC time of the JobManager. | 64 ms |
| TaskManager CPU Load | The JVM CPU utilization of the selected TaskManager. | 70% |
| TaskManager Heap Memory | The heap memory usage of the selected TaskManager. | 50 Bytes |
| TaskManager GC Count | Status.JVM.GarbageCollector.<GarbageCollector>.Count of the selected TaskManager, representing the GC count of the TaskManager. | 5 Times |
| TaskManager GC Time | Status.JVM.GarbageCollector.<GarbageCollector>.Time of the selected TaskManager, representing the GC time of the TaskManager. | 5 ms |
| Task OutPoolUsage | The usage percentage of the task's output buffer pool. When this metric reaches 100%, the task is backpressured. | 64% |
| Task OutputQueueLength | The number of queued buffers in the task's output queues. | 6 |
| Task InPoolUsage | The usage percentage of the task's input buffer pool. When this metric reaches 100%, the task is backpressured. | 64% |
| Task InputQueueLength | The number of queued buffers in the task's input queues. | 6 |
| Task CurrentInputWatermark | The current input watermark of the task. | 1623814418 |
| Data import time (ETL) | The delay for a source in the job to take in the data. | 10 ms |
| job_records_in_per_second (ETL) | The total input rate of all sources in the job. | 342 Records/s |
| SourceIdleTime (ETL) | The interval between data batches processed by a source in the job, which indirectly reflects the idle time of the source. | 24532223 ms |
| SynDelay (ETL) | The delay for a source in the job to take in and process the data. | 1345 ms |
| BinLogPos (ETL) | The MySQL binary log coordinates or PostgreSQL log sequence number (LSN) of the job. | 260690147 |
| job latency (ETL) | The average delay between the source and sink operators of the job. | 49 ms |
| DbFlushDelay (ETL) | The sum of the database flush delay and the async callback time of the job. | 30 ms |
| job_records_out_per_second (ETL) | The total output rate of all sinks in the job. | 234 Records/s |
| Source - full sync (ETL) | The full data sync progress of the job. | 30% |
| Source - incremental sync (ETL) | For MySQL, the sync delay is the gap between the binlog coordinates of the current source and the latest binlog coordinates of the source MySQL instance collected in the last sampling; for PostgreSQL, it is the gap between the LSN of the current source and the latest LSN of the source PostgreSQL instance collected in the last sampling. | 205 |
| Kafka - records_lag max | The maximum of kafka-lag-max (the difference between Kafka producer and consumer offsets) reported by the TaskManagers. | 100 |
| Kafka - records_lag min | The minimum of kafka-lag-max (the difference between Kafka producer and consumer offsets) reported by the TaskManagers. | 50 |
| Kafka - records_lag mean | The mean of kafka-lag-max (the difference between Kafka producer and consumer offsets) reported by the TaskManagers. | 80 |
| Kafka - records_lag sum | The sum of kafka-lag-max (the difference between Kafka producer and consumer offsets) reported by the TaskManagers. | 500 |
| CurrentFetchEventtimeLag (ms) | Formula: FetchTime (the time the source fetches the data) − EventTime (the event time of the data). This metric reflects the retention of data in the external system. See the worked example after this table. | 10 |
| CurrentEmitEventtimeLag (ms) | Formula: EmitTime (the time the data leaves the source) − EventTime (the event time of the data). This metric reflects the retention of data between the external system and the source. See the worked example after this table. | 20 |
| taskmanager_job_task_backpressuredtimemspersecond (%) | The maximum of the backpressure percentages of all subtasks in the job. | 30% |
| taskmanager_job_task_dataskewcoefficient | The coefficient of variation (standard deviation/mean) of the subtask inputs of the job. A value less than 10% indicates a weak skew. See the sketch after this table. | 10% |
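
For the two event-time lag formulas above, a tiny worked example with made-up timestamps (all values hypothetical) shows how the subtraction is done:

```python
# Hypothetical epoch-millisecond timestamps illustrating the two formulas above.
event_time = 1_623_814_418_000          # when the event occurred (EventTime)
fetch_time = event_time + 10            # when the source fetched it (FetchTime)
emit_time = event_time + 20             # when the record left the source (EmitTime)

current_fetch_eventtime_lag = fetch_time - event_time  # 10 ms of retention in the external system
current_emit_eventtime_lag = emit_time - event_time    # 20 ms until the record leaves the source

print(current_fetch_eventtime_lag, current_emit_eventtime_lag)  # 10 20
```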
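The data skew coefficient is simply the coefficient of variation of the per-subtask input volumes. A minimal sketch with made-up per-subtask record counts:

```python
# Coefficient of variation (= standard deviation / mean) of subtask inputs, as a percentage,
# matching the description of taskmanager_job_task_dataskewcoefficient above.
from statistics import mean, pstdev

subtask_records_in = [10_250, 9_980, 10_104, 10_310]  # made-up per-subtask input counts

def skew_coefficient(counts):
    avg = mean(counts)
    return (pstdev(counts) / avg * 100) if avg else 0.0

print(f"{skew_coefficient(subtask_records_in):.2f}%")  # ~1.27%, a weak skew (< 10%)
```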
    