Kudu Monitoring Metrics

Last updated: 2022-05-16 12:45:12

    Kudu - overview

    Title Metric Unit Description
    Tablets TabletRunning - Total number of tablets currently running on all tablet servers
    Difference in the number of tablet replicas ClusterReplicaSkew - Difference between the number of replicas on the tablet server hosting the most replicas and the number of replicas on the tablet server hosting the fewest replicas
    TServer threads ThreadsRunning - Number of threads currently running on all tablet servers
    Master threads ThreadsRunning - Number of threads currently running on all masters
    TServer logs ErrorMessages - Number of ERROR-level log messages emitted in all processes
    Master logs ErrorMessages - Number of ERROR-level log messages emitted in all processes
    WarningMessages - Number of WARNING-level log messages emitted in all processes
    Oversized write requests OversizedWriteRequests - Number of oversized write requests to the system catalog tablet rejected by the master since start
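
The ClusterReplicaSkew description above amounts to a maximum-minus-minimum calculation over per-server replica counts. A minimal sketch of that arithmetic follows; the function name and the sample server names and counts are hypothetical and do not come from any Kudu or Tencent Cloud API.

```python
from typing import Dict


def replica_skew(replicas_per_tserver: Dict[str, int]) -> int:
    """Return the replica count of the fullest tablet server minus that of the emptiest."""
    if not replicas_per_tserver:
        # No tablet servers reported: treat the skew as zero (an assumption of this sketch).
        return 0
    counts = list(replicas_per_tserver.values())
    return max(counts) - min(counts)


# Hypothetical per-server replica counts, for illustration only.
sample_counts = {"tserver-1": 120, "tserver-2": 95, "tserver-3": 101}
skew = replica_skew(sample_counts)
```

A skew that keeps growing suggests replicas are unevenly distributed across tablet servers.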

    Kudu - server

    Title Metric Unit Description
    Block cache hit BlockCacheHit - Number of block cache hits. To gauge cache efficiency, use this metric rather than cache_hits
    BlockCacheMiss - Number of block cache misses. To gauge cache efficiency, use this metric rather than cache_misses
    Block cache utilization BlockCacheUsage bytes Memory used by the block cache
    File cache hit FileCacheHit - Number of file descriptor cache hits. To gauge cache efficiency, use this metric rather than cache_hits
    FileCacheMiss - Number of file descriptor cache misses. To gauge cache efficiency, use this metric rather than cache_misses
    File cache utilization FileCacheUsage - Number of entries in the file cache
    Scanner ActiveScanners - Number of currently active scanners
    ExpiredScanners - Number of scanners that have expired due to inactivity since service start
    Block manager blocks BlockUnderManagement - Number of currently managed data blocks
    BlockOpenReading - Number of data blocks currently opened for read
    BlockOpenWriting - Number of data blocks currently opened for write
    Block manager bytes BytesUnderManagement bytes Number of bytes of currently managed data blocks
    Block manager containers ContainersUnderManagement - Number of log block containers
    FullContainersUnderManagement - Number of full log block containers
    Tablet leaders NumRaftLeaders - Number of tablet replicas that are Raft leaders
    Tablet sessions OpenClientSessions - Number of currently opened tablet copy client sessions on this server
    OpemSourceSessions - Number of currently opened tablet copy source sessions on this server
    Tablets TabletBootstrapping - Number of currently bootstrapping tablets
    TabletFailed - Number of failed tablets
    TabletInitialized - Number of currently initialized tablets
    TabletNotInitialized - Number of currently uninitialized tablets
    TabletRunning - Number of currently running tablets
    TabletShutdown - Number of currently shut down tablets
    TabletStopped - Number of currently stopped tablets
    TabletStopping - Number of currently stopping tablets
    CPU time CpuStime ms Total system CPU time of process
    CpuUtime ms Total user CPU time of process
    Data path DataDirsFailed - Number of data directories whose disks are currently in failed status
    DataDirsFull - Number of data directories whose disks are currently full
    Thread ThreadsRunning - Number of currently running threads
    Context InvoluntarySwitches - Total involuntary context switches
    VoluntarySwitches - Total voluntary context switches
    Spinlock SpinlockContentionTime μs Amount of time consumed by contention on internal spinlocks since server start
    Log information ErrorMessages - Number of ERROR-level log messages emitted by the application
    WarningMessages - Number of WARNING-level log messages emitted by the application
    Operations in queue TotalCount - Total number
    Min - Minimum number of tasks waiting in the queue
    Max - Maximum number of tasks waiting in the queue
    Mean - Average number of tasks waiting in the queue
    Percentile_99_9 - 99.9th percentile of the number of tasks waiting in the queue
    Operation execution duration TotalCount count Total number of operations
    Min μs Minimum run time
    Max μs Maximum run time
    Mean μs Average run time
    Percentile_99_9 μs 99.9th percentile of the run time
    Queuing wait time TotalCount count Total number of operations
    Min μs Minimum wait time
    Max μs Maximum wait time
    Mean μs Average wait time
    Percentile_99_9 μs 99.9th percentile of the wait time
    Allocated bytes AllocatedBytes bytes Number of bytes used by the application. This usually does not match the memory usage reported by the operating system because it excludes TCMalloc overhead and memory fragmentation
    Hybrid clock error HybridClockError μs Server clock maximum error; returns 2^64-1 when unable to read the base clock
    Hybrid clock timestamp HybridClockTimestamp μs Hybrid clock timestamp; returns 2^64-1 when unable to read the base clock
    TCMalloc memory HeapSize bytes Bytes of system memory reserved by TCMalloc
    CurrentThreadCacheBytes bytes A measure of some of the memory TCMalloc is using (for small objects)
    TotalThreadCacheBytes bytes A limit to how much memory TCMalloc dedicates for small objects
    TCMalloc PageHeap FreeBytes bytes Number of bytes of free mapped pages in the page heap
    UnMappedBytes bytes Number of bytes of free unmapped pages in the page heap
    RPC request ConnectionsAccepted - Number of incoming TCP connections made to the RPC server
    QueueOverflow - Number of RPCs dropped because the service queue was full
    TimesOutInQueue - Number of RPCs that timed out while waiting in the service queue and thus were not processed
    RPC FetchData TotalCount count Total number of operations
    Min μs Minimum processing time
    Max μs Maximum processing time
    Mean μs Average processing time
    Percentile_99_9 μs 99.9th percentile of the processing time
    RPC AlterSchema TotalCount count Total number of operations
    Min μs Minimum processing time
    Max μs Maximum processing time
    Mean μs Average processing time
    Percentile_99_9 μs 99.9th percentile of the processing time
    RPC CreateTablet TotalCount count Total number of operations
    Min μs Minimum processing time
    Max μs Maximum processing time
    Mean μs Average processing time
    Percentile_99_9 μs 99.9th percentile of the processing time
    RPC DeleteTablet TotalCount count Total number of operations
    Min μs Minimum processing time
    Max μs Maximum processing time
    Mean μs Average processing time
    Percentile_99_9 μs 99.9th percentile of the processing time
    RPC Quiesce TotalCount count Total number of operations
    Min μs Minimum processing time
    Max μs Maximum processing time
    Mean μs Average processing time
    Percentile_99_9 μs 99.9th percentile of the processing time
    RPC scan TotalCount count Total number of operations
    Min μs Minimum processing time
    Max μs Maximum processing time
    Mean μs Average processing time
    Percentile_99_9 μs 99.9th percentile of the processing time
    RPC ScannerKeepAlive TotalCount count Total number of operations
    Min μs Minimum processing time
    Max μs Maximum processing time
    Mean μs Average processing time
    Percentile_99_9 μs 99.9th percentile of the processing time
    RPC write TotalCount count Total number of operations
    Min μs Minimum processing time
    Max μs Maximum processing time
    Mean μs Average processing time
    Percentile_99_9 μs 99.9th percentile of the processing time
    Write requests rejected due to queue overloading QueueOverloadRejections count Number of write requests rejected due to queue overloading
    Scan rate ScannedFromDiskRate bytes/s Amount of data scanned per second
    ScannerReturnedRate bytes/s Amount of data returned per second
    Scanner bytes ScannedFromDisk bytes Total amount of data scanned from disk
    ScannerReturned bytes Total amount of returned data
    Total row operations RowsInserted count Number of rows inserted into the node
    RowsDeleted count Number of rows deleted from the node
    RowsUpserted count Number of rows upserted into the node
    RowsUpdated count Number of rows updated on the node
    Row operation rate RowsInsertedRate count/s Number of rows inserted into the node per second
    RowsDeletedRate count/s Number of rows deleted from the node per second
    RowsUpsertedRate count/s Number of rows upserted into the node per second
    RowsUpdatedRate count/s Number of rows updated on the node per second
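
Two derived values are commonly computed from the counters in this table: a cache hit ratio from a hit/miss pair (for example BlockCacheHit and BlockCacheMiss), and a per-second rate from two samples of a cumulative counter (for example RowsInserted). The sketch below shows both calculations. It assumes the hit and miss values cover the same sampling window and that the rate metrics correspond to the change in the matching counter per second, neither of which this page states explicitly. The function names and sample numbers are hypothetical.

```python
from typing import Optional


def hit_ratio(hits: int, misses: int) -> Optional[float]:
    """Fraction of lookups served from the cache, or None when there were no lookups."""
    total = hits + misses
    if total == 0:
        return None
    return hits / total


def per_second_rate(prev_value: float, prev_ts: float,
                    curr_value: float, curr_ts: float) -> Optional[float]:
    """Average increase per second of a cumulative counter between two samples.

    Returns None when the samples are out of order or the counter went backwards,
    which typically indicates a process restart.
    """
    elapsed = curr_ts - prev_ts
    if elapsed <= 0 or curr_value < prev_value:
        return None
    return (curr_value - prev_value) / elapsed


# Hypothetical samples: 90 hits and 10 misses; 600 rows inserted over 60 seconds.
block_cache_ratio = hit_ratio(90, 10)
insert_rate = per_second_rate(1000, 0.0, 1600, 60.0)
```

Returning None rather than 0 in the degenerate cases keeps "no data" distinguishable from a measured value of zero.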

    Kudu - master

    Title Metric Unit Description
    Block cache hit BlockCacheHit - Number of block cache hits. To gauge cache efficiency, use this metric rather than cache_hits
    BlockCacheMiss - Number of block cache misses. To gauge cache efficiency, use this metric rather than cache_misses
    Block cache utilization BlockCacheUsage bytes Memory used by the block cache
    File cache hit FileCacheHit - Number of file descriptor cache hits. To gauge cache efficiency, use this metric rather than cache_hits
    FileCacheMiss - Number of file descriptor cache misses. To gauge cache efficiency, use this metric rather than cache_misses
    File cache utilization FileCacheUsage - Number of entries in the file cache
    Block manager blocks BlockUnderManagement - Number of currently managed data blocks
    BlockOpenReading - Number of data blocks currently opened for read
    BlockOpenWriting - Number of data blocks currently opened for write
    Block manager bytes BytesUnderManagement bytes Number of bytes of currently managed data blocks
    Block manager containers ContainersUnderManagement - Number of log block containers
    FullContainersUnderManagement - Number of full log block containers
    CPU time CpuStime ms Total system CPU time of process
    CpuUtime ms Total user CPU time of process
    Thread ThreadsRunning - Number of currently running threads
    Data path DataDirsFailed - Number of data directories whose disks are currently in failed status
    DataDirsFull - Number of data directories whose disks are currently full
    Allocated bytes AllocatedBytes bytes Number of bytes used by the application. This usually does not match the memory usage reported by the operating system because it excludes TCMalloc overhead and memory fragmentation
    Log information ErrorMessages - Number of ERROR-level log messages emitted by the application
    WarningMessages - Number of WARNING-level log messages emitted by the application
    Context InvoluntarySwitches - Total involuntary context switches
    VoluntarySwitches - Total voluntary context switches
    Operations in queue TotalCount - Total number
    Min - Minimum number of tasks waiting in the queue
    Max - Maximum number of tasks waiting in the queue
    Mean - Average number of tasks waiting in the queue
    Percentile_99_9 - 99.9th percentile of the number of tasks waiting in the queue
    Queuing wait time TotalCount count Total number of operations
    Min μs Minimum wait time
    Max μs Maximum wait time
    Mean μs Average wait time
    Percentile_99_9 μs 99.9th percentile of the wait time
    Operation execution duration TotalCount count Total number of operations
    Min μs Minimum run time
    Max μs Maximum run time
    Mean μs Average run time
    Percentile_99_9 μs 99.9th percentile of the run time
    Spinlock SpinlockContentionTime μs Amount of time consumed by contention on internal spinlocks since server start
    Oversized write requests OversizedWriteRequests - Number of oversized write requests to the system catalog tablet rejected since start
    Hybrid clock error HybridClockError μs Server clock maximum error; returns 2^64-1 when unable to read the base clock
    Hybrid clock timestamp HybridClockTimestamp μs Hybrid clock timestamp; returns 2^64-1 when unable to read the base clock
    Difference in the number of tablet replicas ClusterReplicaSkew - Difference between the number of replicas on the tablet server hosting the most replicas and the number of replicas on the tablet server hosting the fewest replicas
    Tablet leaders NumRaftLeaders - Number of tablet replicas that are Raft leaders
    Tablet sessions OpemSourceSessions - Number of currently opened tablet copy source sessions on this server
    TCMalloc memory HeapSize bytes Bytes of system memory reserved by TCMalloc
    CurrentThreadCacheBytes bytes A measure of some of the memory TCMalloc is using (for small objects)
    TotalThreadCacheBytes bytes A limit to how much memory TCMalloc dedicates for small objects
    TCMalloc page heap FreeBytes bytes Number of bytes of free mapped pages in the page heap
    UnMappedBytes bytes Number of bytes of free unmapped pages in the page heap
    RPC request ConnectionsAccepted - Number of incoming TCP connections made to the RPC server
    QueueOverflow - Number of RPCs dropped because the service queue was full
    TimesOutInQueue - Number of RPCs that timed out while waiting in the service queue and thus were not processed
    RPC RunLeaderElection TotalCount count Total number of operations
    Min μs Minimum processing time
    Max μs Maximum processing time
    Mean μs Average processing time
    Percentile_99_9 μs 99.9th percentile of the processing time
    RPC ConnectToMaster TotalCount count Total number of operations
    Min μs Minimum processing time
    Max μs Maximum processing time
    Mean μs Average processing time
    Percentile_99_9 μs 99.9th percentile of the processing time
    RPC Ping TotalCount count Total number of operations
    Min μs Minimum processing time
    Max μs Maximum processing time
    Mean μs Average processing time
    Percentile_99_9 μs 99.9th percentile of the processing time
    RPC TSHeartbeat TotalCount count Total number of operations
    Min μs Minimum processing time
    Max μs Maximum processing time
    Mean μs Average processing time
    Percentile_99_9 μs 99.9th percentile of the processing time
    RPC FetchData TotalCount count Total number of operations
    Min μs Minimum processing time
    Max μs Maximum processing time
    Mean μs Average processing time
    Percentile_99_9 μs 99.9th percentile of the processing time
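
Because HybridClockError and HybridClockTimestamp return 2^64-1 when the base clock cannot be read, that value is best treated as a sentinel rather than a real measurement when alerting or charting. The sketch below illustrates one way to do that. The function name, the status strings, and the 10-second default threshold are hypothetical choices for illustration and are not defined by this page.

```python
# Sentinel documented above: reported when the base clock cannot be read.
CLOCK_UNREADABLE = 2**64 - 1


def clock_error_status(hybrid_clock_error_us: int,
                       max_error_us: int = 10_000_000) -> str:
    """Classify a HybridClockError sample given in microseconds.

    max_error_us is a hypothetical alert threshold chosen for this sketch.
    """
    if hybrid_clock_error_us == CLOCK_UNREADABLE:
        # Not a real error bound: the server could not read its base clock.
        return "unreadable"
    if hybrid_clock_error_us > max_error_us:
        return "exceeds-threshold"
    return "ok"


status = clock_error_status(CLOCK_UNREADABLE)
```

Checking for the sentinel first prevents an unreadable clock from being plotted as an enormous clock error.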