Monitoring at Five-Second Granularity

Last updated: 2021-06-25 14:48:22

    TencentDB for Redis provides a complete, easy-to-use monitoring service, so you don't have to collect monitoring data or operate a monitoring system yourself. The service covers Proxy monitoring, Redis monitoring, and instance monitoring, which summarizes the monitoring data of an entire instance.

    • Proxy monitoring: provides monitoring information of all Proxy nodes in an instance. TencentDB for Redis instances in standard or cluster architecture have Proxy nodes.
    • Redis monitoring: provides monitoring information of master and replica Redis nodes.
    • Instance monitoring: summarizes the monitoring data of an entire instance (including Proxy nodes and Redis nodes) and aggregates the data using the SUM, AVG, MAX, and LAST aggregation algorithms.
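    As a rough illustration of the four aggregation algorithms named above, the sketch below reduces a list of samples to a single value. The function name `aggregate` and the sample values are hypothetical, and this page does not state which algorithm the service applies to which metric.

```python
def aggregate(samples, method):
    """Reduce a non-empty list of samples with SUM, AVG, MAX, or LAST."""
    if not samples:
        raise ValueError("samples must not be empty")
    if method == "SUM":
        return sum(samples)
    if method == "AVG":
        return sum(samples) / len(samples)
    if method == "MAX":
        return max(samples)
    if method == "LAST":
        # LAST keeps the most recent sample; samples are assumed to be in time order.
        return samples[-1]
    raise ValueError(f"unknown aggregation method: {method}")


# Hypothetical samples for one metric, oldest first.
samples = [1200, 800, 1000]
for method in ("SUM", "AVG", "MAX", "LAST"):
    print(method, aggregate(samples, method))
```

    A counter-like metric such as total requests would plausibly be summed, while a utilization metric would more likely use AVG or MAX; that mapping is an assumption, not something this page specifies.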

      Description of the Five-second Monitoring Granularity

    • By default, new instances (excluding CKV instances) support five-second granularity.
    • In the future, you will be able to change the monitoring granularity of existing instances from one minute to five seconds in the TencentDB for Redis console. Watch for notices and pop-up notifications in the console.
    • In the Cloud Monitor console, alarm policies for five-second granularity are of a different policy type from those for one-minute granularity. You can replicate the alarm policies for one-minute granularity and change their policy types to Memory Edition (5-second granularity), so that these alarm policies can be associated with new instances supporting five-second granularity.

      Viewing Instance Monitoring Granularity

    • Log in to the TencentDB for Redis console, click an instance ID to enter the instance management page, select System Monitoring > Monitoring Metrics, and click the Period drop-down list at the top. If 5 seconds is available in the list, the instance supports five-second granularity; otherwise, it supports only one-minute granularity.
    • Check the value of the InstanceSet.MonitorVersion field returned by the DescribeInstances API. If the value is 5s, this instance supports the monitoring granularity of five seconds; if the value is 1m, it supports only the monitoring granularity of one minute.
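    The API check can be sketched as follows, working on an already parsed DescribeInstances response rather than calling the API. The `InstanceSet` and `MonitorVersion` fields come from the description above; the outer `Response` wrapper, the `InstanceId` field name, and the instance IDs are assumptions made for illustration.

```python
def supports_five_second_granularity(response, instance_id):
    """Return True if the given instance reports MonitorVersion "5s"."""
    # Assumption: the parsed JSON body wraps the result in a "Response" object.
    instance_set = response.get("Response", {}).get("InstanceSet", [])
    for instance in instance_set:
        # Assumption: each entry identifies its instance with an "InstanceId" field.
        if instance.get("InstanceId") == instance_id:
            return instance.get("MonitorVersion") == "5s"
    raise KeyError(f"instance {instance_id} not found in response")


# Hypothetical response body with made-up instance IDs.
sample_response = {
    "Response": {
        "InstanceSet": [
            {"InstanceId": "crs-example1", "MonitorVersion": "5s"},
            {"InstanceId": "crs-example2", "MonitorVersion": "1m"},
        ]
    }
}

print(supports_five_second_granularity(sample_response, "crs-example1"))
print(supports_five_second_granularity(sample_response, "crs-example2"))
```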

    Monitoring Granularity and Monitoring Data Retention Period

    TencentDB for Redis currently supports monitoring metrics at five-second, one-minute, five-minute, one-hour, and one-day granularity. For the retention period of monitoring data at each granularity, please see Use Limits.

    Viewing Monitoring Information

    You can view TencentDB for Redis monitoring information in the instance list and on the instance monitoring page in the TencentDB for Redis console, or in the Cloud Monitor console.

    • Instance list: log in to the TencentDB for Redis console, click the View Monitoring icon in the instance list, and view monitoring metrics in the pop-up window on the right.
    • Instance monitoring page: log in to the TencentDB for Redis console, click an instance ID in the instance list and enter the instance management page, select System Monitoring, and view monitoring data on the Monitoring Metrics tab.
    • Cloud Monitor console: log in to the Cloud Monitor console to view the summary of monitoring data.

    Monitoring Metric Description

    Proxy node monitoring

    Each TencentDB for Redis instance contains at least 3 Proxy nodes. Generally, the number of Proxy nodes is 1.5 times that of Redis nodes. The Proxy node supports the following monitoring metrics:

    | Category | Metric | Parameter | Unit | Description |
    | --- | --- | --- | --- | --- |
    | CPU | CPU utilization | cpu_util | % | Proxy CPU utilization |
    | Request | Total requests | proxy_commands | requests/second | The number of proxy command executions per second |
    | | Key requests | cmd_key_count | keys/second | The number of keys accessed by commands per second |
    | | Mget requests | cmd_mget | requests/second | The number of Mget command executions per second |
    | | Execution errors | cmd_err | errors/second | The number of Proxy command execution errors per second, for example, because a command does not exist or its parameters are incorrect |
    | | Big value requests | cmd_big_value | requests/second | The number of requests larger than 32 KB executed per second |
    | Network | Connections | connections | - | The number of TCP connections to an instance |
    | | Connection utilization | connections_util | % | The ratio of the number of TCP connections to the maximum number of connections |
    | | Inbound traffic | in_flow | MB/s | Private inbound traffic |
    | | Inbound traffic utilization | in_bandwidth_util | % | The ratio of the actually used private inbound traffic to the maximum traffic |
    | | Inbound traffic limit count | in_flow_limit | - | The number of times inbound traffic triggers a traffic limit |
    | | Outbound traffic | out_flow | MB/s | Private outbound traffic |
    | | Outbound traffic utilization | out_bandwidth_util | % | The ratio of the actually used private outbound traffic to the maximum traffic |
    | | Outbound traffic limit count | out_flow_limit | - | The number of times outbound traffic triggers a traffic limit |
    | Latency | Average execution latency | latency_avg | ms | The average execution latency between the proxy and the Redis server |
    | | Maximum execution latency | latency_max | ms | The maximum execution latency between the proxy and the Redis server |
    | | Average read latency | latency_read | ms | The average execution latency of read commands between the proxy and the Redis server. For more information about read command types, please see Command types. |
    | | Average write latency | latency_write | ms | The average execution latency of write commands between the proxy and the Redis server. For more information about write command types, please see Command types. |
    | | Average latency of other commands | latency_other | ms | The average execution latency of commands (excluding write and read commands) between the proxy and the Redis server |

    Redis node monitoring

    The Redis node monitoring includes monitoring information of all master nodes and replica nodes in an instance or a cluster. The following monitoring metrics are supported:

    | Category | Metric | Parameter | Unit | Description |
    | --- | --- | --- | --- | --- |
    | CPU | CPU utilization | cpu_util | % | Average CPU utilization |
    | Network | Connections | connections | - | The number of connections between the proxy and a node |
    | | Connection utilization | connections_util | % | The connection utilization of a node |
    | Memory | Used memory | mem_used | MB | Actually used memory capacity, including the capacity for data and cache |
    | | Memory utilization | mem_util | % | The ratio of the actually used memory to the requested total memory |
    | | Total keys | keys | - | The total number of keys (level-1 keys) in instance storage |
    | | Expired keys | expired | - | The number of keys expired in a time window, which is equal to the value of `expired_keys` output by the `info` command |
    | | Evicted keys | evicted | - | The number of keys evicted in a time window, which is equal to the value of `evicted_keys` output by the `info` command |
    | | Replication delay | repl_delay | Byte | The command delay between the replica node and the master node |
    | Request | Total requests | commands | requests/second | QPS, that is, the number of command executions per second |
    | | Read requests | cmd_read | requests/second | The number of read command executions per second. For more information about read command types, please see Command types. |
    | | Write requests | cmd_write | requests/second | The number of write command executions per second. For more information about write command types, please see Command types. |
    | | Other requests | cmd_other | requests/second | The number of command (excluding write and read commands) executions per second |
    | Response | Slow queries | cmd_slow | - | The number of command executions with a latency greater than the `slowlog-log-slower-than` configuration |
    | | Read request hits | cmd_hits | - | The number of keys successfully requested by read commands, which is equal to the value of the `keyspace_hits` metric output by the `info` command |
    | | Read request misses | cmd_miss | - | The number of keys unsuccessfully requested by read commands, which is equal to the value of the `keyspace_misses` metric output by the `info` command |
    | | Read request hit rate | cmd_hits_ratio | % | Key hits/(Key hits + Key misses). A lower rate means more cache misses. |
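    The read request hit rate formula, Key hits/(Key hits + Key misses), can be worked through with a short sketch. The function name `hit_rate` is hypothetical, and returning `None` for a window with no reads is an assumption; this page does not say what the service reports in that case.

```python
def hit_rate(hits, misses):
    """Return the read request hit rate as a percentage, or None if there were no reads."""
    total = hits + misses
    if total == 0:
        # Assumption: with no reads in the window, the rate is treated as undefined.
        return None
    return 100.0 * hits / total


# Hypothetical values of cmd_hits and cmd_miss for one time window.
print(hit_rate(900, 100))
print(hit_rate(0, 0))
```

    With 900 hits and 100 misses the rate is 90%, so one read in ten missed the cache in that window.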

    Instance monitoring

    The instance monitoring includes all monitoring data of an instance, including the monitoring data of Proxy nodes and Redis nodes, which is aggregated by the SUM, AVG, MAX, and LAST algorithms.

    | Category | Metric | Associated Node View | Parameter | Unit | Description |
    | --- | --- | --- | --- | --- | --- |
    | CPU | CPU utilization | Redis node | cpu_util | % | Average CPU utilization |
    | | Maximum node CPU utilization | Redis node | cpu_max_util | % | The maximum among all node (shard or replica) CPU utilizations in an instance |
    | Memory | Used memory | Redis node | mem_used | MB | Actually used memory capacity, including the capacity for data and cache |
    | | Memory utilization | Redis node | mem_util | % | The ratio of the actually used memory to the requested total memory |
    | | Maximum node memory utilization | Redis node | mem_max_util | % | The maximum among all node (shard or replica) memory utilizations in an instance |
    | | Total keys | Redis node | keys | - | The total number of keys (level-1 keys) in instance storage |
    | | Expired keys | Redis node | expired | - | The number of keys expired in a time window, which is equal to the value of `expired_keys` output by the `info` command |
    | | Evicted keys | Redis node | evicted | - | The number of keys evicted in a time window, which is equal to the value of `evicted_keys` output by the `info` command |
    | Network | Connections | Proxy node | connections | - | The number of TCP connections to an instance |
    | | Connection utilization | Proxy node | connections_util | % | The ratio of the number of TCP connections to the maximum number of connections |
    | | Inbound traffic | Proxy node | in_flow | MB/s | Private inbound traffic |
    | | Inbound traffic utilization | Proxy node | in_bandwidth_util | % | The ratio of the actually used private inbound traffic to the maximum traffic |
    | | Inbound traffic limit count | Proxy node | in_flow_limit | - | The number of times inbound traffic triggers a traffic limit |
    | | Outbound traffic | Proxy node | out_flow | MB/s | Private outbound traffic |
    | | Outbound traffic utilization | Proxy node | out_bandwidth_util | % | The ratio of the actually used private outbound traffic to the maximum traffic |
    | | Outbound traffic limit count | Proxy node | out_flow_limit | - | The number of times outbound traffic triggers a traffic limit |
    | | Average execution latency | Proxy node | latency_avg | ms | The average execution latency between the proxy and the Redis server |
    | | Maximum execution latency | Proxy node | latency_max | ms | The maximum execution latency between the proxy and the Redis server |
    | | Average read latency | Proxy node | latency_read | ms | The average execution latency of read commands between the proxy and the Redis server. For more information about read command types, please see Command types. |
    | | Average write latency | Proxy node | latency_write | ms | The average execution latency of write commands between the proxy and the Redis server. For more information about write command types, please see Command types. |
    | | Average latency of other commands | Proxy node | latency_other | ms | The average execution latency of commands (excluding write and read commands) between the proxy and the Redis server |
    | Request | Total requests | Redis node | commands | requests/second | QPS, that is, the number of command executions per second |
    | | Read requests | Redis node | cmd_read | requests/second | The number of read command executions per second. For more information about read command types, please see Command types. |
    | | Write requests | Redis node | cmd_write | requests/second | The number of write command executions per second. For more information about write command types, please see Command types. |
    | | Other requests | Redis node | cmd_other | requests/second | The number of command (excluding write and read commands) executions per second |
    | | Big value requests | Proxy node | cmd_big_value | requests/second | The number of requests larger than 32 KB executed per second |
    | | Key requests | Proxy node | cmd_key_count | keys/second | The number of keys accessed by commands per second |
    | | Mget requests | Proxy node | cmd_mget | requests/second | The number of Mget command executions per second |
    | | Slow queries | Redis node | cmd_slow | - | The number of command executions with a latency greater than the `slowlog-log-slower-than` configuration |
    | | Read request hits | Redis node | cmd_hits | - | The number of keys successfully requested by read commands, which is equal to the value of the `keyspace_hits` metric output by the `info` command |
    | | Read request misses | Redis node | cmd_miss | - | The number of keys unsuccessfully requested by read commands, which is equal to the value of the `keyspace_misses` metric output by the `info` command |
    | | Execution errors | Proxy node | cmd_err | - | The number of command execution errors, for example, because a command does not exist or its parameters are incorrect |
    | | Read request hit rate | Redis node | cmd_hits_ratio | % | Key hits/(Key hits + Key misses). A lower rate means more cache misses. |

    Command types

    | Type | Commands |
    | --- | --- |
    | Read command | get, strlen, exists, getbit, getrange, substr, mget, llen, lindex, lrange, sismember, scard, srandmember, sinter, sunion, sdiff, smembers, sscan, zrange, zrangebyscore, zrevrangebyscore, zrangebylex, zrevrangebylex, zcount, zlexcount, zrevrange, zcard, zscore, zrank, zrevrank, zscan, hget, hmget, hlen, hstrlen, hkeys, hvals, hgetall, hexists, hscan, randomkey, keys, scan, dbsize, type, ttl, touch, pttl, dump, object, memory, bitcount, bitpos, georadius_ro, georadiusbymember_ro, geohash, geopos, geodist, pfcount |
    | Write command | set, setnx, setex, psetex, append, del, unlink, setbit, bitfield, setrange, incr, decr, rpush, lpush, rpushx, lpushx, linsert, rpop, lpop, brpop, brpoplpush, blpop, lset, ltrim, lrem, rpoplpush, sadd, srem, smove, spop, sinterstore, sunionstore, sdiffstore, zadd, zincrby, zrem, zremrangebyscore, zremrangebyrank, zremrangebylex, zunionstore, zinterstore, hset, hsetnx, hmset, hincrby, hincrbyfloat, hdel, incrby, decrby, incrbyfloat, getset, mset, msetnx, swapdb, move, rename, renamenx, expire, expireat, pexpire, pexpireat, flushdb, flushall, sort, persist, restore, restore-asking, migrate, bitop, geoadd, georadius, georadiusbymember, pfadd, pfmerge, pfdebug |
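    To show how the command types relate to the read, write, and other metrics, here is a minimal sketch that classifies a command by name. The function name `command_type` is hypothetical, and the two sets are deliberately abbreviated subsets of the lists above, so a complete classifier would need every command from those lists.

```python
# Abbreviated subsets of the documented read and write command lists.
READ_COMMANDS = {"get", "mget", "hget", "lrange", "smembers", "zrange", "ttl", "scan"}
WRITE_COMMANDS = {"set", "del", "incr", "lpush", "sadd", "zadd", "hset", "expire"}


def command_type(command_line):
    """Classify a command line as "read", "write", or "other" by its command name."""
    parts = command_line.split()
    if not parts:
        raise ValueError("empty command")
    name = parts[0].lower()  # command names are matched case-insensitively here
    if name in READ_COMMANDS:
        return "read"
    if name in WRITE_COMMANDS:
        return "write"
    # Anything in neither list counts toward the "other" metrics.
    return "other"


for line in ("GET user:1", "set user:1 alice", "PING"):
    print(line, "->", command_type(line))
```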

    Querying Node Information

    Use the DescribeInstanceNodeInfo API to get the IDs of Proxy nodes and Redis nodes.

    Note:

    The IDs of Proxy and Redis nodes change when events such as node failover, instance capacity expansion or reduction, or data migration occur. We therefore recommend that you get the latest node information from the API promptly after such events.