Configuring Alarm

Last updated: 2021-07-14 18:51:15

    Overview

    Tencent Cloud enables the Cloud Monitor service for all users by default, so you do not need to activate it manually. Cloud Monitor starts collecting monitoring data once you use a Tencent Cloud product.

    CKafka allows you to monitor the resources created under your account, including instances, topics, and consumer groups, so that you can keep track of their status in real time. You can configure alarm rules for monitoring metrics. When a monitoring metric reaches the configured alarm threshold, Cloud Monitor promptly notifies you of the exception by email, SMS, WeChat, phone call, or other channels.

    Directions

    Configuring alarm policy

    An alarm policy compares a monitoring metric against a given threshold over the selected time period and decides whether to send an alarm notification. When an alarm is triggered by a change in CKafka status, you can take precautionary or remedial measures in a timely manner. Well-configured alarm policies help improve the robustness and reliability of your applications.

    Note:

    Be sure to configure alarms for your instance so that you can respond before a sudden traffic spike or a specification limit causes exceptions.

    1. Log in to the CKafka console.

    2. In the instance list, click Configure Alarms in the Operation column to go to the alarm configuration page.

    3. On the alarm configuration page, select a policy type and an instance, then set the alarm rule and notification template.

      • Policy Type: select CKafka.

      • Alarm Object: select the CKafka resource for which to configure the alarm policy.

      • Trigger Condition: choose either Select template or Configure manually. Configure manually is selected by default. For more information on manual configuration, please see the description below. For more information on how to create a template, please see Creating trigger condition template.

        • Metric: for example, with a statistical period of 1 minute for the "Disk Utilization" metric, an alarm is triggered when the disk utilization exceeds the threshold for N consecutive data points.
        • Alarm Frequency: for example, "Alarm once every 30 minutes" means that if a metric exceeds the threshold in several consecutive statistical periods, only one alarm is triggered within those 30 minutes. Another alarm is triggered only if the metric exceeds the threshold again during the following 30 minutes.

        For the metrics we recommend configuring alarm policies for, please see Monitoring Alarm Policies Recommended for CKafka.

      • Notification Template: you can select an existing notification template or create one to set the alarm recipient objects and receiving channels.

    4. Click Complete to finish the configuration.

    For more information on alarms, please see Creating Alarm Policy.
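
    The interaction between the Metric and Alarm Frequency settings described in step 3 can be sketched as a small simulation. This is only an illustration of one reading of those two settings, not Cloud Monitor's actual implementation: the `AlarmRule` class, the `notification_times` function, and the threshold of 80% are hypothetical, and the sketch assumes that a notification is repeated once the frequency interval has elapsed while the metric is still above the threshold.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class AlarmRule:
    """Hypothetical model of a manually configured trigger condition."""
    threshold: float         # metric threshold, e.g. disk utilization in percent
    consecutive_points: int  # N consecutive data points above the threshold
    period_s: int            # statistical period in seconds (60 = 1 minute)
    frequency_s: int         # minimum gap between alarms (1800 = 30 minutes)


def notification_times(rule: AlarmRule, samples: List[float]) -> List[int]:
    """Return the offsets (in seconds from the first sample) at which an alarm fires.

    One sample is assumed per statistical period.
    """
    sent: List[int] = []
    streak = 0                       # current run of points above the threshold
    last_sent: Optional[int] = None  # offset of the most recent alarm
    for i, value in enumerate(samples):
        t = i * rule.period_s
        streak = streak + 1 if value > rule.threshold else 0
        if streak < rule.consecutive_points:
            continue  # not enough consecutive breaches yet
        # Suppress repeats until the alarm frequency interval has elapsed.
        if last_sent is None or t - last_sent >= rule.frequency_s:
            sent.append(t)
            last_sent = t
    return sent


# Disk utilization sampled once per minute: one normal point, then a long breach.
rule = AlarmRule(threshold=80.0, consecutive_points=3, period_s=60, frequency_s=1800)
samples = [70.0] + [95.0] * 43
print(notification_times(rule, samples))
```

    With these sample values, the first alarm fires once three consecutive points exceed the threshold, and the next one fires 30 minutes later because the breach is still ongoing. Breaches shorter than N data points never produce an alarm.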

    Creating trigger condition template

    1. Log in to the Cloud Monitor console.
    2. On the left sidebar, click Trigger Condition Template to enter the trigger template list page.
    3. Click Create on the trigger template page.
    4. On the template creation page, configure the policy type.
      • Policy Type: select CKafka.
      • Use preset trigger condition: select this option and the system-recommended alarm policy will be displayed.
    5. After confirming that everything is correct, click Save.
    6. Return to the alarm policy creation page and click Refresh. The trigger condition template you just configured will be displayed.

    For more information on metrics that may affect business data stability, please see CKafka Data Reliability.

    Monitoring Alarm Policies Recommended for CKafka

    Based on user feedback, we recommend that you configure alarm policies for CKafka in the following 3 dimensions (6 metrics in total). Adjust them to your actual business conditions.

    Instance monitoring:

    Monitoring Metric | Description
    Production Peak Bandwidth (MB/s) | Peak traffic generated when the instance produces messages (excluding the traffic generated by replicas).
    Consumption Peak Bandwidth (MB/s) | Peak traffic generated when the instance consumes messages (there is no replica concept in consumption).
    Disk Utilization (%) | Ratio of the currently used disk capacity to the total disk capacity of the instance, expressed as a percentage.
    Number of Connections | Number of connections between the client and server.

    Topic monitoring:

    Monitoring Metric | Description
    Used Disk Capacity (MB) | Total disk capacity actually used by messages in the topic (excluding those produced by replicas). The value is the latest one within the selected time granularity.

    Consumer group:

    Monitoring Metric | Description
    Number of Unconsumed Messages | Number of unconsumed messages in the consumer group.
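
    The recommended metrics above can also be kept as a simple checklist to verify that none of the 6 metrics is left without an alarm policy. The following is a minimal sketch: the metric names are copied from the tables above, while the `RECOMMENDED_METRICS` mapping and the `missing_alarms` helper are hypothetical and do not call any Cloud Monitor API.

```python
from typing import Dict, Iterable, List

# Recommended metrics grouped by dimension, as listed in the tables above.
RECOMMENDED_METRICS: Dict[str, List[str]] = {
    "instance": [
        "Production Peak Bandwidth (MB/s)",
        "Consumption Peak Bandwidth (MB/s)",
        "Disk Utilization (%)",
        "Number of Connections",
    ],
    "topic": ["Used Disk Capacity (MB)"],
    "consumer_group": ["Number of Unconsumed Messages"],
}


def missing_alarms(configured: Iterable[str]) -> Dict[str, List[str]]:
    """Return the recommended metrics that have no alarm configured, by dimension."""
    have = set(configured)
    missing: Dict[str, List[str]] = {}
    for dimension, metrics in RECOMMENDED_METRICS.items():
        gaps = [m for m in metrics if m not in have]
        if gaps:  # only report dimensions that still have gaps
            missing[dimension] = gaps
    return missing


# Example: alarms exist only for disk utilization and unconsumed messages.
configured = ["Disk Utilization (%)", "Number of Unconsumed Messages"]
for dimension, metrics in missing_alarms(configured).items():
    print(dimension, "->", metrics)
```

    The list of configured metric names has to be maintained by hand in this sketch, for example by copying it from the alarm policy list in the console.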