Troubleshooting

Last updated: 2020-01-06 10:40:29

    Overview

    Cloud Monitoring (CM) provides various methods to help you identify resource exceptions, and delivers exception information to you in real time through multiple channels.

    Locating Exceptions

    Detecting exceptions through monitoring alarms

    With monitoring alarms, Tencent Cloud detects an exception promptly and notifies you automatically, so that you learn of exceptions in real time. Log in to the Cloud Monitoring Console and configure alarm policies for your important resources. For more information, see Create an alarm.

    If you have configured important performance metrics and events as alarm rules, you and your system are notified through the alarm channels as soon as an exception occurs.

    Alarms from policies configured with an alarm recipient group reach you via SMS or email. Features such as repeated alarms and alarm convergence are also supported, keeping you informed of important alarms while avoiding unnecessary notifications.

    You can also have exception alarm information sent to your own system by configuring the callback API feature in the alarm channel, so that you can further aggregate and process it.
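
As an illustration of aggregating callback data on your side, here is a minimal sketch. It assumes each callback arrives as a JSON body; the field names `object_id`, `metric`, and `status`, and the status value `"alarm"`, are hypothetical placeholders and not the documented callback format. Check the callback API documentation for the real schema before adapting it.

```python
import json
from collections import Counter


def parse_callback(body: bytes) -> dict:
    """Parse one callback request body into a flat record.

    The keys read here are assumed for illustration only.
    """
    payload = json.loads(body)
    return {
        "object_id": payload.get("object_id", "unknown"),
        "metric": payload.get("metric", "unknown"),
        "status": payload.get("status", "unknown"),
    }


def count_by_object(records):
    """Count records in the (assumed) 'alarm' state per resource object."""
    return Counter(r["object_id"] for r in records if r["status"] == "alarm")
```

A receiving service could call `parse_callback` on each request body, store the records, and use `count_by_object` to see which resources trigger the most alarms.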

    Detecting exceptions through monitoring views

    Through monitoring views, you can actively detect and locate exceptions based on average trends and historical data of performance metrics. By routinely inspecting monitoring views, you can discover exceptions that have no configured alarm or that are hard to capture with alarm rules. Compared to monitoring alarms, monitoring views help you understand the overall impact of an exception on your resources. You can highlight resource exception information in various scenarios by subscribing important resources to the dashboard and configuring charts appropriately.

    For some instances, you can subscribe to instance details views to compare the trends of instance performance data on a dashboard.

    For resource clusters, you can subscribe to the aggregated data of a cluster to see the overall monitoring view of the cluster on the dashboard, and compare it with that of a single instance in this cluster. For more information, see Best practices for large-scale monitoring scenarios.
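
The cluster-versus-instance comparison can also be reproduced offline on exported data. The sketch below assumes you already have the latest value of one metric per instance in a plain dictionary; the function name and the data shape are illustrative and are not part of Cloud Monitoring.

```python
from statistics import mean


def deviation_from_cluster(latest_by_instance):
    """Return each instance's difference from the cluster average.

    latest_by_instance maps an instance ID to the latest metric value.
    """
    cluster_avg = mean(latest_by_instance.values())
    return {
        instance_id: value - cluster_avg
        for instance_id, value in latest_by_instance.items()
    }
```

Instances with a large positive or negative deviation are candidates for a closer look in their instance details view.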

    By using the list sorting feature, you can locate the specific resources of any exception detected through a view and determine the impact of the exception for further troubleshooting.

    Troubleshooting

    Locating exception objects on the monitoring overview page

    When conducting a routine inspection or after receiving an alarm message, log in to the Cloud Monitoring Console and go to the Monitoring Overview page.

    1. On the Monitoring Overview page, locate Service Health Status in Last 24 Hours to view the resource exceptions of each region and project.
      You can browse recent exceptions by using the exception information overview feature.
    2. Click the number of exception objects to access the cloud product monitoring page.

      The cloud product monitoring page is automatically filtered to show only the abnormal resource objects.
    3. Click the ID of a specific object to access its monitoring details page, where you can review the exception history to help locate exceptions.
      • Exception timeline allows you to view the current and historical information of the abnormal object, and helps you troubleshoot current exceptions based on historical alarms and status change information.
      • Resource performance monitoring data provides you with the most comprehensive resource performance data. You can perform a year-over-year or month-over-month comparison between the current data and historical data of the same metric, or compare data changes of different metrics within the same period for troubleshooting.
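
The period-over-period comparison described above comes down to a percentage change between two aligned series of the same metric. A minimal sketch, assuming both series are already exported as equally long lists of numbers (the function name is illustrative):

```python
def period_over_period(current, previous):
    """Return the per-point percentage change from previous to current.

    A point whose previous value is 0 yields None, because the
    percentage change is undefined there.
    """
    if len(current) != len(previous):
        raise ValueError("series must have the same length")
    changes = []
    for now, before in zip(current, previous):
        if before == 0:
            changes.append(None)
        else:
            changes.append((now - before) / before * 100.0)
    return changes
```

For example, `period_over_period([50.0, 90.0], [40.0, 45.0])` returns `[25.0, 100.0]`, meaning the metric is 25% and 100% above the earlier period at the two sample points.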

    Locating exception objects through dashboards

    Log in to the Cloud Monitoring Console. In the left sidebar, click Dashboard to access the dashboard management page.

    1. When you identify an abnormal trend in the monitoring view, click the time period in which the exception occurred. A sorted list of the corresponding instances is displayed below the chart. You can locate the specific abnormal objects based on this list.

    2. Click the name of an object in the sorted list to access its monitoring details page, where you can review the exception history to help locate exceptions.

      • Exception timeline allows you to view the current and historical information of the abnormal object, and helps you troubleshoot current exceptions based on historical alarms and status change information.
      • Resource performance monitoring data provides you with the most comprehensive resource performance data. You can perform a year-over-year or month-over-month comparison between the current data and historical data of the same metric, or compare data changes of different metrics within the same period for troubleshooting.
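
The sorted list in step 1 can be approximated on exported data by ranking instances by their peak value inside the selected time period. This is a sketch under assumptions: the sample tuples `(instance_id, timestamp, value)` and the function name are hypothetical, and the console may rank by a different statistic.

```python
def rank_instances(samples, start, end, top_n=5):
    """Rank instances by their peak metric value within [start, end].

    samples is an iterable of (instance_id, timestamp, value) tuples.
    Returns up to top_n (instance_id, peak) pairs, highest peak first.
    """
    peaks = {}
    for instance_id, timestamp, value in samples:
        if start <= timestamp <= end:
            previous_peak = peaks.get(instance_id, float("-inf"))
            peaks[instance_id] = max(value, previous_peak)
    ranked = sorted(peaks.items(), key=lambda item: item[1], reverse=True)
    return ranked[:top_n]
```

Samples outside the selected period are ignored, so an instance that spiked at another time does not distort the ranking.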
