TDMQ for RabbitMQ provides a multi-dimensional observability system. Its core features are basic monitoring, alarms, and change records, and it also supports enhanced capabilities such as Prometheus monitoring and intelligent inspection. Together, these give users observability from resource metrics to business status and from real-time monitoring to historical tracing, helping keep business operations stable.
Monitoring and Alarms
Monitoring Capabilities
By default, TDMQ for RabbitMQ provides resource monitoring and alarm capabilities based on the Tencent Cloud Observability Platform (TCOP) service. It monitors, in real time, the resources created under your account, such as clusters, nodes, vhosts, queues, and exchanges. You can use the monitoring data to track cluster metrics, including business requests, resource usage, traffic, connections, and message backlog, which helps you assess cluster capacity and detect risks early.
The following table describes the monitoring capabilities supported by TDMQ for RabbitMQ.
| Monitoring Type | Supported Clusters | Description | Use Cases |
| --- | --- | --- | --- |
| Basic monitoring | All clusters | Through basic monitoring, you can view monitoring metrics across five dimensions: clusters, nodes, vhosts, queues, and exchanges. | It is suitable for cluster-level metric observation, meeting the requirements of Ops scenarios, such as assisting in identifying issues and planning cluster capacity. |
| Prometheus monitoring | Managed Edition clusters | Prometheus Exporter is provided to collect node monitoring metrics, including but not limited to basic monitoring metrics such as Queue, Channel, and Connection, as well as the metrics exposed by Broker JMX. | It provides an open-source compatible monitoring integration solution, supporting integration with users' self-built Ops platforms. |
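
For integration with a self-built Ops platform, the exporter output can be consumed as ordinary Prometheus text. The following is a minimal sketch, not the product's actual interface: the sample payload and the metric names (`rabbitmq_queue_messages_ready`, `rabbitmq_connections`) are assumptions modeled on common open-source RabbitMQ metric naming, and the names exposed by a Managed Edition cluster may differ. It parses the Prometheus text exposition format and lists queues whose ready-message count exceeds a limit.

```python
import re

# Hypothetical sample payload; real metric names and labels may differ.
SAMPLE = """\
# HELP rabbitmq_queue_messages_ready Messages ready to be delivered
# TYPE rabbitmq_queue_messages_ready gauge
rabbitmq_queue_messages_ready{vhost="/",queue="orders"} 120
rabbitmq_queue_messages_ready{vhost="/",queue="emails"} 4500
rabbitmq_connections 37
"""

# metric_name{labels} value [timestamp]
LINE_RE = re.compile(
    r'^([a-zA-Z_:][a-zA-Z0-9_:]*)(?:\{(.*)\})?\s+(\S+)(?:\s+-?\d+)?$'
)
LABEL_RE = re.compile(r'([a-zA-Z_][a-zA-Z0-9_]*)="((?:[^"\\]|\\.)*)"')


def parse_metrics(text):
    """Parse exposition text into (name, labels, value) tuples."""
    samples = []
    for line in text.splitlines():
        line = line.strip()
        # Skip blank lines and # HELP / # TYPE comments.
        if not line or line.startswith("#"):
            continue
        match = LINE_RE.match(line)
        if match is None:
            continue  # Ignore lines this simplified parser cannot read.
        name, raw_labels, value = match.groups()
        labels = dict(LABEL_RE.findall(raw_labels or ""))
        samples.append((name, labels, float(value)))
    return samples


def backlogged_queues(samples, limit):
    """Return sorted names of queues whose ready-message count exceeds limit."""
    return sorted(
        labels["queue"]
        for name, labels, value in samples
        if name == "rabbitmq_queue_messages_ready"
        and "queue" in labels
        and value > limit
    )


samples = parse_metrics(SAMPLE)
print(backlogged_queues(samples, limit=1000))
```

In practice a Prometheus server would scrape the exporter and the filtering would be done with a query; the hand-written parser here only illustrates the shape of the data.
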
Alarm Capabilities
TDMQ for RabbitMQ provides multiple monitoring metrics for running resources so that you can track cluster status, and it supports alarm configuration for key metrics. After you create an alarm rule, the system compares the monitored metric against the specified threshold over a certain period. If the metric reaches the preset alarm threshold, TCOP notifies you through email, Short Message Service (SMS), WeChat, or phone call, so that you can take preventive or remedial actions in a timely manner. Properly configured alarm rules help improve application robustness and reliability.
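
The threshold comparison described above can be sketched as follows. This is an illustrative model only, not TCOP's implementation: the `AlarmRule` fields, the `should_alarm` helper, the metric name, and the "breach for N consecutive periods" semantics are all assumptions chosen to make the idea concrete.

```python
import operator
from dataclasses import dataclass

# Supported comparison operators for a rule (hypothetical set).
_OPS = {
    ">": operator.gt,
    ">=": operator.ge,
    "<": operator.lt,
    "<=": operator.le,
}


@dataclass(frozen=True)
class AlarmRule:
    metric: str               # Hypothetical metric name, e.g. "queue_backlog"
    comparator: str           # One of the keys in _OPS
    threshold: float          # Preset alarm threshold
    consecutive_periods: int  # Number of most recent periods that must breach


def should_alarm(rule, datapoints):
    """Return True if the most recent `consecutive_periods` datapoints
    all satisfy the rule's comparison against the threshold."""
    n = rule.consecutive_periods
    if n <= 0 or len(datapoints) < n:
        return False  # Not enough data to evaluate the rule.
    compare = _OPS[rule.comparator]
    return all(compare(value, rule.threshold) for value in datapoints[-n:])


# One datapoint per statistical period, oldest first.
rule = AlarmRule("queue_backlog", ">=", 10000, 3)
print(should_alarm(rule, [800, 12000, 15000, 11000]))
print(should_alarm(rule, [12000, 15000, 900]))
```

Requiring several consecutive breaching periods, rather than a single datapoint, is one common way to avoid notifications for short spikes; whether and how a given alarm policy does this depends on its configuration.
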
Intelligent Inspection (Supported Only by Managed Edition)
TDMQ for RabbitMQ introduces an intelligent inspection capability that proactively detects cluster issues and potential risks. Powered by an expert knowledge base, it provides solutions for detected problems and automatically summarizes health check results to generate reports. The intelligent inspection capability can extract key information for users, efficiently locate issues, and offer professional solutions and suggestions, achieving a closed-loop Ops experience.
Change Records (Supported Only by Managed Edition)
The change records feature centrally manages, stores, analyzes, and visualizes the change event data generated by TDMQ for RabbitMQ, so that changes can be queried, audited, and traced later. Details of each change are available in the change records section.