Overview of Monitoring and Alarm Capabilities

Last updated: 2026-01-20 17:02:40
TDMQ for CKafka (CKafka) provides a comprehensive observability system that covers monitoring and alarms, event records, and one-click diagnosis. It helps you quickly detect, locate, and resolve issues so that your business runs stably.

Monitoring and Alarms

Monitoring Capabilities

CKafka builds its monitoring capabilities on Tencent Cloud Observability Platform (TCOP), which monitors in real time the resources created under your account, such as instances, topics, and consumer groups. The monitoring metrics show cluster resource usage, the number of connections, and message backlogs, helping you gauge how much cluster capacity is in use and identify risks in advance.

Based on the instance edition you purchased, the scope of monitoring capabilities supported by CKafka is as follows:
| Type | Applicable Edition | Capability Description | Scenario |
| --- | --- | --- | --- |
| Basic monitoring | Full series | Through basic monitoring, you can view monitoring metrics from three dimensions, including instances, topics, and consumer groups. | Cluster-level metric observation, used for requirements such as assisting in identifying issues and planning cluster capacity in basic Ops scenarios. |
| Advanced monitoring | Pro Edition | Through advanced monitoring, you can view the node-level monitoring metrics of the instance, such as core services, production, consumption, instance resources, and broker GC. | Node-level metric observation, used for requirements such as issue localization, analysis of traffic throttling causes, and duration analysis in business troubleshooting scenarios. |
| Dashboard | Pro Edition | Through the dashboard, you can view the number of all TCP connections on the broker, details of out-of-sync replicas (OSRs), node distribution for topics, and top ranking data for key metrics such as topic traffic, disk usage, and the consumption speed of consumer groups. | Top ranking for key metrics, used for requirements such as assisting in production/consumption hot spot analysis, and disk usage analysis in business optimization analysis scenarios. |
| Prometheus monitoring | Pro Edition | It provides an access method based on the open-source standard Prometheus Exporter, including a series of monitorable metrics from Apache Kafka, such as instance-level metrics and node-level metrics. | It provides an open-source and compatible monitoring integration solution, supporting integration with users' self-built Ops platforms. |
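
To illustrate how a self-built Ops platform might consume Prometheus-style metrics and produce a top ranking similar to the dashboard, the following is a minimal sketch that parses the Prometheus text exposition format. The metric names, label names, and sample values are hypothetical and are not taken from the CKafka exporter. The parser is an assumption-laden simplification that handles only plain `name{labels} value` lines.

```python
import re
from typing import Dict, List, Tuple

Sample = Tuple[str, Dict[str, str], float]

# Matches: metric_name{label="value",...} 12.5 [optional timestamp]
LINE_RE = re.compile(
    r'^([a-zA-Z_:][a-zA-Z0-9_:]*)(?:\{(.*)\})?\s+(\S+)(?:\s+-?\d+)?$'
)
LABEL_RE = re.compile(r'([a-zA-Z_][a-zA-Z0-9_]*)="((?:[^"\\]|\\.)*)"')

# Hypothetical exporter output; real metric names will differ.
SAMPLE = """\
# HELP example_broker_connection_count Hypothetical gauge
# TYPE example_broker_connection_count gauge
example_broker_connection_count{broker="0"} 42
example_broker_connection_count{broker="1"} 17
example_topic_bytes_in_total{topic="orders"} 1024
"""


def parse_metrics(text: str) -> List[Sample]:
    """Parse exposition text into (name, labels, value) tuples."""
    samples: List[Sample] = []
    for line in text.splitlines():
        line = line.strip()
        # Skip blank lines and HELP/TYPE comments.
        if not line or line.startswith("#"):
            continue
        match = LINE_RE.match(line)
        if match is None:
            continue  # Ignore lines this simplified parser cannot read.
        name, raw_labels, value = match.groups()
        labels = dict(LABEL_RE.findall(raw_labels or ""))
        samples.append((name, labels, float(value)))
    return samples


def top_n(samples: List[Sample], metric: str, n: int) -> List[Sample]:
    """Return the n largest samples of one metric, highest value first."""
    selected = [s for s in samples if s[0] == metric]
    return sorted(selected, key=lambda s: s[2], reverse=True)[:n]


if __name__ == "__main__":
    for name, labels, value in top_n(
        parse_metrics(SAMPLE), "example_broker_connection_count", 2
    ):
        print(name, labels, value)
```

In practice a Prometheus server would usually scrape and store these metrics itself; the sketch only shows the shape of the data that such an integration works with.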

Alarm Capabilities

CKafka provides alarm capabilities based on TCOP. You can configure alarm rules for monitoring metrics on TCOP. If a monitoring metric reaches the configured alarm threshold, you are notified by email, Short Message Service (SMS), WeChat, or phone call, so that you can take preventive or remedial action promptly. Properly configured alarm rules help improve the robustness and reliability of your applications.
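
The threshold logic behind an alarm rule can be sketched as follows. This is a conceptual illustration only, not the TCOP implementation or its API: the `AlarmRule` class, the metric name, and the "N consecutive periods" condition are assumptions introduced for this example.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass
class AlarmRule:
    """Hypothetical alarm rule; field names are illustrative only."""
    metric: str               # name of the monitored metric
    threshold: float          # alarm when the value reaches this level
    consecutive_periods: int  # assumed: breach must last this many samples


def should_alarm(rule: AlarmRule, samples: Sequence[float]) -> bool:
    """Return True if the most recent samples all reach the threshold."""
    if len(samples) < rule.consecutive_periods:
        return False  # Not enough data to evaluate the rule yet.
    recent = samples[-rule.consecutive_periods:]
    return all(value >= rule.threshold for value in recent)


if __name__ == "__main__":
    rule = AlarmRule("example_disk_usage_percent", 80.0, 3)
    print(should_alarm(rule, [70, 81, 85, 90]))  # last three samples breach
    print(should_alarm(rule, [81, 85, 79]))      # latest sample recovered
```

Requiring several consecutive breaches is a common way to avoid notifications for brief spikes; whether and how to apply it depends on the metric and on the options that the alarm platform actually offers.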

Event Records

Event Center in CKafka supports centralized management, storage, analysis, and visualization of various Ops events, diagnosis events, and broker change events that occur during instance operation, facilitating future querying, auditing, and tracing. It also supports event alarm capabilities. You can configure alarm rules for key events (such as node offline or disk expansion failure) on TCOP, so that Ops personnel can handle them promptly.

One-Click Diagnosis

CKafka Pro Edition supports one-click diagnosis, which proactively checks the cluster for risks and potential issues. Drawing on accumulated Tencent Cloud expert experience, it automatically summarizes health check results and generates a diagnosis report. The report extracts key information, locates issues, and provides solutions and suggestions, giving you a closed-loop Ops experience.

