tencent cloud

Cloud Log Service

Collection Overview

Last updated: 2026-06-10 21:54:13

Overview

CLS provides multiple collection methods, including LogListener, API, and SDK, enabling users to collect logs from various data sources and import them into CLS.

Features

Log source access
Different log sources support different access modes. See the table below:
Collection methods
CLS provides multiple methods for data collection:
| Collection Method | Description |
| --- | --- |
| API | You can call CLS APIs to upload structured logs to CLS. For more information, see Uploading Logs via APIs. |
| SDK collection | You can use an SDK to upload structured logs to CLS. For more information, see Uploading Logs via SDK. |
| Kafka protocol collection | You can use the Kafka protocol to upload logs to CLS. For more information, see Uploading Logs via Kafka Protocol. |
| LogListener client collection | LogListener is a log collection client provided by CLS. You can quickly connect to CLS by simply configuring LogListener in the console. For more information, see LogListener Use Process. |
A comparison of the collection methods is as follows:
| Category | Collection via LogListener | API/SDK Collection |
| --- | --- | --- |
| Code modification | Non-intrusive, no code modification required. | Requires application code modification. |
| Resumable transfer | Supports resumable transfer of logs. | Implemented in custom code. |
| Retry mechanism | Built-in retry mechanism. | Implemented in custom code. |
| Local cache | Supports local cache and ensures data integrity during traffic peaks. | Implemented in custom code. |
| Resource consumption | Uses resources such as memory and CPU. | No additional resource consumption. |
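The API/SDK column above leaves retries and local caching to custom code. The following is a minimal sketch of what that custom code might look like, not a CLS SDK feature. The `send` callable is a hypothetical stand-in for whatever upload call your application makes, and the names `upload_with_retry` and `BufferedUploader` are made up for this illustration.

```python
import time
from collections import deque


def upload_with_retry(send, batch, max_attempts=3, base_delay=0.5, sleep=time.sleep):
    """Call send(batch), retrying with exponential backoff on any exception.

    `send` is a hypothetical stand-in for your real upload call.
    Returns True on success and False once all attempts are used up.
    """
    for attempt in range(max_attempts):
        try:
            send(batch)
            return True
        except Exception:
            # Wait 0.5 s, 1 s, 2 s, ... between attempts, but not after the last one.
            if attempt < max_attempts - 1:
                sleep(base_delay * (2 ** attempt))
    return False


class BufferedUploader:
    """Keeps batches that failed to upload in a bounded in-memory queue."""

    def __init__(self, send, max_buffered=1000):
        self.send = send
        # When the queue is full, appending drops the oldest batch.
        self.pending = deque(maxlen=max_buffered)

    def submit(self, batch, **retry_kwargs):
        if not upload_with_retry(self.send, batch, **retry_kwargs):
            self.pending.append(batch)

    def flush(self, **retry_kwargs):
        # Try each buffered batch once more; re-queue the ones that still fail.
        for _ in range(len(self.pending)):
            batch = self.pending.popleft()
            if not upload_with_retry(self.send, batch, **retry_kwargs):
                self.pending.append(batch)
```

An in-memory queue does not survive a process restart, so this sketch does not provide resumable transfer; that would need the buffer to be persisted to disk.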

Log Structuring and Parsing

Log structuring stores your log data on the CLS platform in key-value format. Structured logs can be searched, analyzed, and shipped based on specified keys. CLS allows you to report structured data directly. For more information, see the following:
For example, a local raw log is as follows:
10.20.20.10;[Tue Jan 22 14:49:45 CST 2019 +0800];GET /online/sample HTTP/1.1;127.0.0.1;200;647;35;http://127.0.0.1/
If you configure the log to be parsed by separator and select the semicolon (;) as the separator, the log is split into multiple fields, and each field is stored as a key-value pair. You define a key name for each field, as follows:
IP: 10.20.20.10
time: [Tue Jan 22 14:49:45 CST 2019 +0800]
request: GET /online/sample HTTP/1.1
host: 127.0.0.1
status: 200
length: 647
bytes: 35
referer: http://127.0.0.1/
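The separator example above can be reproduced in a few lines of Python. This is only an illustration of the idea, not how CLS or LogListener implements it; the function name `parse_by_separator` is made up for this sketch.

```python
RAW = (
    "10.20.20.10;[Tue Jan 22 14:49:45 CST 2019 +0800];GET /online/sample HTTP/1.1;"
    "127.0.0.1;200;647;35;http://127.0.0.1/"
)
KEYS = ["IP", "time", "request", "host", "status", "length", "bytes", "referer"]


def parse_by_separator(line, separator, keys):
    """Split a raw log line on the separator and pair each piece with a key name."""
    values = line.split(separator)
    if len(values) != len(keys):
        raise ValueError(f"expected {len(keys)} fields, got {len(values)}")
    return dict(zip(keys, values))


fields = parse_by_separator(RAW, ";", KEYS)
for key, value in fields.items():
    print(f"{key}: {value}")
```

Every value stays a string here; whether a field such as status is later treated as a number depends on how the key is indexed, which this sketch does not cover.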
LogListener supports multiple parsing methods, as listed below:
Single-Line Full-Text Format
Multi-line Full-Text Format
Single-Line Full Regular Expression Format
Multi-line Full Regular Expression Format
JSON Format
Delimiter
Combined Parsing
Nginx Log Templates
Single-Line Full-Text Format
In a single-line full-text log, each line is a complete log entry. When collecting logs, CLS uses the line break \n as the delimiter that marks the end of each log entry. For unified structured management, each log has a default key-value pair __CONTENT__. However, the log data itself is not structured, and no log fields are extracted. The time attribute of a log is the time at which the log is collected.
Assume that the raw data of a log is:
Tue Jan 22 12:08:15 CST 2019 Installed: libjpeg-turbo-static-1.2.90-6.el7.x86_64
The data collected into CLS is:
__CONTENT__:Tue Jan 22 12:08:15 CST 2019 Installed: libjpeg-turbo-static-1.2.90-6.el7.x86_64
Multi-line Full-Text Format
In a multi-line full-text log, a complete log entry may span multiple lines, as in a Java stack trace. In this case, the line break \n cannot serve as the end marker of a log entry. To let the logging system tell entries apart, a first-line regular expression is used: a line that matches the preset regular expression is treated as the beginning of a log entry, and the entry ends just before the next line that matches.
A multi-line full-text log also has a default key-value pair __CONTENT__. However, the log data itself is not structured, and no log fields are extracted. The time attribute of a log is the time at which the log is collected.
Assume that the raw data of a multi-line log is:
2019-12-15 17:13:06,043 [main] ERROR com.test.logging.FooFactory:
java.lang.NullPointerException
at com.test.logging.FooFactory.createFoo(FooFactory.java:15)
at com.test.logging.FooFactoryTest.test(FooFactoryTest.java:11)
The first-line regular expression is as follows:
\d{4}-\d{2}-\d{2}\s\d{2}:\d{2}:\d{2},\d{3}\s.+
The data collected into CLS is:
__CONTENT__:2019-12-15 17:13:06,043 [main] ERROR com.test.logging.FooFactory:\njava.lang.NullPointerException\n at com.test.logging.FooFactory.createFoo(FooFactory.java:15)\n at com.test.logging.FooFactoryTest.test(FooFactoryTest.java:11)
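The first-line matching rule can be sketched in Python as follows. This is an illustration of the grouping logic only, under the assumption that the expression is matched from the start of each line; it is not LogListener's actual implementation. The function name `group_multiline` and the second sample entry (the INFO line) are made up for this sketch.

```python
import re

# First-line regular expression from the example above.
FIRST_LINE = re.compile(r"\d{4}-\d{2}-\d{2}\s\d{2}:\d{2}:\d{2},\d{3}\s.+")


def group_multiline(lines, first_line_pattern):
    """Group physical lines into log entries.

    A line that matches the pattern starts a new entry; every other line is
    appended to the entry currently being built. Lines that appear before the
    first matching line end up in an entry of their own.
    """
    entries, current = [], []
    for line in lines:
        if first_line_pattern.match(line) and current:
            entries.append("\n".join(current))
            current = []
        current.append(line)
    if current:
        entries.append("\n".join(current))
    return [{"__CONTENT__": entry} for entry in entries]


lines = [
    "2019-12-15 17:13:06,043 [main] ERROR com.test.logging.FooFactory:",
    "java.lang.NullPointerException",
    "    at com.test.logging.FooFactory.createFoo(FooFactory.java:15)",
    "    at com.test.logging.FooFactoryTest.test(FooFactoryTest.java:11)",
    "2019-12-15 17:13:07,101 [main] INFO com.test.logging.FooFactory: retrying",
]
entries = group_multiline(lines, FIRST_LINE)
```

Note that an entry can only be closed once the next matching line (or the end of input) is seen, which is why the last entry is emitted after the loop.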
Single-Line Full Regular Expression Format
The single-line full regular expression format is usually used to process structured logs. In this parsing mode, multiple key-value pairs are extracted from a complete single-line log entry using a regular expression.
Assume that the raw data of a log is:
10.135.46.111 - - [22/Jan/2019:19:19:30 +0800] "GET /my/course/1 HTTP/1.1" 127.0.0.1 200 782 9703 "http://127.0.0.1/course/explore?filter%5Btype%5D=all&filter%5Bprice%5D=all&filter%5BcurrentLevelId%5D=all&orderBy=studentNum" "Mozilla/5.0 (Windows NT 10.0; WOW64; rv:64.0) Gecko/20100101 Firefox/64.0" 0.354 0.354
The configured regular expression is as follows:
(\S+)[^\[]+(\[[^:]+:\d+:\d+:\d+\s\S+)\s"(\w+)\s(\S+)\s([^"]+)"\s(\S+)\s(\d+)\s(\d+)\s(\d+)\s"([^"]+)"\s"([^"]+)"\s+(\S+)\s(\S+).*
The data collected into CLS is:
body_bytes_sent: 9703
http_host: 127.0.0.1
http_protocol: HTTP/1.1
http_referer: http://127.0.0.1/course/explore?filter%5Btype%5D=all&filter%5Bprice%5D=all&filter%5BcurrentLevelId%5D=all&orderBy=studentNum
http_user_agent: Mozilla/5.0 (Windows NT 10.0; WOW64; rv:64.0) Gecko/20100101 Firefox/64.0
remote_addr: 10.135.46.111
request_length: 782
request_method: GET
request_time: 0.354
request_url: /my/course/1
status: 200
time_local: [22/Jan/2019:19:19:30 +0800]
upstream_response_time: 0.354
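To check how the capture groups map to the keys listed above, the expression can be run against the sample log in Python. This is a sketch for verification only. The order of key names in `KEYS` is inferred from the sample output, and the function name `extract_fields` is made up for this illustration.

```python
import re

PATTERN = re.compile(
    r'(\S+)[^\[]+(\[[^:]+:\d+:\d+:\d+\s\S+)\s"(\w+)\s(\S+)\s([^"]+)"\s(\S+)'
    r'\s(\d+)\s(\d+)\s(\d+)\s"([^"]+)"\s"([^"]+)"\s+(\S+)\s(\S+).*'
)

# One key name per capture group, in group order (inferred from the output above).
KEYS = [
    "remote_addr", "time_local", "request_method", "request_url",
    "http_protocol", "http_host", "status", "request_length",
    "body_bytes_sent", "http_referer", "http_user_agent",
    "request_time", "upstream_response_time",
]

RAW = (
    '10.135.46.111 - - [22/Jan/2019:19:19:30 +0800] "GET /my/course/1 HTTP/1.1" '
    '127.0.0.1 200 782 9703 '
    '"http://127.0.0.1/course/explore?filter%5Btype%5D=all&filter%5Bprice%5D=all'
    '&filter%5BcurrentLevelId%5D=all&orderBy=studentNum" '
    '"Mozilla/5.0 (Windows NT 10.0; WOW64; rv:64.0) Gecko/20100101 Firefox/64.0" '
    '0.354 0.354'
)


def extract_fields(line):
    """Return a dict of key -> captured value, or None if the line does not match."""
    match = PATTERN.match(line)
    if match is None:
        return None
    return dict(zip(KEYS, match.groups()))


fields = extract_fields(RAW)
```

A line that does not match yields `None` in this sketch; how CLS handles lines that fail to match is not covered here.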
Multi-line Full Regular Expression Format
Assume that the raw data of a multi-line log is:
[2018-10-01T10:30:01,000] [INFO] java.lang.Exception: exception happened
at TestPrintStackTrace.f(TestPrintStackTrace.java:3)
at TestPrintStackTrace.g(TestPrintStackTrace.java:7)
at TestPrintStackTrace.main(TestPrintStackTrace.java:16)
The first-line regular expression is:
\[\d+-\d+-\w+:\d+:\d+,\d+]\s\[\w+]\s.*
The configured custom regular expression is:
\[(\d+-\d+-\w+:\d+:\d+,\d+)\]\s\[(\w+)\]\s(.*)
After the system extracts the corresponding key-value pair based on the () capture group, you can customize the key name of each group as follows:
time: 2018-10-01T10:30:01,000
level: INFO
msg: java.lang.Exception: exception happened
at TestPrintStackTrace.f(TestPrintStackTrace.java:3)
at TestPrintStackTrace.g(TestPrintStackTrace.java:7)
at TestPrintStackTrace.main(TestPrintStackTrace.java:16)
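The two expressions play different roles: the first-line expression decides where an entry starts, and the custom expression extracts fields from the assembled entry. The Python sketch below applies both to the sample. It is an illustration, not LogListener's implementation. In particular, `re.DOTALL` is a choice made for this sketch so that `(.*)` can span line breaks in Python, and the function name `extract_multiline` is made up.

```python
import re

FIRST_LINE = re.compile(r"\[\d+-\d+-\w+:\d+:\d+,\d+]\s\[\w+]\s.*")
# re.DOTALL lets "." match "\n", so the last group captures the whole stack trace.
EXTRACT = re.compile(r"\[(\d+-\d+-\w+:\d+:\d+,\d+)\]\s\[(\w+)\]\s(.*)", re.DOTALL)

KEYS = ["time", "level", "msg"]

LINES = [
    "[2018-10-01T10:30:01,000] [INFO] java.lang.Exception: exception happened",
    "    at TestPrintStackTrace.f(TestPrintStackTrace.java:3)",
    "    at TestPrintStackTrace.g(TestPrintStackTrace.java:7)",
    "    at TestPrintStackTrace.main(TestPrintStackTrace.java:16)",
]


def extract_multiline(lines):
    """Check that the first line starts an entry, then extract the named fields."""
    if not FIRST_LINE.match(lines[0]):
        return None
    entry = "\n".join(lines)
    match = EXTRACT.match(entry)
    if match is None:
        return None
    return dict(zip(KEYS, match.groups()))


fields = extract_multiline(LINES)
```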
JSON Format
Assume that the raw data of a JSON log is:
{"remote_ip":"10.135.46.111","time_local":"22/Jan/2019:19:19:34 +0800","body_sent":23,"responsetime":0.232,"upstreamtime":"0.232","upstreamhost":"unix:/tmp/php-cgi.sock","http_host":"127.0.0.1","method":"POST","url":"/event/dispatch","request":"POST /event/dispatch HTTP/1.1","xff":"-","referer":"http://127.0.0.1/my/course/4","agent":"Mozilla/5.0 (Windows NT 10.0; WOW64; rv:64.0) Gecko/20100101 Firefox/64.0","response_code":"200"}
After being structured by CLS, the log becomes:
agent: Mozilla/5.0 (Windows NT 10.0; WOW64; rv:64.0) Gecko/20100101 Firefox/64.0
body_sent: 23
http_host: 127.0.0.1
method: POST
referer: http://127.0.0.1/my/course/4
remote_ip: 10.135.46.111
request: POST /event/dispatch HTTP/1.1
response_code: 200
responsetime: 0.232
time_local: 22/Jan/2019:19:19:34 +0800
upstreamhost: unix:/tmp/php-cgi.sock
upstreamtime: 0.232
url: /event/dispatch
xff: -
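The mapping from a JSON log to key-value pairs can be illustrated with Python's standard `json` module. This sketch assumes that each top-level JSON key becomes one log key, as the sample output suggests; it sorts the keys only to mirror the listing above. The function name `parse_json_log` is made up for this illustration.

```python
import json

RAW = (
    '{"remote_ip":"10.135.46.111","time_local":"22/Jan/2019:19:19:34 +0800",'
    '"body_sent":23,"responsetime":0.232,"upstreamtime":"0.232",'
    '"upstreamhost":"unix:/tmp/php-cgi.sock","http_host":"127.0.0.1",'
    '"method":"POST","url":"/event/dispatch",'
    '"request":"POST /event/dispatch HTTP/1.1","xff":"-",'
    '"referer":"http://127.0.0.1/my/course/4",'
    '"agent":"Mozilla/5.0 (Windows NT 10.0; WOW64; rv:64.0) Gecko/20100101 Firefox/64.0",'
    '"response_code":"200"}'
)


def parse_json_log(line):
    """Parse one JSON log line into a dict of top-level keys, sorted by key name."""
    obj = json.loads(line)
    if not isinstance(obj, dict):
        raise ValueError("expected a JSON object at the top level")
    return dict(sorted(obj.items()))


fields = parse_json_log(RAW)
for key, value in fields.items():
    print(f"{key}: {value}")
```

Values keep their JSON types in this sketch, so `body_sent` is an integer while `response_code` is the string "200", exactly as they appear in the raw log.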
Delimiter
Assume that the raw data of a log is:
10.20.20.10 - ::: [Tue Jan 22 14:49:45 CST 2019 +0800] ::: GET /online/sample HTTP/1.1 ::: 127.0.0.1 ::: 200 ::: 647 ::: 35 ::: http://127.0.0.1/
When the delimiter for log parsing is specified as :::, this log will be divided into eight fields, and each of these fields will be assigned a unique key, as shown below:
IP: 10.20.20.10 -
bytes: 35
host: 127.0.0.1
length: 647
referer: http://127.0.0.1/
request: GET /online/sample HTTP/1.1
status: 200
time: [Tue Jan 22 14:49:45 CST 2019 +0800]
Combined Parsing
Assume that the raw data of a log is:
1571394459,http://127.0.0.1/my/course/4|10.135.46.111|200,status:DEAD,
The custom extension content is as follows:
{
  "processors": [
    {
      "type": "processor_split_delimiter",
      "detail": {
        "Delimiter": ",",
        "ExtractKeys": ["time", "msg1", "msg2"]
      },
      "processors": [
        {
          "type": "processor_timeformat",
          "detail": {
            "KeepSource": true,
            "TimeFormat": "%s",
            "SourceKey": "time"
          }
        },
        {
          "type": "processor_split_delimiter",
          "detail": {
            "KeepSource": false,
            "Delimiter": "|",
            "SourceKey": "msg1",
            "ExtractKeys": ["submsg1", "submsg2", "submsg3"]
          },
          "processors": []
        },
        {
          "type": "processor_split_key_value",
          "detail": {
            "KeepSource": false,
            "Delimiter": ":",
            "SourceKey": "msg2"
          }
        }
      ]
    }
  ]
}
After being structured by CLS, the log becomes:
time: 1571394459
submsg1: http://127.0.0.1/my/course/4
submsg2: 10.135.46.111
submsg3: 200
status: DEAD
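To make the order of the processing steps concrete, the Python sketch below walks the sample log through the same four stages as the configuration above. It mimics the effect of each processor by hand and is not LogListener code. The helper names are made up, and treating "%s" as a Unix timestamp in seconds is an assumption based on the sample value.

```python
def split_delimiter(value, delimiter, keys):
    """Split value and pair the pieces with keys; surplus pieces are ignored."""
    return dict(zip(keys, value.split(delimiter)))


def split_key_value(value, delimiter):
    """Turn "key<delimiter>value" into a single-entry dict."""
    key, _, val = value.partition(delimiter)
    return {key: val}


def parse_combined(raw):
    # Stage 1: split the raw line on "," into time, msg1, msg2.
    fields = split_delimiter(raw, ",", ["time", "msg1", "msg2"])
    # Stage 2: read "time" as epoch seconds (assumed meaning of "%s").
    # KeepSource is true, so the "time" key stays in the result.
    timestamp = int(fields["time"])
    # Stage 3: split msg1 on "|"; KeepSource is false, so msg1 is dropped.
    fields.update(
        split_delimiter(fields.pop("msg1"), "|", ["submsg1", "submsg2", "submsg3"])
    )
    # Stage 4: split msg2 on ":" into one key-value pair; msg2 is dropped.
    fields.update(split_key_value(fields.pop("msg2"), ":"))
    return timestamp, fields


RAW = "1571394459,http://127.0.0.1/my/course/4|10.135.46.111|200,status:DEAD,"
timestamp, fields = parse_combined(RAW)
```

The trailing comma in the raw log produces an empty fourth piece in stage 1; because only three keys are given, the sketch simply ignores it.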
Nginx Log Templates
Assume that the raw data of a log is:
111.121.0.1 - - [06/Aug/2019:12:12:19 +0800] "GET /nginx-logo.png HTTP/1.1" 200 368 "http://119.x.x.x/" "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_14_5) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/75.0.3770.142 Safari/537.36" "-"
The Nginx log configuration is as follows:
log_format main '$remote_addr - $remote_user [$time_local] '
'"$request" $status $body_bytes_sent '
'"$http_referer" "$http_user_agent"';
After the Nginx log configuration is processed, the following regular expression is generated:
(\S+)\s*-\s*(\S+)\s*\[(\d+\S+\d+:\d+:\d+:\d+)\s+\S+\]\s*\"(\S+)\s+(\S+)\s+\S+\"\s*(\S+)\s*(\S+)\s*\"([^"]*)\"\s*\"([^"]*)\".*
The data collected into CLS is:
remote_addr: 111.121.0.1
remote_user: -
time_local: 06/Aug/2019:12:12:19
request_method: GET
request_uri: /nginx-logo.png
status: 200
body_bytes_sent: 368
http_referer: http://119.x.x.x/
http_user_agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10_14_5) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/75.0.3770.142 Safari/537.36

Fee Instructions

Log collection incurs log write traffic fees. For more information, see Billing Overview.

Specifications and Limitations

For the specifications and limitations of log collection, see:
