Single-line - full regular expression mode is a log parsing mode that uses a regular expression to extract multiple key-value pairs from each log in a log text file where every line is one raw log. If you don't need to extract key-value pairs, configure collection as instructed in Collecting Logs with Full Text in a Single Line.
To configure single-line - full regular expression mode, first enter a sample log and then customize your regular expression. After the configuration is completed, the system extracts the corresponding key-value pairs according to the capture groups in the regular expression.
This document describes how to collect logs in single-line - full regular expression mode.
Assume the raw data of a log is:
10.135.46.111 - - [22/Jan/2019:19:19:30 +0800] "GET /my/course/1 HTTP/1.1" 127.0.0.1 200 782 9703 "http://127.0.0.1/course/explore?filter%5Btype%5D=all&filter%5Bprice%5D=all&filter%5BcurrentLevelId%5D=all&orderBy=studentNum" "Mozilla/5.0 (Windows NT 10.0; WOW64; rv:64.0) Gecko/20100101 Firefox/64.0" 0.354 0.354
The custom regex you configure is:
Then CLS extracts key-value pairs based on the () capture groups. You can specify the key name of each group as shown below:
body_bytes_sent: 9703
http_host: 127.0.0.1
http_protocol: HTTP/1.1
http_referer: http://127.0.0.1/course/explore?filter%5Btype%5D=all&filter%5Bprice%5D=all&filter%5BcurrentLevelId%5D=all&orderBy=studentNum
http_user_agent: Mozilla/5.0 (Windows NT 10.0; WOW64; rv:64.0) Gecko/20100101 Firefox/64.0
remote_addr: 10.135.46.111
request_length: 782
request_method: GET
request_time: 0.354
request_url: /my/course/1
status: 200
time_local:
upstream_response_time: 0.354
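To make the capture-group extraction concrete, here is a minimal Python sketch that parses the sample log into the keys listed above. The pattern is a hypothetical one written for this sample only; it is not the regular expression from the original page. It uses Python named groups as a convenience, whereas in CLS you assign key names to plain () groups in the console. Unlike the listing above, where time_local is blank, this sketch captures the bracketed timestamp as time_local.

```python
import re

# Hypothetical pattern written for the sample log in this document.
# It is an illustration, not the regex configured on the original page.
LOG_PATTERN = re.compile(
    r'(?P<remote_addr>\S+)\s-\s-\s'
    r'\[(?P<time_local>[^\]]+)\]\s'
    r'"(?P<request_method>\w+)\s(?P<request_url>\S+)\s(?P<http_protocol>[^"]+)"\s'
    r'(?P<http_host>\S+)\s(?P<status>\d+)\s'
    r'(?P<request_length>\d+)\s(?P<body_bytes_sent>\d+)\s'
    r'"(?P<http_referer>[^"]*)"\s"(?P<http_user_agent>[^"]*)"\s'
    r'(?P<request_time>\S+)\s(?P<upstream_response_time>\S+)'
)

# The sample raw log from this document, split across lines for readability.
SAMPLE_LOG = (
    '10.135.46.111 - - [22/Jan/2019:19:19:30 +0800] '
    '"GET /my/course/1 HTTP/1.1" 127.0.0.1 200 782 9703 '
    '"http://127.0.0.1/course/explore?filter%5Btype%5D=all&filter%5Bprice%5D=all'
    '&filter%5BcurrentLevelId%5D=all&orderBy=studentNum" '
    '"Mozilla/5.0 (Windows NT 10.0; WOW64; rv:64.0) Gecko/20100101 Firefox/64.0" '
    '0.354 0.354'
)


def extract_fields(line):
    """Return a dict of key-value pairs, or None if the whole line does not match."""
    match = LOG_PATTERN.fullmatch(line)
    return match.groupdict() if match else None


if __name__ == "__main__":
    for key, value in sorted(extract_fields(SAMPLE_LOG).items()):
        print(f"{key}: {value}")
```

A line that does not match the whole pattern yields None here; how CLS handles a log that fails to match is not covered by this sketch.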
Enter test-whole as Log Topic Name and click OK.
On the Collection Configuration page, set Collection Path according to the log collection path format as shown below:
Log collection path format:
[directory prefix expression]/**/[filename expression].
After the log collection path is entered, LogListener will match all common prefix paths that meet the [directory prefix expression] rule and listen for all log files in the directories (including subdirectories) that meet the [filename expression] rule. The parameters are described as follows:
|Directory Prefix||Directory prefix for log files, which supports only the wildcard characters
|/**/||Current directory and all its subdirectories.|
|File Name||Log file name, which supports only the wildcard characters
Common configuration modes are as follows:
Below are examples:
|No.||Directory Prefix Expression||Filename Expression||Description|
|1.||/var/log/nginx||access.log||In this example, the log path is configured as
|2.||/var/log/nginx||*.log||In this example, the log path is configured as
|3.||/var/log/nginx||error*||In this example, the log path is configured as
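The path rule above can be approximated in a few lines of Python. This is only a sketch of the described behavior ([directory prefix expression]/**/[filename expression]), not LogListener's actual matching code. The function name path_matches is hypothetical, and fnmatchcase is used as a stand-in for wildcard matching; it also gives special meaning to square brackets, which LogListener may not.

```python
from fnmatch import fnmatchcase
from pathlib import PurePosixPath


def path_matches(path, dir_prefix, filename_expr):
    """Check a file path against a directory prefix and a filename expression.

    The file may sit in the prefix directory itself or in any subdirectory,
    mirroring the /**/ part of the collection path format.
    """
    file_path = PurePosixPath(path)
    prefix_parts = PurePosixPath(dir_prefix).parts
    dir_parts = file_path.parent.parts

    # The file's directory must be at least as deep as the prefix.
    if len(dir_parts) < len(prefix_parts):
        return False

    # Compare the leading directory components against the prefix, one by one.
    for part, pattern in zip(dir_parts, prefix_parts):
        if not fnmatchcase(part, pattern):
            return False

    # Finally, the file name itself must match the filename expression.
    return fnmatchcase(file_path.name, filename_expr)


if __name__ == "__main__":
    print(path_matches("/var/log/nginx/access.log", "/var/log/nginx", "access.log"))
    print(path_matches("/var/log/nginx/site1/error.log", "/var/log/nginx", "*.log"))
    print(path_matches("/var/log/apache/access.log", "/var/log/nginx", "access.log"))
```

The three calls correspond to the kinds of expressions in the table: an exact file name, a suffix wildcard, and a path outside the prefix directory.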
- Only LogListener 2.3.9 and above support adding multiple collection paths.
- The system does not support uploading logs with contents in multiple text formats, which may cause write failures, such as
- You are advised to configure the collection path as log/*.log and rename the old file after log rotation as
- By default, a log file can only be collected by one log topic. If you want to have multiple collection configurations for the same file, please add a soft link to the source file and add it to another collection configuration.
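The soft link mentioned in the last note can be created with standard tools. Below is a minimal Python sketch; the helper name add_soft_link and the paths in the usage comment are hypothetical, and the link location you choose must fall under the collection path of the second configuration.

```python
import os


def add_soft_link(source_file, link_path):
    """Create a soft link at link_path that points to source_file."""
    # Refuse to overwrite an existing file or link at the target location.
    if os.path.lexists(link_path):
        raise FileExistsError(f"{link_path} already exists")
    os.symlink(os.path.abspath(source_file), link_path)
    return link_path


# Example with hypothetical paths (not executed here):
# add_soft_link("/var/log/nginx/access.log", "/data/log-links/access.log")
```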
On the Collection Configuration page, set Extraction Mode to Single-line - Full regular expression and enter a sample log in the Log Sample text box.
Define a regular expression according to the following rules.
The system offers two ways to define a regular expression: manual mode and auto mode. In manual mode, you enter the expression yourself and extract key-value pairs to verify it. To switch to auto mode, click Auto-Generate Regular Expression. In either mode, the system extracts key-value pairs with the regular expression you defined so that the expression can be verified.
Auto mode (click Auto-Generate Regular Expression to switch):
In both auto mode and manual mode, the extraction result is displayed in Extraction Result after the regular expression is defined and verified successfully. You then only need to define the key name of each key-value pair for use in log search and analysis.
Log time is measured in milliseconds.
The time attribute of a log is defined as follows:
Collection time: it is the default time attribute of a log.
Original timestamp: disable Use Collection Time and enter the time key of the original timestamp and the corresponding time parsing format.
For more information on time parsing formats, please see Configuring the Time Format.
Collection time: the time attribute of a log is determined by the time when CLS collects the log.
Original timestamp: the time attribute of a log is determined by the timestamp in the raw log.
Below are examples of how to enter a time parsing format:
The log time is measured in milliseconds. If the log time is entered in an incorrect format, the collection time is used as the log time.
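The fallback rule above can be sketched in Python as follows. The function name log_time_ms is hypothetical, and the format string uses Python strptime directives, which are an assumption here and may differ from the directives CLS accepts (see Configuring the Time Format). The timestamp value is taken from the sample log in this document.

```python
from datetime import datetime, timezone


def log_time_ms(raw_value, time_format, collection_time_ms):
    """Return the log time in milliseconds since the Unix epoch.

    If raw_value cannot be parsed with time_format, fall back to the
    collection time, as described in the text above.
    """
    try:
        parsed = datetime.strptime(raw_value, time_format)
    except ValueError:
        return collection_time_ms

    # Assumption for this sketch: a timestamp without a time zone is treated as UTC.
    if parsed.tzinfo is None:
        parsed = parsed.replace(tzinfo=timezone.utc)
    return int(parsed.timestamp() * 1000)


if __name__ == "__main__":
    collected_at = 1_600_000_000_000  # hypothetical collection time in ms
    # Timestamp from the sample log; %b expects English month abbreviations.
    print(log_time_ms("22/Jan/2019:19:19:30 +0800", "%d/%b/%Y:%H:%M:%S %z", collected_at))
    # A format that does not fit the value falls back to the collection time.
    print(log_time_ms("22/Jan/2019:19:19:30 +0800", "%Y-%m-%d %H:%M:%S", collected_at))
```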
Filters help you extract valuable log data by letting you add log collection filter rules based on your business needs. A filter rule is a Perl regular expression that is used for matching; in other words, only logs that match the regular expression are collected and reported.
To collect logs in full regular expression mode, you need to configure a filter rule according to the defined custom key-value pair. For example, if you want to collect all log data with a status field whose value is 400 or 500 after the sample log is parsed in full regular expression mode, you need to configure the key as status and the filter rule as
The relationship between multiple filter rules is logic "AND". If multiple filter rules are configured for the same key name, previous rules will be overwritten.
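The filter semantics described above (rules combined with logical AND, and a later rule for the same key replacing an earlier one) can be sketched in Python. The function names are hypothetical, the pattern 400|500 is an assumed rule for the status example because the rule text is cut off in this document, and whether CLS requires a full or partial match is not stated here; this sketch assumes a full match. Python's re module stands in for Perl regular expressions.

```python
import re


def build_filters(rules):
    """Turn (key, pattern) pairs into a dict of compiled filter rules.

    A later rule for the same key replaces the earlier one.
    """
    filters = {}
    for key, pattern in rules:
        filters[key] = re.compile(pattern)
    return filters


def should_collect(fields, filters):
    """Return True only if every filter rule matches its field (logical AND)."""
    return all(
        key in fields and regex.fullmatch(fields[key]) is not None
        for key, regex in filters.items()
    )


if __name__ == "__main__":
    # Assumed rule: keep logs whose status is 400 or 500.
    filters = build_filters([("status", "400|500")])
    print(should_collect({"status": "400", "request_method": "GET"}, filters))
    print(should_collect({"status": "200", "request_method": "GET"}, filters))
```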
Index configuration must be enabled before you can perform searches.