Collecting Full RegEx Logs
Last updated: 2020-02-24 14:58:58
Full RegEx mode is a log parsing mode in which multiple key-value pairs are extracted from a complete log using a regular expression. If you do not need to extract key-value pairs, configure collection as described in Full text in a single line collection or Full text in multi lines collection. To configure Full RegEx mode, first enter a log sample and then customize the regular expression. The system extracts the corresponding key-value pairs according to the capture groups in the regular expression. The following describes in detail how to collect logs in Full RegEx format.
Suppose the raw data of one of your logs is:
10.135.46.111 - - [22/Jan/2019:19:19:30 +0800] "GET /my/course/1 HTTP/1.1" 127.0.0.1 200 782 9703 "http://127.0.0.1/course/explore?filter%5Btype%5D=all&filter%5Bprice%5D=all&filter%5BcurrentLevelId%5D=all&orderBy=studentNum" "Mozilla/5.0 (Windows NT 10.0; WOW64; rv:64.0) Gecko/20100101 Firefox/64.0" 0.354 0.354
With a custom regular expression configured for this log, the log service extracts the corresponding key-value pairs according to the () capture groups. You can customize the key name of each group, as shown below.
body_bytes_sent: 9703
http_host: 127.0.0.1
http_protocol: HTTP/1.1
http_referer: http://127.0.0.1/course/explore?filter%5Btype%5D=all&filter%5Bprice%5D=all&filter%5BcurrentLevelId%5D=all&orderBy=studentNum
User-Agent: Mozilla/5.0 (Windows NT 10.0; WOW64; rv:64.0) Gecko/20100101 Firefox/64.0
remote_addr: 10.135.46.111
request_length: 782
request_method: GET
request_time: 0.354
request_url: /my/course/1
status: 200
time_local: [22/Jan/2019:19:19:30 +0800]
upstream_response_time: 0.354
At present, Full RegEx mode only supports key-value extraction from single-line logs; multi-line logs are not supported. If you need to collect multi-line logs, see the Full text in multi lines collection document.
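The capture-group extraction described above can be sketched in Python. The regular expression below is an illustrative assumption written for this sample log, not the expression used by the log service, and the key order in `KEY_NAMES` is chosen to match its capture groups. Python's `re` module stands in for the service's regex engine.

```python
import re

# The sample log from this document, split across string literals for readability.
SAMPLE_LOG = (
    '10.135.46.111 - - [22/Jan/2019:19:19:30 +0800] '
    '"GET /my/course/1 HTTP/1.1" 127.0.0.1 200 782 9703 '
    '"http://127.0.0.1/course/explore?filter%5Btype%5D=all&filter%5Bprice%5D=all'
    '&filter%5BcurrentLevelId%5D=all&orderBy=studentNum" '
    '"Mozilla/5.0 (Windows NT 10.0; WOW64; rv:64.0) Gecko/20100101 Firefox/64.0" '
    '0.354 0.354'
)

# Hypothetical expression: one () capture group per key to extract.
PATTERN = re.compile(
    r'(\S+) - - (\[[^\]]+\]) "(\w+) (\S+) ([^"]+)" (\S+) (\d+) (\d+) (\d+) '
    r'"([^"]*)" "([^"]*)" (\S+) (\S+)'
)

# Key names assigned to the capture groups, in group order.
KEY_NAMES = [
    "remote_addr", "time_local", "request_method", "request_url",
    "http_protocol", "http_host", "status", "request_length",
    "body_bytes_sent", "http_referer", "User-Agent",
    "request_time", "upstream_response_time",
]


def extract(line):
    """Return a dict of key-value pairs, or None if the line does not match."""
    match = PATTERN.fullmatch(line)
    if match is None:
        return None
    return dict(zip(KEY_NAMES, match.groups()))


if __name__ == "__main__":
    for key, value in sorted(extract(SAMPLE_LOG).items()):
        print(f"{key}: {value}")
```

Every extracted value is a string; any numeric interpretation (for example of `status`) would happen later, at query time.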
1. Log in to the console
Log in to the CLS Console. Click Logset Management in the left sidebar to enter the logset management page.
2. Add Log Topic
Select the target logset, click **Add Log Topic**, enter the log topic name test-whole, and click **OK** to add the log topic.
3. Enable LogListener
Click the log topic to be collected by LogListener, click **Edit** in the upper right corner of the collection configuration page to enter editing mode, and enable **Collection status** and **Use LogListener**.
4. Configure log file collection path
The log collection path has the format [directory prefix expression]/**/[file name expression]. LogListener matches all common prefix paths that satisfy the [directory prefix expression], and listens for all log files under those directories (including subdirectories) whose names satisfy the [file name expression]. The parameters are described as follows:
|Directory prefix||Directory structure of the log file prefix. Only the wildcards * and ? are supported: * matches multiple arbitrary characters, and ? matches a single arbitrary character.|
|/**/||The current directory and all of its subdirectories.|
|File name||Log file name. Only the wildcards * and ? are supported: * matches multiple arbitrary characters, and ? matches a single arbitrary character.|
Common configuration patterns for reference:
[common directory prefix]/**/[common file name prefix]*
[common directory prefix]/**/*[common file name suffix]
[common directory prefix]/**/[common file name prefix]*[common file name suffix]
[common directory prefix]/**/*[common string]*
|No.||Directory prefix expression||File name expression||Description|
|1.||/var/log/nginx||access.log||In this example, the log path is configured to
|2.||/var/log/nginx||*.log||In this example, the log path is configured to
|3.||/var/log/nginx||error*||In this example, the log path is configured to
- Multi-level directory and wildcard configuration requires LogListener 2.2.2 or later. For compatibility with path configurations made under earlier LogListener versions, users can switch to the old configuration mode to modify historical configurations. The old collection path mode does not support multi-directory collection.
- A log file can only be collected by one log topic.
- LogListener does not support listening to log files through symbolic links (soft links), or to log files in shared file directories such as NFS and CIFS.
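The path rule above (a directory prefix, any depth of subdirectories, then a file name expression) can be approximated in a few lines of Python. This is a minimal sketch, not LogListener's matching code. It assumes that wildcards are applied per path component, so that * does not cross a `/`. The function name `is_collected` is hypothetical.

```python
from fnmatch import fnmatchcase
from pathlib import PurePosixPath


def is_collected(path, dir_prefix, file_pattern):
    """Check a file path against [dir_prefix]/**/[file_pattern].

    Note: fnmatchcase also accepts [seq] bracket classes, which go beyond
    the * and ? wildcards listed in this document.
    """
    candidate = PurePosixPath(path)

    # The file name itself must satisfy the file name expression.
    if not fnmatchcase(candidate.name, file_pattern):
        return False

    # The leading directory components must satisfy the directory prefix;
    # any further components are subdirectories covered by /**/.
    prefix_parts = PurePosixPath(dir_prefix).parts
    parent_parts = candidate.parent.parts
    if len(parent_parts) < len(prefix_parts):
        return False
    return all(
        fnmatchcase(part, pattern)
        for part, pattern in zip(parent_parts, prefix_parts)
    )


if __name__ == "__main__":
    for path in ("/var/log/nginx/access.log",
                 "/var/log/nginx/site1/access.log",
                 "/var/log/apache/access.log"):
        print(path, is_collected(path, "/var/log/nginx", "*.log"))
```

With the directory prefix `/var/log/nginx`, files directly in that directory and in any of its subdirectories are candidates, which mirrors the description of `/**/` as the current directory plus all subdirectories.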
5. Associate server group
Select the target server group from the server group list and associate it with the current log topic. The associated server group must be in the same region as the log topic. For more information, see How to create Server group.
6. Configure fully regular mode
For the key extraction mode, select Full RegEx, as shown in the following figure:
6.1 Define regular expressions
In Full RegEx mode, the system extracts key-value pairs according to the defined regular expression. Click "Automatically generate" to have the system produce the regular extraction pattern. If the pattern cannot be generated automatically, switch to manual mode to enter and verify the expression yourself.
- Automatic mode
a. Enter a sample log.
b. Select part of the log content according to your retrieval and analysis needs, and click "Grouping". A successfully grouped part is highlighted, indicating that it will be extracted as a key-value pair.
c. Repeat step b until all the key-value pairs to be extracted are grouped.
d. Click "Automatically generate", and the system generates the regular extraction pattern from the groupings.
- Manual mode
a. Enter a sample log.
b. Enter the regular extraction expression manually.
c. Click "verify", and the system will determine whether the log sample matches the regular expression.
Whether in automatic mode or manual mode, once the regular extraction pattern is defined and verified, the extraction results will be displayed below. You need to define a key name for each set of key-value pairs, which will be used for log retrieval and analysis.
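The manual-mode check (does the sample match, and does every capture group have a key name) can be rehearsed locally before entering an expression in the console. The sketch below is an assumption about what such a check involves, not the console's implementation. The function `verify` is hypothetical, and it assumes the expression must match the whole sample.

```python
import re


def verify(sample, expression, key_names):
    """Return (True, extracted_dict) on success or (False, reason) on failure."""
    # An expression that does not compile can never match.
    try:
        pattern = re.compile(expression)
    except re.error as exc:
        return False, f"invalid regular expression: {exc}"

    # Assumption: the expression must match the entire sample log.
    match = pattern.fullmatch(sample)
    if match is None:
        return False, "sample log does not match the expression"

    # Each capture group needs exactly one key name.
    if pattern.groups != len(key_names):
        return False, (
            f"{pattern.groups} capture groups but {len(key_names)} key names"
        )
    return True, dict(zip(key_names, match.groups()))


if __name__ == "__main__":
    print(verify(
        "GET /my/course/1 HTTP/1.1",
        r"(\w+) (\S+) (\S+)",
        ["request_method", "request_url", "http_protocol"],
    ))
```

Returning a reason string on failure makes it easier to tell a syntax error in the expression apart from a sample that simply does not match.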
7. Configure collection time
Log time is measured in seconds.
The time attribute of a log is defined in two ways: collection time and original timestamp.
Collection time: The time attribute of a log is determined by the time when CLS collects the log.
Original timestamp: The time attribute of a log is determined by the timestamp in the original log.
7.1 Use the collection time as the time attribute of logs
You can keep the collection time on, as shown in the following figure:
7.2 Use the original timestamp as the time attribute of logs
Turn off the collection time switch, then enter the time key of the original timestamp and the corresponding time parsing format in the time key and time format fields. For details of the time parsing format, see Configure time format.
The following examples illustrate how to enter the time format parsing rule:
Example 1: Original timestamp: 10/Dec/2017:08:00:00, Parsing format:
Example 2: Original timestamp: 2017-12-10 08:00:00, Parsing format:
Example 3: Original timestamp: 12/10/2017, 08:00:00, Parsing format:
Log time is accurate to the second. If the time format is entered incorrectly, the collection time is used as the log time.
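The fallback rule above can be illustrated with Python's `datetime.strptime`. The format strings in this sketch are strftime-style directives chosen to fit the three example timestamps. They are an assumption, so check the Configure time format document for the directives CLS actually accepts. The sketch also assumes timestamps are interpreted as UTC, purely to keep the result deterministic. The function `log_time` is hypothetical.

```python
from datetime import datetime, timezone

# Assumed strftime-style formats for the three example timestamps.
EXAMPLES = [
    ("10/Dec/2017:08:00:00", "%d/%b/%Y:%H:%M:%S"),
    ("2017-12-10 08:00:00", "%Y-%m-%d %H:%M:%S"),
    ("12/10/2017, 08:00:00", "%m/%d/%Y, %H:%M:%S"),
]


def log_time(raw_timestamp, time_format, collection_time):
    """Return the log time in whole seconds since the epoch.

    Falls back to collection_time when the timestamp does not fit the format.
    Note: %b depends on the process locale; English month names are assumed.
    """
    try:
        parsed = datetime.strptime(raw_timestamp, time_format)
    except ValueError:
        return collection_time
    # Assumption: treat the parsed time as UTC.
    return int(parsed.replace(tzinfo=timezone.utc).timestamp())


if __name__ == "__main__":
    collected_at = 1700000000  # hypothetical collection time
    for raw, fmt in EXAMPLES:
        print(raw, "->", log_time(raw, fmt, collected_at))
    # A mismatched format falls back to the collection time.
    print(log_time("2017-12-10 08:00:00", "%d/%b/%Y:%H:%M:%S", collected_at))
```

All three example timestamps describe the same instant, so a correct format for each should yield the same number of seconds.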
8. Set filter condition
The filter lets you add log collection filtering rules according to your business needs, so that only valuable log data is collected. A filtering rule is a Perl regular expression, and each rule is a hit rule: only logs that match the regular expression are collected and reported.
When collecting in Full RegEx mode, configure filtering rules against the custom key-value pairs. For example, if you want to collect only the log data whose status field is 400 or 500 after the sample log is parsed in Full RegEx mode, enter status as the key and 400|500 as the filtering rule.
Multiple filtering rules are combined with "and" logic. If multiple filtering rules are configured for the same key name, the later rule overwrites the earlier one.
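The hit-rule and "and" semantics described above can be sketched as follows. Python's `re` module stands in for the Perl regular expressions used by the service, and two points are assumptions rather than documented behavior: a rule must match the whole field value, and a log that lacks a filtered key is not collected. Storing the rules in a dict keyed by key name also models the overwrite behavior, since a key can hold only one rule.

```python
import re


def should_collect(fields, filter_rules):
    """Return True only if every filtering rule hits its field ("and" logic)."""
    for key, rule in filter_rules.items():
        value = fields.get(key)
        # Assumption: a missing key, or a value the rule does not fully
        # match, means the log is filtered out.
        if value is None or re.fullmatch(rule, value) is None:
            return False
    return True


if __name__ == "__main__":
    rules = {"status": "400|500"}
    print(should_collect({"status": "400", "request_method": "GET"}, rules))
    print(should_collect({"status": "200", "request_method": "GET"}, rules))
```

Adding a second rule, such as one on `request_method`, narrows the result further because both rules must hit for a log to be reported.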
9. Search logs
Log in to the CLS Console, click "Log Retrieval" in the left sidebar, select the logset and log topic, and click "Search" to retrieve logs according to the query conditions you set.
Index configuration must be enabled for retrieval; otherwise no logs can be retrieved.