Collecting Full RegEx Logs

Last updated: 2020-02-24 14:58:58


Overview

Fully regular mode (Full RegEx) refers to the log parsing mode in which multiple key-value pairs are extracted from a complete log entry using a regular expression. If you do not need to extract key-value pairs, see Full text in a single line collection or Full text in multi lines collection instead. To configure fully regular mode, enter a log sample first and then define a regular expression; the system extracts the corresponding key-value pairs according to the capture groups in the regular expression. The following describes in detail how to collect logs in the fully regular format.

Samples

Suppose the raw data of one of your logs is:

10.135.46.111 - - [22/Jan/2019:19:19:30 +0800] "GET /my/course/1 HTTP/1.1" 127.0.0.1 200 782 9703 "http://127.0.0.1/course/explore?filter%5Btype%5D=all&filter%5Bprice%5D=all&filter%5BcurrentLevelId%5D=all&orderBy=studentNum" "Mozilla/5.0 (Windows NT 10.0; WOW64; rv:64.0) Gecko/20100101 Firefox/64.0"  0.354 0.354

The custom regular expression configured is:

(\S+)[^\[]+(\[[^:]+:\d+:\d+:\d+\s\S+)\s"(\w+)\s(\S+)\s([^"]+)"\s(\S+)\s(\d+)\s(\d+)\s(\d+)\s"([^"]+)"\s"([^"]+)"\s+(\S+)\s(\S+).*

The log service then extracts the corresponding key-value pairs based on the () capture groups. You can customize the key name of each group, as shown below:

body_bytes_sent: 9703
http_host: 127.0.0.1
http_protocol: HTTP/1.1
http_referer: http://127.0.0.1/course/explore?filter%5Btype%5D=all&filter%5Bprice%5D=all&filter%5BcurrentLevelId%5D=all&orderBy=studentNum
User-Agent: Mozilla/5.0 (Windows NT 10.0; WOW64; rv:64.0) Gecko/20100101 Firefox/64.0
remote_addr: 10.135.46.111
request_length: 782
request_method: GET
request_time: 0.354
request_url: /my/course/1
status: 200
time_local: [22/Jan/2019:19:19:30 +0800]
upstream_response_time: 0.354
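The extraction above can be reproduced locally with Python's `re` module. This is only a sketch for verifying the capture groups against the sample log; the key names are the custom names chosen in this example, not anything mandated by CLS:

```python
import re

# Sample raw log line from the example above
log = ('10.135.46.111 - - [22/Jan/2019:19:19:30 +0800] "GET /my/course/1 HTTP/1.1" '
       '127.0.0.1 200 782 9703 '
       '"http://127.0.0.1/course/explore?filter%5Btype%5D=all&filter%5Bprice%5D=all'
       '&filter%5BcurrentLevelId%5D=all&orderBy=studentNum" '
       '"Mozilla/5.0 (Windows NT 10.0; WOW64; rv:64.0) Gecko/20100101 Firefox/64.0"  0.354 0.354')

# The same fully regular expression configured in the console
pattern = (r'(\S+)[^\[]+(\[[^:]+:\d+:\d+:\d+\s\S+)\s"(\w+)\s(\S+)\s([^"]+)"\s'
           r'(\S+)\s(\d+)\s(\d+)\s(\d+)\s"([^"]+)"\s"([^"]+)"\s+(\S+)\s(\S+).*')

# Custom key names, one per capture group, in order of appearance
keys = ['remote_addr', 'time_local', 'request_method', 'request_url',
        'http_protocol', 'http_host', 'status', 'request_length',
        'body_bytes_sent', 'http_referer', 'User-Agent',
        'request_time', 'upstream_response_time']

match = re.match(pattern, log)
fields = dict(zip(keys, match.groups()))
print(fields['status'])       # 200
print(fields['remote_addr'])  # 10.135.46.111
```

Each of the 13 capture groups maps to one key-value pair in the extraction result shown above.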

At present, fully regular mode supports key-value extraction only from single-line logs; multi-line log extraction is not supported. If you need to collect multi-line logs, see Full text in multi lines collection.

Collection Configuration

1. Log in to the console

Log in to the CLS console and click **Logset Management** in the left sidebar to enter the logset management page.

2. Add Log Topic

Select the target logset, click **Add Log Topic**, enter the log topic name test-whole, and click **OK** to add the log topic.

3. Enable LogListener

Click the topic of the log to be collected by LogListener, click **Edit** in the upper-right corner of the collection configuration page to enter editing mode, and enable **Collection status** and **Use LogListener**.

4. Configure log file collection path

The format of the log collection path is [directory prefix expression]/**/[file name expression]. LogListener matches all common prefix paths that conform to the [directory prefix expression] rule, and listens for log files matching the [file name expression] rule in those directories and all their subdirectories. The parameters are described as follows:

| Field | Description |
| --- | --- |
| Directory prefix | Directory structure of the log file prefix. Only the wildcards * and ? are supported: * matches any number of characters, and ? matches a single character. |
| /**/ | The current directory and all of its subdirectories. |
| File name | Log file name. Only the wildcards * and ? are supported: * matches any number of characters, and ? matches a single character. |

Reference for common configuration modes:

[Public directory prefix]/**/[Public file name prefix]*
[Public directory prefix]/**/*[Public file name suffix]
[Public directory prefix]/**/[Public file name prefix]*[Public file name suffix]
[Public directory prefix]/**/*[Public string]*

Configuration examples:

| No. | Directory prefix expression | File name expression | Description |
| --- | --- | --- | --- |
| 1 | /var/log/nginx | access.log | The log path is configured as /var/log/nginx/**/access.log. LogListener listens for log files named access.log in all subdirectories under the prefix path /var/log/nginx. |
| 2 | /var/log/nginx | *.log | The log path is configured as /var/log/nginx/**/*.log. LogListener listens for log files ending in .log in all subdirectories under the prefix path /var/log/nginx. |
| 3 | /var/log/nginx | error* | The log path is configured as /var/log/nginx/**/error*. LogListener listens for log files beginning with error in all subdirectories under the prefix path /var/log/nginx. |
  1. Multi-level directory and wildcard configuration requires LogListener 2.2.2 or later. To stay compatible with path configurations from earlier LogListener versions, you can switch historical configurations to the old style; the old collection path style does not support multi-directory collection.
  2. A log file can be collected by only one log topic.
  3. LogListener does not support listening to log files via symbolic (soft) links, or to log files on network shares such as NFS and CIFS.
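As an analogy, Python's `pathlib` glob uses the same `**` semantics described above ("the current directory and all of its subdirectories"). The following sketch builds a throwaway directory tree to illustrate which files a path like /var/log/nginx/**/access.log would cover; it is only an illustration, not how LogListener itself is implemented:

```python
import pathlib
import tempfile

# Hypothetical directory layout standing in for /var/log/nginx
root = pathlib.Path(tempfile.mkdtemp())
(root / 'access.log').write_text('')             # directly under the prefix
(root / 'site-a').mkdir()
(root / 'site-a' / 'access.log').write_text('')  # one level down
(root / 'site-a' / 'error.log').write_text('')   # different name: not matched

# '**' matches the current directory and all subdirectories, recursively,
# mirroring the /**/ segment in the collection path
matched = sorted(p.relative_to(root).as_posix() for p in root.glob('**/access.log'))
print(matched)  # ['access.log', 'site-a/access.log']
```

Note that both the top-level access.log and the one in the subdirectory are matched, while error.log is not.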

5. Associate server group

Select the target server group from the server group list and associate it with the current log topic. The associated server group must be in the same region as the log topic. For more information, see How to create a server group.

6. Configure fully regular mode

For the key extraction mode, select **Full RegEx**, as shown in the following figure:

6.1 Define the regular expression

In fully regular mode, the system extracts key-value pairs according to the defined regular expression. You can click **Auto-Generate** to generate the regular extraction pattern automatically; if it cannot be generated automatically, switch to manual mode and validate the regular expression yourself.

  • Automatic mode
    a. Enter a sample log.
    b. Select the part of the log content needed for search and analysis, and click **Group**. After a successful grouping, the selection is highlighted, indicating that this part will be extracted as a key-value group.
    c. Repeat step b until all key-value pairs to be extracted are grouped.
    d. Click **Auto-Generate**, and the system turns the log groups into a regular extraction pattern.

  • Manual mode
    a. Enter a sample log.
    b. Enter the regular extraction expression manually.
    c. Click **Verify**, and the system determines whether the log sample matches the regular expression.

In either automatic or manual mode, once the regular extraction pattern is defined and verified, the extraction results are displayed below. You need to define a key name for each extracted group; these key names are used for log search and analysis.

7. Configure collection time

Log time is measured in seconds.
The time attribute of a log can be defined in two ways: collection time and original timestamp.
Collection time: the time attribute of a log is determined by the time at which CLS collects the log.
Original timestamp: the time attribute of a log is determined by the timestamp in the original log.

7.1 Use the collection time as the time attribute of logs

Keep **Collection time** enabled, as shown in the following figure:

7.2 Use the original timestamp as the time attribute of logs

Disable **Collection time**, then enter the time key of the original timestamp and the corresponding time parsing format under **Time key** and **Time format parsing**. For details on the time parsing format, see Configure time format.

The following examples illustrate how to enter the time parsing format:
Example 1: Original timestamp: 10/Dec/2017:08:00:00 , Parsing format: %d/%b/%Y:%H:%M:%S
Example 2: Original timestamp: 2017-12-10 08:00:00 , Parsing format: %Y-%m-%d %H:%M:%S
Example 3: Original timestamp: 12/10/2017, 08:00:00 , Parsing format: %m/%d/%Y, %H:%M:%S

Log time is accurate to the second. If the time format is configured incorrectly, the collection time is used as the log time.
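The %-directives in the three examples above follow the C strptime convention, so they can be checked quickly with Python's `datetime.strptime` (a sketch assuming the same directive semantics, under the default C locale for month names):

```python
from datetime import datetime

# The three example timestamp/format pairs from above
examples = [
    ('10/Dec/2017:08:00:00', '%d/%b/%Y:%H:%M:%S'),
    ('2017-12-10 08:00:00', '%Y-%m-%d %H:%M:%S'),
    ('12/10/2017, 08:00:00', '%m/%d/%Y, %H:%M:%S'),
]

# Parse each raw timestamp with its declared format
parsed = [datetime.strptime(raw, fmt) for raw, fmt in examples]
for ts in parsed:
    print(ts.isoformat())  # each parses to 2017-12-10T08:00:00
```

All three pairs resolve to the same point in time, which is exactly what a correct time key and parsing format configuration should achieve.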

8. Set filter condition

The filter is designed to add log collection filtering rules based on your business needs, helping you keep only valuable log data. A filtering rule is a Perl regular expression and works as a hit rule: only logs that match the regular expression are collected and reported.

When collecting logs in fully regular mode, configure the filtering rules based on the custom key-value pairs. For example, after the sample log is parsed in fully regular mode, if you want to collect only log data whose status field is 400 or 500, set the key to status and the filtering rule to 400|500.

Multiple filtering rules are combined with AND logic. If multiple filtering rules are configured for the same key name, the rules overwrite one another.

9. Search for logs

Log in to the CLS console. In the left sidebar, click **Log Retrieval**, select the logset and log topic, and click **Search** to retrieve logs according to the configured query conditions.

Index configuration must be enabled for retrieval; otherwise, logs cannot be retrieved.