Combined Parsing Format

Last updated: 2022-03-08 13:19:20

    Overview

    If your log structure is too complex for a single parsing mode (such as the NGINX, full regex, or JSON mode) to meet your parsing requirements, you can use LogListener's combined parsing mode. In the console, you enter JSON code that defines the pipeline logic for log parsing: you add one or more LogListener plugins to the processing configuration, and the plugins are executed in the order in which they appear.

    Prerequisites

    Assume that the raw data of a log is as follows:

    1571394459,http://127.0.0.1/my/course/4|10.135.46.111|200,status:DEAD,
    

    The content of a custom plugin is as follows:

    {
      "processors": [
        {
          "type": "processor_split_delimiter",
          "detail": {
            "Delimiter": ",",
            "ExtractKeys": ["time", "msg1", "msg2"]
          },
          "processors": [
            {
              "type": "processor_timeformat",
              "detail": {
                "KeepSource": true,
                "TimeFormat": "%s",
                "SourceKey": "time"
              }
            },
            {
              "type": "processor_split_delimiter",
              "detail": {
                "KeepSource": false,
                "Delimiter": "|",
                "SourceKey": "msg1",
                "ExtractKeys": ["submsg1", "submsg2", "submsg3"]
              },
              "processors": []
            },
            {
              "type": "processor_split_key_value",
              "detail": {
                "KeepSource": false,
                "Delimiter": ":",
                "SourceKey": "msg2"
              }
            }
          ]
        }
      ]
    }
    

    After being structured by CLS, the log is changed to the following:

    time: 1571394459
    submsg1: http://127.0.0.1/my/course/4
    submsg2: 10.135.46.111
    submsg3: 200
    status: DEAD
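    As a sketch of what this pipeline does, the nested delimiter and key-value splits can be simulated in a few lines of Python. The function names and the field-dictionary model below are illustrative assumptions, not the actual LogListener implementation:

```python
def split_delimiter(fields, source_key, delimiter, extract_keys, keep_source=True):
    """Mimics processor_split_delimiter: split one field into named subfields."""
    parts = fields[source_key].split(delimiter)
    if not keep_source:
        del fields[source_key]
    fields.update(dict(zip(extract_keys, parts)))  # extra parts are dropped
    return fields

def split_key_value(fields, source_key, delimiter, keep_source=True):
    """Mimics processor_split_key_value: turn "key<delim>value" into a field."""
    key, _, value = fields[source_key].partition(delimiter)
    if not keep_source:
        del fields[source_key]
    fields[key] = value
    return fields

raw = "1571394459,http://127.0.0.1/my/course/4|10.135.46.111|200,status:DEAD,"

# Apply the three processors in the order given in the sample configuration.
fields = {"__raw__": raw}
fields = split_delimiter(fields, "__raw__", ",", ["time", "msg1", "msg2"], keep_source=False)
fields = split_delimiter(fields, "msg1", "|", ["submsg1", "submsg2", "submsg3"], keep_source=False)
fields = split_key_value(fields, "msg2", ":", keep_source=False)

print(fields)
# {'time': '1571394459', 'submsg1': 'http://127.0.0.1/my/course/4',
#  'submsg2': '10.135.46.111', 'submsg3': '200', 'status': 'DEAD'}
```

    The output matches the structured log shown above; `processor_timeformat` with `TimeFormat: "%s"` would additionally interpret the `time` value as epoch seconds and set it as the log time.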
    

    Configuration Instructions

    Custom plugin types

    | Plugin Feature | Plugin Name | Feature Description |
    |---|---|---|
    | Field extraction | processor_log_string | Performs multi-character (line break) parsing of fields, typically for single-line logs. |
    | Field extraction | processor_multiline | Performs first-line regex parsing of fields, typically for multi-line logs. |
    | Field extraction | processor_multiline_fullregex | Performs first-line regex parsing of fields and then extracts fields by regex (full regex mode), typically for multi-line logs. |
    | Field extraction | processor_fullregex | Extracts fields (full regex mode) from single-line logs. |
    | Field extraction | processor_json | Expands field values in JSON format. |
    | Field extraction | processor_split_delimiter | Extracts fields (single-/multi-character separator mode). |
    | Field extraction | processor_split_key_value | Extracts fields (key-value pair mode). |
    | Field processing | processor_drop | Discards fields. |
    | Field processing | processor_timeformat | Parses time fields in raw logs to convert their format and sets the result as the log time. |

    Custom plugin parameters

    | Plugin Name | Supports Subitem Parsing | Parameter | Required | Description |
    |---|---|---|---|---|
    | processor_multiline | No | BeginRegex | Yes | Defines the first-line matching regex for multi-line logs. |
    | processor_multiline_fullregex | Yes | BeginRegex | Yes | Defines the first-line matching regex for multi-line logs. |
    | processor_multiline_fullregex | Yes | ExtractRegex | Yes | Defines the extraction regex applied after multi-line logs are extracted. |
    | processor_multiline_fullregex | Yes | ExtractKeys | Yes | Defines the extraction keys. |
    | processor_fullregex | Yes | ExtractRegex | Yes | Defines the extraction regex. |
    | processor_fullregex | Yes | ExtractKeys | Yes | Defines the extraction keys. |
    | processor_json | Yes | SourceKey | No | Defines the name of the upper-level processor key processed by the current processor. |
    | processor_json | Yes | KeepSource | No | Defines whether to retain `SourceKey` in the final key name. |
    | processor_split_delimiter | Yes | SourceKey | No | Defines the name of the upper-level processor key processed by the current processor. |
    | processor_split_delimiter | Yes | KeepSource | No | Defines whether to retain `SourceKey` in the final key name. |
    | processor_split_delimiter | Yes | Delimiter | Yes | Defines the separator (one or more characters). |
    | processor_split_delimiter | Yes | ExtractKeys | Yes | Defines the extraction keys after separator splitting. |
    | processor_split_key_value | No | SourceKey | No | Defines the name of the upper-level processor key processed by the current processor. |
    | processor_split_key_value | No | KeepSource | No | Defines whether to retain `SourceKey` in the final key name. |
    | processor_split_key_value | No | Delimiter | Yes | Defines the separator between the `Key` and `Value` in a string. |
    | processor_drop | No | SourceKey | Yes | Defines the name of the upper-level processor key processed by the current processor. |
    | processor_timeformat | No | SourceKey | Yes | Defines the name of the upper-level processor key processed by the current processor. |
    | processor_timeformat | No | TimeFormat | Yes | Defines the time parsing format for the `SourceKey` value (time string in logs). |
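    The required parameters above lend themselves to a mechanical pre-flight check. Below is a hypothetical Python sketch that validates a combined parsing configuration against this table before it is pasted into the console; the REQUIRED map is transcribed from the table, but the validator itself is not part of CLS:

```python
# Required "detail" keys per plugin type, transcribed from the table above.
REQUIRED = {
    "processor_multiline": {"BeginRegex"},
    "processor_multiline_fullregex": {"BeginRegex", "ExtractRegex", "ExtractKeys"},
    "processor_fullregex": {"ExtractRegex", "ExtractKeys"},
    "processor_json": set(),
    "processor_split_delimiter": {"Delimiter", "ExtractKeys"},
    "processor_split_key_value": {"Delimiter"},
    "processor_drop": {"SourceKey"},
    "processor_timeformat": {"SourceKey", "TimeFormat"},
}

def validate(processors):
    """Recursively check that every processor carries its required detail keys."""
    errors = []
    for proc in processors:
        ptype = proc.get("type")
        missing = REQUIRED.get(ptype, set()) - set(proc.get("detail", {}))
        if missing:
            errors.append(f"{ptype}: missing {sorted(missing)}")
        errors += validate(proc.get("processors", []))  # descend into subitems
    return errors

# A deliberately incomplete configuration: ExtractKeys is absent.
config = {"processors": [{"type": "processor_split_delimiter",
                          "detail": {"Delimiter": ","}}]}
print(validate(config["processors"]))
# ["processor_split_delimiter: missing ['ExtractKeys']"]
```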

    Directions

    Logging in to the console

    1. Log in to the CLS console.
    2. On the left sidebar, click Log Topic to go to the log topic management page.

    Creating a log topic

    1. Click Create Log Topic.
    2. In the pop-up dialog box, enter define-log as Log Topic Name and click Confirm.

    Managing the machine group

    1. After the log topic is created successfully, click its name to go to the log topic management page.
    2. Click the Collection Configuration tab, click Add in the LogListener Collection Configuration area, and select the format in which you need to collect logs.
    3. On the Machine Group Management page, select the machine group to which to bind the current log topic and click Next to proceed to collection configuration.
      For more information, please see Machine Group Management.

    Configuring collection

    Configuring the log file collection path

    On the Collection Configuration page, set Collection Path according to the log collection path format as shown below:
    Log collection path format: [directory prefix expression]/**/[filename expression].

    After the log collection path is entered, LogListener will match all common prefix paths that meet the [directory prefix expression] rule and listen for all log files in the directories (including subdirectories) that meet the [filename expression] rule. The parameters are as detailed below:

    | Parameter | Description |
    |---|---|
    | Directory Prefix | Directory prefix for log files, which supports only the wildcard characters `*` and `?`. `*` matches any number of characters; `?` matches exactly one character. |
    | /**/ | Current directory and all its subdirectories. |
    | File Name | Log file name, which supports only the wildcard characters `*` and `?`. `*` matches any number of characters; `?` matches exactly one character. |

    Common configuration modes are as follows:

    • [Common directory prefix]/**/[common filename prefix]*
    • [Common directory prefix]/**/*[common filename suffix]
    • [Common directory prefix]/**/[common filename prefix]*[common filename suffix]
    • [Common directory prefix]/**/*[common string]*

    Below are examples:

    | No. | Directory Prefix Expression | Filename Expression | Description |
    |---|---|---|---|
    | 1 | /var/log/nginx | access.log | The log path is configured as /var/log/nginx/**/access.log. LogListener listens for log files named access.log in the /var/log/nginx prefix path and all its subdirectories. |
    | 2 | /var/log/nginx | *.log | The log path is configured as /var/log/nginx/**/*.log. LogListener listens for log files suffixed with .log in the /var/log/nginx prefix path and all its subdirectories. |
    | 3 | /var/log/nginx | error* | The log path is configured as /var/log/nginx/**/error*. LogListener listens for log files prefixed with error in the /var/log/nginx prefix path and all its subdirectories. |
    Note:

    • Only LogListener 2.3.9 and later support adding multiple collection paths.
    • Uploading logs whose content mixes multiple text formats is not supported and may cause write failures, for example key:"{"substream":XXX}".
    • You are advised to configure the collection path as log/*.log and, after log rotation, rename the old file to log/*.log.xxxx.
    • By default, a log file can be collected by only one log topic. To apply multiple collection configurations to the same file, create a soft link to the source file and add the link to the other collection configuration.
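    The matching behavior of [directory prefix expression]/**/[filename expression] can be illustrated with Python's glob module, whose `**` semantics (with recursive=True) are close to the path rule described above. This demonstrates the pattern syntax only, not LogListener's actual matcher, and uses a temporary directory in place of /var/log/nginx:

```python
import glob
import os
import tempfile

# Build a small tree:  <root>/access.log, <root>/sub/error.log, <root>/sub/error.txt
root = tempfile.mkdtemp()
os.makedirs(os.path.join(root, "sub"))
for name in ["access.log", os.path.join("sub", "error.log"), os.path.join("sub", "error.txt")]:
    open(os.path.join(root, name), "w").close()

# "**" matches the directory itself and any depth of subdirectories.
matches = sorted(glob.glob(os.path.join(root, "**", "*.log"), recursive=True))
print([os.path.relpath(m, root) for m in matches])
# e.g. ['access.log', 'sub/error.log'] — the .txt file is not matched
```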

    Configuring the combined parsing mode

    On the Collection Configuration page, select Combined Parsing as the Extraction Mode.

    Configuring the collection policy

    • Full collection: LogListener reads a file from the beginning.
    • Incremental collection: LogListener starts reading 1 MB before the end of the file (for a file smaller than 1 MB, incremental collection is equivalent to full collection).
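    The difference between the two policies amounts to the starting read offset. Below is a minimal sketch; the function and file names are illustrative, and LogListener's internal behavior is only assumed to match the description above:

```python
import os

def start_offset(path, policy, window=1 << 20):
    """Starting byte offset for collection: 0 for full collection;
    1 MB before the end of the file for incremental collection
    (0 for files under 1 MB, where the two policies coincide)."""
    if policy == "full":
        return 0
    return max(0, os.path.getsize(path) - window)

# Illustrative usage with a small file: under 1 MB, incremental == full.
with open("sample.log", "w") as f:
    f.write("hello\n" * 10)
print(start_offset("sample.log", "incremental"))  # 0
```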

    Usage Limits

    • Combined parsing mode consumes more LogListener resources than the other modes. Avoid overly complex plugin combinations.
    • In combined parsing mode, the collection and filter features of the text modes are unavailable; some of them can be reproduced with the corresponding custom plugins.
    • In combined parsing mode, uploading of logs that fail to be parsed is enabled by default. For such logs, the input name is used as the Key and the raw log content as the Value.
    Searching and analyzing logs

    1. Log in to the CLS console.
    2. On the left sidebar, click Search and Analysis to go to the search and analysis page.
    3. Select the region, logset, and log topic as needed, and click Search and Analysis to search for logs according to the configured query rules.