Configuring Log Collection via YAML

Last updated: 2021-07-15 14:18:35

    This document describes how to configure the log collection feature of an EKS cluster by defining a CRD in YAML.

    Prerequisites

    Log in to the EKS console, and enable the log collection feature for the EKS cluster. For more information, see Enabling Log Collection.

    Creating the CRD

    To create a collection configuration, you only need to define the LogConfig CRD. The collection component will modify the corresponding CLS log topics based on changes to the LogConfig CRD and set the bound server group. The CRD format is as follows:

    apiVersion: cls.cloud.tencent.com/v1
    kind: LogConfig                             ## Default value
    metadata:
      name: test                                ## CRD resource name, unique in the cluster
    spec:
      clsDetail:
        topicId: xxxxxx-xx-xx-xx-xxxxxxxx       ## CLS log topic ID. The log topic needs to be created in CLS in advance and should not be occupied by other collection configurations.
        logType: minimalist_log                 ## Log collection format. json_log: json format. delimiter_log: separator-based format. minimalist_log: full text in a single line. multiline_log: full text in multi lines. fullregex_log: single line - full regex. multiline_fullregex_log: multiple lines - full regex.
        extractRule:                            ## Extraction and filtering rule
          ...
      inputDetail:
        type: container_stdout                  ## Log collection type, including container_stdout (container standard output), container_file (container file), and host_file (host file)
    
        containerStdout:                        ## Container standard output
          namespace: default                    ## The Kubernetes namespace of the container to be collected. If this parameter is not specified, it indicates all namespaces.
          allContainers: false                  ## Whether to collect the standard output of all containers in the specified namespace
          container: xxx                        ## Container name in the pod that meets the includeLabels criteria. This parameter is used only when includeLabels is specified.
          includeLabels:                        ## Only Pods that contain the specified labels will be collected.
            k8s-app: xxx                        ## Only the logs generated by Pods whose labels include "k8s-app=xxx" will be collected. This parameter cannot be specified together with workloads or allContainers=true.
          workloads:                            ## Kubernetes workload to which the container Pod to be collected belongs
          - namespace: prod                     ## Workload namespace
            name: sample-app                    ## Workload name
            kind: deployment                    ## Workload type. Supported values include deployment, daemonset, statefulset, job, and cronjob.
            container: xxx                      ## Name of the container to collect. If this parameter is not specified, it indicates all containers in the workload Pod will be collected.
    
        containerFile:                          ## File in the container
          namespace: default                    ## The Kubernetes namespace of the container to be collected
          container: xxx                        ## Name of the container to be collected
          includeLabels:                        ## Only Pods that contain the specified labels will be collected.
            k8s-app: xxx                        ## Only the logs generated by Pods with the configuration of "k8s-app=xxx" in the Pod labels are collected. This parameter cannot be specified at the same time as workload.
          workload:                             ## Kubernetes workload to which the container Pod to be collected belongs
            name: sample-app                    ## Workload name                  
            kind: deployment                    ## Workload type. Supported values include deployment, daemonset, statefulset, job, and cronjob.
          logPath: /opt/logs                    ## Log folder. Wildcards are not supported.
          filePattern: app_*.log                ## Log file name. The wildcards "*" and "?" are supported: "*" matches any number of characters, and "?" matches any single character.
          customLabels:
            k1: v1
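
    Putting the fields above together, a complete manifest might look like the following. This is a minimal sketch built only from the fields documented above. The resource name stdout-sample is a hypothetical placeholder, the topic ID is left masked as in the template, and extractRule is omitted on the assumption that it is not needed for minimalist_log.

    ```yaml
    apiVersion: cls.cloud.tencent.com/v1
    kind: LogConfig
    metadata:
      name: stdout-sample                       ## Hypothetical resource name, unique in the cluster
    spec:
      clsDetail:
        topicId: xxxxxx-xx-xx-xx-xxxxxxxx       ## Replace with the ID of a log topic created in CLS in advance
        logType: minimalist_log                 ## Full text in a single line
      inputDetail:
        type: container_stdout
        containerStdout:
          allContainers: false
          workloads:
          - namespace: prod
            name: sample-app
            kind: deployment
    ```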
    

    Log Parsing Format

    In the full-text single-line format, each line is treated as one complete log. When CLS collects logs, it uses the line break \n to mark the end of a log. For easier structural management, each log is assigned the default key __CONTENT__, but the log content itself is not structured and no log fields are extracted. The time attribute of a log is determined by the collection time. For more information, see Full Text in a Single Line.

    Assume that the raw data of a log is:

    Tue Jan 22 12:08:15 CST 2019 Installed: libjpeg-turbo-static-1.2.90-6.el7.x86_64
    

    A sample LogConfig configuration is as follows:

    apiVersion: cls.cloud.tencent.com/v1
    kind: LogConfig
    spec:
      clsDetail:
        topicId: xxxxxx-xx-xx-xx-xxxxxxxx
        # Single-line log
        logType: minimalist_log
    

    The data collected by CLS is as follows:

    __CONTENT__:Tue Jan 22 12:08:15 CST 2019 Installed: libjpeg-turbo-static-1.2.90-6.el7.x86_64
    

    Log Collection Types

    Container standard output

    Sample 1: collecting the standard output of all containers in the default namespace

    apiVersion: cls.cloud.tencent.com/v1
    kind: LogConfig
    spec:
      inputDetail:
        type: container_stdout
        containerStdout:
          namespace: default
          allContainers: true
    ...
    

    Sample 2: collecting the standard output of the containers in the Pods that belong to the ingress-gateway deployment in the production namespace

    apiVersion: cls.cloud.tencent.com/v1
    kind: LogConfig
    spec:
      inputDetail:
        type: container_stdout
        containerStdout:
          allContainers: false
          workloads:
          - namespace: production
            name: ingress-gateway
            kind: deployment
    ...
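
    Because workloads is a list and each entry accepts an optional container field (see the CRD format above), several workloads can presumably be listed in one configuration. The sketch below is a variation on Sample 2. It assumes a second, hypothetical workload named log-agent of kind daemonset, and it restricts collection from ingress-gateway to a container assumed to be named nginx.

    ```yaml
    apiVersion: cls.cloud.tencent.com/v1
    kind: LogConfig
    spec:
      inputDetail:
        type: container_stdout
        containerStdout:
          allContainers: false
          workloads:
          - namespace: production
            name: ingress-gateway
            kind: deployment
            container: nginx                    ## Assumed container name; omit to collect all containers in the workload Pod
          - namespace: production
            name: log-agent                     ## Hypothetical second workload
            kind: daemonset
    ```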
    

    Sample 3: collecting the container standard output in the Pod whose Pod labels contain “k8s-app=nginx” under the production namespace

    apiVersion: cls.cloud.tencent.com/v1
    kind: LogConfig
    spec:
      inputDetail:
        type: container_stdout
        containerStdout:
          namespace: production
          allContainers: false
          includeLabels:
            k8s-app: nginx
    ...
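
    The CRD format also allows container to be set together with includeLabels, which narrows collection to one container in the matching Pods. A minimal sketch, assuming the Pods labeled k8s-app=nginx run a container named nginx (a hypothetical name):

    ```yaml
    apiVersion: cls.cloud.tencent.com/v1
    kind: LogConfig
    spec:
      inputDetail:
        type: container_stdout
        containerStdout:
          namespace: production
          allContainers: false
          includeLabels:
            k8s-app: nginx
          container: nginx                      ## Assumed container name; used only because includeLabels is specified
    ```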
    

    Container file

    Sample 1: collecting the access.log file in the /data/nginx/log/ path in the nginx container in the Pod that belongs to ingress-gateway deployment under the production namespace

    apiVersion: cls.cloud.tencent.com/v1
    kind: LogConfig
    spec:
      clsDetail:
        topicId: xxxxxx-xx-xx-xx-xxxxxxxx
      inputDetail:
        type: container_file
        containerFile:
          namespace: production
          workload:
            name: ingress-gateway
            kind: deployment
          container: nginx
          logPath: /data/nginx/log
          filePattern: access.log
    ...
    

    Sample 2: collecting the access.log file in the /data/nginx/log/ path in the nginx container in the Pod whose pod labels contain “k8s-app=ingress-gateway” under the production namespace

    apiVersion: cls.cloud.tencent.com/v1
    kind: LogConfig
    spec:
      inputDetail:
        type: container_file
        containerFile:
          namespace: production
          includeLabels:
            k8s-app: ingress-gateway
          container: nginx
          logPath: /data/nginx/log
          filePattern: access.log
    ...
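
    Since filePattern supports the wildcards "*" and "?", a set of rotated files can be matched with a single pattern. The sketch below is a variation on Sample 2 and assumes hypothetical file names such as access_1.log and access_2.log in the same directory. logPath stays a literal path because wildcards are not supported there.

    ```yaml
    apiVersion: cls.cloud.tencent.com/v1
    kind: LogConfig
    spec:
      inputDetail:
        type: container_file
        containerFile:
          namespace: production
          includeLabels:
            k8s-app: ingress-gateway
          container: nginx
          logPath: /data/nginx/log              ## Literal directory; wildcards are not supported here
          filePattern: access_?.log             ## "?" matches a single character, for example access_1.log (hypothetical file name)
    ```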
    

    Metadata

    For container standard output (container_stdout) and container files (container_file), the container metadata (for example, the ID of the container that generated the logs) is reported to CLS together with the raw log content. In this way, when viewing logs, users can trace the log source or search based on the container identifier or characteristics (such as container name and labels).

    The following table lists the metadata:

    Field Name               Description
    cluster_id               The ID of the cluster to which logs belong
    container_name           The name of the container to which logs belong
    image_name               The image name of the container to which logs belong
    namespace                The namespace of the Pod to which logs belong
    pod_uid                  The UID of the Pod to which logs belong
    pod_name                 The name of the Pod to which logs belong
    pod_ip                   The IP of the Pod to which logs belong
    pod_label_{label name}   The labels of the Pod to which logs belong (for example, if a Pod has two labels: app=nginx and env=prod, the reported log will have two metadata entries attached: pod_label_app:nginx and pod_label_env:prod).
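
    As an illustration, the metadata attached to one log from the Pod in the example above might look like the following key-value pairs. This is only a sketch: every value except the two label entries is a masked or hypothetical placeholder, and the exact way CLS stores or displays these fields is not specified here.

    ```yaml
    cluster_id: xxxxxx                          ## Masked placeholder
    container_name: nginx                       ## Hypothetical container name
    image_name: xxxxxx                          ## Masked placeholder
    namespace: production                       ## Hypothetical namespace
    pod_uid: xxxxxx-xx-xx-xx-xxxxxxxx           ## Masked placeholder
    pod_name: xxxxxx                            ## Masked placeholder
    pod_ip: x.x.x.x                             ## Masked placeholder
    pod_label_app: nginx                        ## From the Pod label app=nginx
    pod_label_env: prod                         ## From the Pod label env=prod
    ```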