Log Collection

Last updated: 2021-12-03 15:47:23

    This document answers questions that you may have when building a deep learning container image and running deep learning workloads in EKS.

    How do I persistently store logs?

    Because EKS containers are terminated after use, you can view logs only while the Pod is in the Running status. Once the Pod status becomes Completed, an attempt to view the logs reports the following error:

    Error from server (InternalError): Internal error occurred: can not found connection to pod ***
    

    The following describes persistent log storage methods:

    Redirect

    The redirect method is the simpler of the two. You only need to redirect the output of kubectl logs from the terminal (stdout) to a file for persistent storage. To do so, run the following command:

    kubectl logs -f tf-cnn >> info.log

    Note, however, that with the redirect method the output no longer goes to the terminal, so you cannot follow the log output as it is produced. To display the output on the screen while also saving it to a file, use either of the following two methods:

    • Use a pipe and the tee command. Run the following command:
      kubectl logs -f tf-cnn | tee info.log
    • Run the logsave command, which also prints the output to the screen while saving it to the file:
      logsave [-asv] info.log kubectl logs -f tf-cnn
      Note: The advantage of logsave over tee is that logsave records the time of each run and leaves spacing between runs in the file, which makes it easier to find logs.

    All three commands above share a shortcoming: because they redirect the output of kubectl logs, they must be started while the Pod is in the Running status. The file they produce is then what you use to view logs after the Pod enters the Completed status.
    The redirect method suits scenarios with a small volume of logs and no need to output or search a large number of logs. If your requirements are modest, we recommend the redirect method.
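
    Because the redirect commands must be started while the Pod is Running, it can help to wait for the Pod before streaming. The following is a minimal sketch, not a definitive implementation. The helper name save_pod_logs is hypothetical, and the sketch assumes a kubectl version that provides kubectl wait and a Pod that reaches the Ready condition before it completes. The --timestamps flag makes kubectl logs prefix each line with a timestamp, and tee -a appends to the file while still printing to the terminal.

```shell
#!/usr/bin/env bash
# Sketch only: save_pod_logs is a hypothetical helper, not part of kubectl.

save_pod_logs() {
  local pod="$1" logfile="$2"

  # Block until the Pod reports the Ready condition (give up after 5 minutes).
  # Assumption: the Pod becomes Ready before it finishes.
  kubectl wait --for=condition=Ready "pod/${pod}" --timeout=300s || return 1

  # Stream logs with per-line timestamps; tee -a shows them on the
  # terminal and appends them to the log file.
  kubectl logs -f --timestamps "${pod}" | tee -a "${logfile}"
}

# Example (Pod and file names taken from the commands above):
#   save_pod_logs tf-cnn info.log
```

    If the Pod finishes before it ever becomes Ready, the wait step times out and nothing is saved, so for very short jobs it may be safer to start the plain redirect command manually.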

    Log collection configuration

    In EKS, you can configure log collection either through environment variables or CRDs.

    1. Configure log collection as instructed in Using Environment Variables to Configure Log Collection.
      1. If you want to use keys for authorization, create a Secret of the Opaque type that contains two keys (SecretId and SecretKey). You can obtain the values of SecretId and SecretKey in API Key.
      2. After enabling log collection, you can find the created Secret and associate its SecretId and SecretKey.
    2. Get the raw logs in the console, switch to the table view, and format the JSON strings.
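
    The Secret used for key-based authorization can also be created from the command line. The following is a minimal sketch under stated assumptions: the helper name create_log_secret, the Secret name cls-credentials, and the environment variables TENCENT_SECRET_ID and TENCENT_SECRET_KEY are all hypothetical. Only the key names SecretId and SecretKey and the Opaque type come from the steps above. Reading the values from environment variables keeps them out of the script itself.

```shell
#!/usr/bin/env bash
# Sketch only: helper name, Secret name, and variable names are hypothetical.

create_log_secret() {
  local name="$1"

  # Refuse to run without credentials rather than create an empty Secret.
  if [ -z "${TENCENT_SECRET_ID:-}" ] || [ -z "${TENCENT_SECRET_KEY:-}" ]; then
    echo "TENCENT_SECRET_ID and TENCENT_SECRET_KEY must be set" >&2
    return 1
  fi

  # "generic" Secrets default to the Opaque type; --type states it explicitly.
  kubectl create secret generic "${name}" \
    --type=Opaque \
    --from-literal=SecretId="${TENCENT_SECRET_ID}" \
    --from-literal=SecretKey="${TENCENT_SECRET_KEY}"
}

# Example:
#   create_log_secret cls-credentials
```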

    This method has a problem: the log collection feature of EKS sends the collected logs to the specified consumer as JSON strings, but the timestamps in those JSON strings have only second-level precision.

    As a result, logs are displayed in the console at second granularity, and logs on the search and analysis page can be sorted only by second; they cannot be output in sequence at a finer time granularity. However, a large number of logs are sometimes output within a short time, which often calls for millisecond granularity. Therefore, we recommend the CRD-based configuration method.
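
    The ordering problem can be illustrated with a small sketch. The sample log lines and the helper name sort_by_timestamp are made up for illustration and are not actual EKS output. The helper sorts lines by their first field (the timestamp) with a stable sort, so lines with identical timestamps simply keep their arrival order.

```shell
#!/usr/bin/env bash
# Sketch only: sample lines and helper name are hypothetical.

# Stable sort on the first whitespace-separated field (the timestamp).
sort_by_timestamp() {
  LC_ALL=C sort -s -k1,1
}

# Second granularity: all keys are equal, so the sort cannot restore
# the real order and the lines stay in arrival order (3, 1, 2).
printf '%s\n' \
  '2021-12-03T15:47:23 step 3' \
  '2021-12-03T15:47:23 step 1' \
  '2021-12-03T15:47:23 step 2' | sort_by_timestamp

# Millisecond granularity: the keys differ, so the sort restores
# the real order (1, 2, 3).
printf '%s\n' \
  '2021-12-03T15:47:23.300 step 3' \
  '2021-12-03T15:47:23.100 step 1' \
  '2021-12-03T15:47:23.200 step 2' | sort_by_timestamp
```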