GooseFS Logs

Last updated: 2021-09-30 12:29:06

    GooseFS records logs for troubleshooting when its master and workers run, and when computing frameworks such as Spark access GooseFS through a GooseFS client. GooseFS outputs logs through Log4j, so you can modify log4j.properties to change the log output configuration, such as the storage path, the log level, or whether to record RPC logs. You can find log4j.properties in the GooseFS configuration directory:

    $ cd /usr/local/service/goosefs/conf
    $ cat log4j.properties
    # May get overridden by System Property
    log4j.rootLogger=INFO, ${goosefs.logger.type}, ${goosefs.remote.logger.type}
    log4j.category.goosefs.logserver=INFO, ${goosefs.logserver.logger.type}
    log4j.additivity.goosefs.logserver=false
    log4j.logger.AUDIT_LOG=INFO, ${goosefs.master.audit.logger.type}
    log4j.additivity.AUDIT_LOG=false
    ...
    

    The log configuration of GooseFS is described below.

    Storage Location

    By default, logs collected by GooseFS are stored in ${GOOSEFS_HOME}/logs, where master logs are stored in logs/master.log and worker logs in logs/worker.log. Note that errors thrown by node processes are recorded in master.out or worker.out, which are empty if there is no error. If any error occurs, you can use these files to troubleshoot.
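
    Because an empty .out file means that the process reported no error, a quick first check is to list the .out files that are not empty. The following is a minimal sketch; the function name list_nonempty_out_files is hypothetical, and the log directory is assumed to be ${GOOSEFS_HOME}/logs unless another directory is passed in:

    ```shell
    # Hypothetical helper: print every non-empty .out file in a log directory.
    # Assumption: logs live in ${GOOSEFS_HOME}/logs unless a directory is given.
    list_nonempty_out_files() {
      log_dir="${1:-${GOOSEFS_HOME}/logs}"
      # "-size +0" matches only files that contain at least one byte.
      find "$log_dir" -type f -name '*.out' -size +0 | sort
    }
    ```

    Calling list_nonempty_out_files with no argument checks the default log directory. No output means that no node process has written an error.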

    Below are common configuration items for the master log storage:

    # Appender for Master
    log4j.appender.MASTER_LOGGER=org.apache.log4j.RollingFileAppender
    log4j.appender.MASTER_LOGGER.File=${goosefs.logs.dir}/master.log
    log4j.appender.MASTER_LOGGER.MaxFileSize=10MB
    log4j.appender.MASTER_LOGGER.MaxBackupIndex=100
    log4j.appender.MASTER_LOGGER.layout=org.apache.log4j.PatternLayout
    log4j.appender.MASTER_LOGGER.layout.ConversionPattern=%d{ISO8601} %-5p %c{1} - %m%n
    

    Parameters are described as follows:

    • MASTER_LOGGER: configures master log output.
    • MASTER_LOGGER.File: sets the log storage path. You can modify the value to customize a storage path.
    • MASTER_LOGGER.MaxFileSize: sets the maximum size of a single log file.
    • MASTER_LOGGER.MaxBackupIndex: sets the maximum number of rolled (backup) log files to keep.
    • MASTER_LOGGER.layout: specifies the layout of the output log.
    • MASTER_LOGGER.layout.ConversionPattern: specifies the format of the output log.
    Note:

    • .log files are rolled. You can back them up to a UFS such as COS. However, .out files are not rolled and thus need to be deleted manually if needed.
    • For more information about Log4j parameters, please see Log4j Configuration.
    • GooseFS stores only logs generated by itself. For logs generated by upper-layer computing applications, view the specific application’s configuration for the log location. For the log configurations of common computing applications, please see Apache Hadoop, Apache HBase, Apache Hive, and Apache Spark.
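
    The MaxFileSize and MaxBackupIndex settings shown earlier together bound the disk space that one appender can use, because Log4j's RollingFileAppender keeps the active file plus up to MaxBackupIndex rolled files. The following sketch estimates that bound. The function name max_log_disk_mb is hypothetical, and the result is approximate, since a file is rolled only after a write pushes it past MaxFileSize:

    ```shell
    # Hypothetical helper: approximate upper bound (in MB) on the disk space
    # used by one RollingFileAppender.
    #   $1 = MaxFileSize in MB, $2 = MaxBackupIndex
    max_log_disk_mb() {
      max_file_size_mb="$1"
      max_backup_index="$2"
      # One active file plus MaxBackupIndex rolled backups.
      echo $(( max_file_size_mb * (max_backup_index + 1) ))
    }

    max_log_disk_mb 10 100   # prints 1010
    ```

    With the sample values (10MB and 100), master.log and its rolled files can occupy roughly 1010MB on the local node.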

    Log Levels

    GooseFS has the following five log levels:

    • TRACE: finer-grained call logs that are suitable for tracing method/class calls.
    • DEBUG: fine-grained call logs that are useful for debugging.
    • INFO: important information about request handling.
    • WARN: warning information (the task can still run, but there might be potential problems).
    • ERROR: error information (the running of the task is affected).

    The five log levels are listed from most detailed to least detailed. A logger set to a given level records messages of that level and of every less detailed level. By default, the log level of GooseFS is set to INFO, which records log messages of INFO, WARN, and ERROR.
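
    The ordering rule can be sketched as a small check. This is an illustration only: the function names level_rank and is_recorded are hypothetical and are not part of GooseFS or Log4j.

    ```shell
    # Rank each level from most detailed (0) to least detailed (4).
    level_rank() {
      case "$1" in
        TRACE) echo 0 ;;
        DEBUG) echo 1 ;;
        INFO)  echo 2 ;;
        WARN)  echo 3 ;;
        ERROR) echo 4 ;;
        *) return 1 ;;
      esac
    }

    # Succeeds (exit status 0) if a message at level $2 is recorded
    # by a logger configured at level $1.
    is_recorded() {
      configured=$(level_rank "$1") || return 2
      message=$(level_rank "$2") || return 2
      [ "$message" -ge "$configured" ]
    }

    is_recorded INFO WARN  && echo "WARN is recorded at INFO"
    is_recorded INFO DEBUG || echo "DEBUG is not recorded at INFO"
    ```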

    To change the log level, go to GooseFS’s configuration directory and modify log4j.properties. The following example sets the root logger, and therefore the default level of all GooseFS loggers, to DEBUG:

    log4j.rootLogger=DEBUG, ${goosefs.logger.type}, ${goosefs.remote.logger.type}
    

    To modify the log level of a specified class, you can declare it in the configuration file. The following example sets the log level of the GooseFSFileInStream class to DEBUG:

    log4j.logger.com.qcloud.cos.goosefs.client.file.GooseFSFileInStream=DEBUG
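
    If you adjust class-level settings often, the edit can be scripted. The following is a minimal sketch, assuming that each logger is declared on one line of the form log4j.logger.<name>=<level>; the function name set_logger_level is hypothetical:

    ```shell
    # Hypothetical helper: set "log4j.logger.<name>=<level>" in a properties
    # file, replacing an existing entry for the same logger or appending one.
    #   $1 = path to log4j.properties, $2 = logger name, $3 = level
    set_logger_level() {
      props_file="$1"
      key="log4j.logger.$2"
      level="$3"
      tmp_file=$(mktemp) || return 1
      # Keep every line except an existing entry for this logger.
      grep -v -F -- "${key}=" "$props_file" > "$tmp_file" || true
      echo "${key}=${level}" >> "$tmp_file"
      # Write back in place so the file keeps its owner and permissions.
      cat "$tmp_file" > "$props_file"
      rm -f "$tmp_file"
    }
    ```

    For example, set_logger_level conf/log4j.properties com.qcloud.cos.goosefs.client.file.GooseFSFileInStream DEBUG produces the line shown above, and running it again with INFO replaces that line instead of adding a second one.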
    

    In most cases, you are advised to change the log level in the logging configuration file. To change the log level while the cluster is running, run the goosefs logLevel command instead. The logLevel command supports the following options:

    usage: logLevel [--level <arg>] --logName <arg> [--target <arg>]
      --level <arg>     The log level to be set.
      --logName <arg>   The logger's name(e.g. com.qcloud.cos.goosefs.master.file.DefaultFileSystemMaster) you want to get or set level.
      --target <arg>    <master|workers|host:webPort>. A list of targets separated by commas can be specified. host:webPort pair must be one of workers. Default target is master and all workers
    

    The configuration items are described as follows:

    • level: the log level to set, which can be TRACE, DEBUG, INFO, WARN, or ERROR.
    • logName: the logger’s name, such as com.qcloud.cos.goosefs.underfs.hdfs.HdfsUnderFileSystem.
    • target: the targets to apply the change to, which can be master, workers, or individual workers specified as host:webPort. Separate multiple targets with commas. By default, the change applies to the master and all workers.

    You can change the log level when the system is running as needed to troubleshoot. The following example changes the log level of the com.qcloud.cos.goosefs.underfs.hdfs.HdfsUnderFileSystem class to DEBUG for all workers, and changes it back to INFO when the debugging is complete:

    $ goosefs logLevel --logName=com.qcloud.cos.goosefs.underfs.hdfs.HdfsUnderFileSystem --target=workers --level=DEBUG # Set to DEBUG.
    $ goosefs logLevel --logName=com.qcloud.cos.goosefs.underfs.hdfs.HdfsUnderFileSystem --target=workers --level=INFO # Set to INFO.
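
    To avoid leaving a logger at DEBUG after a debugging session, the two commands can be wrapped around the task being debugged. The following is a sketch under stated assumptions: the function name with_debug_logging and the GOOSEFS_CMD variable are hypothetical, and only the logLevel options described above are used:

    ```shell
    # Command used to invoke the GooseFS CLI. GOOSEFS_CMD is a hypothetical
    # override for this sketch (for example, GOOSEFS_CMD=echo for a dry run).
    GOOSEFS_CMD="${GOOSEFS_CMD:-goosefs}"

    # Hypothetical wrapper: raise a logger to DEBUG, run a command, then
    # set the logger back to INFO even if the command fails.
    #   $1 = logger name, $2 = target, remaining arguments = command to run
    with_debug_logging() {
      logger="$1"
      target="$2"
      shift 2
      "$GOOSEFS_CMD" logLevel --logName="$logger" --target="$target" --level=DEBUG || return 1
      status=0
      "$@" || status=$?
      "$GOOSEFS_CMD" logLevel --logName="$logger" --target="$target" --level=INFO
      # Report the exit status of the wrapped command.
      return "$status"
    }
    ```

    For example, with_debug_logging com.qcloud.cos.goosefs.underfs.hdfs.HdfsUnderFileSystem workers ./reproduce.sh runs a reproduction script (a placeholder name) with DEBUG logging on all workers. The wrapper assumes that INFO is the level to restore, which matches the GooseFS default.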
    

    Advanced Configurations

    GooseFS allows you to configure GC event logs, FUSE logs, RPC logs, and UFS operation logs, and perform operations such as log segmentation and log filtering. The following describes how to use these advanced configurations.

    • GC event logs
      GooseFS records GC event logs in the .out files. To enable them, add the following configuration to the conf/goosefs-env.sh file:

      GOOSEFS_JAVA_OPTS+=" -XX:+PrintGCDetails -XX:+PrintTenuringDistribution -XX:+PrintGCTimeStamps"
      

      GOOSEFS_JAVA_OPTS is the Java VM parameter for all GooseFS nodes. You can also use GOOSEFS_MASTER_JAVA_OPTS and GOOSEFS_WORKER_JAVA_OPTS to specify the VM parameter for the master and workers, respectively.

    • FUSE logs
      You can set the log level for FUSE in the conf/log4j.properties file:

      log4j.logger.com.qcloud.cos.goosefs.fuse.GoosefsFuseFileSystem=DEBUG
      

      After enabling it, you can view the FUSE logs in logs/fuse.log.

    • RPC logs
      You can use the conf/log4j.properties file to configure the RPC logs for the client or the master.
      In log4j.properties, configure the RPC request logs for the client:

      # RPC request logs between the client and FileSystemMaster
      log4j.logger.com.qcloud.cos.goosefs.client.file.FileSystemMasterClient=DEBUG
      # RPC request logs between the client and BlockMaster
      log4j.logger.com.qcloud.cos.goosefs.client.block.BlockSystemMasterClient=DEBUG
      

      Run the logLevel command to configure the RPC request logs for the master:

      $ goosefs logLevel --logName=com.qcloud.cos.goosefs.master.file.FileSystemMasterClientServiceHandler --target=master --level=DEBUG # File-related RPC request logs
      $ goosefs logLevel --logName=com.qcloud.cos.goosefs.master.block.BlockSystemMasterClientServiceHandler --target=master --level=DEBUG # Block-related RPC request logs
      
    • UFS operation logs
      To configure UFS operation logs, you can edit the log4j.properties file. Alternatively, you can run the logLevel command as follows:

      $ goosefs logLevel --logName=com.qcloud.cos.goosefs.underfs.UnderFileSystemWithLogging --target=master --level=DEBUG # Record UFS operation logs for the master.
      $ goosefs logLevel --logName=com.qcloud.cos.goosefs.underfs.UnderFileSystemWithLogging --target=workers --level=DEBUG # Record UFS operation logs for workers.
      
    • Log segmentation
      GooseFS allows you to store different types of logs in different locations. If all logs are stored in the .log files, the following problems may occur:

      • If the cluster is large or the throughput is high, master.log or worker.log may become extremely large, or many log files will be rolled.
      • Log analysis becomes difficult when there are too many logs.
      • Large volumes of logs are stored on the local node and consume storage.

      To solve the problems above, you can configure log4j.properties to set locations for specific types of logs. The following example stores StateLockManager logs in statelock.log:

      log4j.category.com.qcloud.cos.goosefs.master.StateLockManager=DEBUG, State_LOCK_LOGGER
      log4j.additivity.com.qcloud.cos.goosefs.master.StateLockManager=false
      log4j.appender.State_LOCK_LOGGER=org.apache.log4j.RollingFileAppender
      log4j.appender.State_LOCK_LOGGER.File=<GOOSEFS_HOME>/logs/statelock.log
      log4j.appender.State_LOCK_LOGGER.MaxFileSize=10MB
      log4j.appender.State_LOCK_LOGGER.MaxBackupIndex=100
      log4j.appender.State_LOCK_LOGGER.layout=org.apache.log4j.PatternLayout
      log4j.appender.State_LOCK_LOGGER.layout.ConversionPattern=%d{ISO8601} %-5p %c{1} - %m%n
      
    • Log filtering
      GooseFS allows you to record only the logs that meet specified conditions instead of recording all logs. For example, during performance testing you may need RPC logs, but only for requests with high latency. In this case, you can configure the log4j.properties file to add log filtering conditions. The following example records logs only for requests whose RPC latency exceeds 200ms and for FUSE requests whose latency exceeds 1s:

      goosefs.user.logging.threshold=200ms
      goosefs.fuse.logging.threshold=1s