In Apache Spark, the default log level is INFO, meaning it records all logs at INFO level and above, including WARN and ERROR logs. During development or production tuning, excessive INFO logs can obscure critical information, so it is often desirable to adjust the log level to minimize log output.
Method 1: Using Spark Configuration File (Recommended for Cluster Environments)
- Edit the log4j configuration file: Locate the `conf` directory in the Spark installation path, and copy `log4j.properties.template` to `log4j.properties` if the latter does not exist. (Note that Spark 3.3 and later ship Log4j 2, where the file is `log4j2.properties` and uses Log4j 2 syntax.)
- Modify the `log4j.properties` file: Open `log4j.properties`, find the line that sets the root log level, such as `log4j.rootCategory=INFO, console`, and change `INFO` to `ERROR` (or another desired level) to reduce log output:

  ```properties
  log4j.rootCategory=ERROR, console
  ```

- Save and restart the Spark application: After modifying the configuration file, restart your Spark application for the changes to take effect.
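The first step above can be sketched as a short shell snippet; `$SPARK_HOME` here is an assumed environment variable pointing at your Spark installation:

```shell
# Assumes SPARK_HOME points at the Spark installation (adjust to your path)
cd "$SPARK_HOME/conf"

# Create log4j.properties from the bundled template only if it is missing
[ -f log4j.properties ] || cp log4j.properties.template log4j.properties
```

The `[ -f … ]` guard keeps an existing, already-customized `log4j.properties` from being overwritten.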
Method 2: Programmatically Adjusting Log Levels (Suitable for Interactive and In-Application Adjustments)
If you are using Spark Shell or your Spark application requires dynamic adjustment of log levels, you can set it directly in code:
```scala
import org.apache.log4j.{Level, Logger}

// Get the root logger
val rootLogger = Logger.getRootLogger()

// Set the log level
rootLogger.setLevel(Level.ERROR)
```
Adding the above code to your Spark application suppresses INFO (and WARN) logs at runtime, retaining only ERROR and FATAL messages.
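When a SparkContext is already available (for example as `sc` in spark-shell), Spark also exposes `SparkContext.setLogLevel`, which changes the level at runtime without touching the logging framework directly. This is a sketch assuming a local Spark environment; the application name and master setting are illustrative:

```scala
import org.apache.spark.sql.SparkSession

// Sketch: assumes Spark is on the classpath; builder settings are illustrative
val spark = SparkSession.builder()
  .appName("log-level-demo")
  .master("local[*]")
  .getOrCreate()

// Valid level strings: ALL, DEBUG, ERROR, FATAL, INFO, OFF, TRACE, WARN
spark.sparkContext.setLogLevel("ERROR")
```

Because `setLogLevel` is part of Spark's public API, it keeps working across the Log4j 1.x to Log4j 2 migration, whereas code that calls Log4j classes directly may need updating.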
Summary
Disabling INFO logging can be achieved either by editing the configuration file or programmatically, depending on your requirements and environment. In production, setting the log level in the configuration file is generally recommended, as it allows centralized management and avoids per-application code changes. In development or testing, the programmatic approach offers more flexible, on-the-fly adjustments.