Why do we have different log severities (levels) for logs?
Over the years I have worked with many different log frameworks, both open source and in-house. One thing that rarely differs between them is the set of severity levels, which often comes down to these (with some variations):
- Fatal
- Error
- Warning
- Information
- Debug
- Verbose
The above is from memory, but it is close to what log4j and Serilog use. That is a lot of log levels and it can be hard to know when to use which, so I am sharing my own guidelines:
- Fatal: When a fatal log is created, something is very wrong; it could be that the whole application is going down. This log level should be used when something has to be looked into as soon as possible.
- Error: The most common use of the error log level is an exception. Something has happened that should not have happened and you likely need to take a look at it. It does not have to be an exception that causes an error log, it could also be a validation that fails or a path in the code that should not have been reached.
- Warning: Warnings are not necessarily errors, but they tell us that something is off. An example would be a call that failed and that we will automatically retry in 10 seconds; we log that as a warning. Choosing between warning and information can be a gray area.
- Information: Information can be anything that is useful, from "This service has received/sent a message" to a note that a service is starting up.
- Debug: Debug is used while debugging. Production servers often do not log this level and ignore everything logged below information.
- Verbose: Verbose is everything. Have you ever tried turning on verbose logging during a build? If you have, you know you get an endless number of lines to go through. Verbose is used for normally unimportant logs that are nice to have in rare situations.
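As a rough illustration of the guidelines above, here is a minimal sketch using Python's standard `logging` module. The guidelines are not tied to any language, so this is only one possible mapping: Python calls the fatal level `CRITICAL` and has no built-in verbose level, so the sketch registers one itself. The numeric value 5, the logger name `orders` and all log messages are assumptions made up for the example.

```python
import logging

# Python's standard library has no "verbose" level, so this sketch registers
# one below DEBUG (10). The value 5 and the name are assumptions.
VERBOSE = 5
logging.addLevelName(VERBOSE, "VERBOSE")


class ListHandler(logging.Handler):
    """Collects formatted records in memory so the result is easy to inspect."""

    def __init__(self):
        super().__init__()
        self.lines = []

    def emit(self, record):
        self.lines.append(f"{record.levelname}: {record.getMessage()}")


logger = logging.getLogger("orders")  # hypothetical service name
logger.setLevel(VERBOSE)              # let every level through for this demo
logger.propagate = False
handler = ListHandler()
logger.addHandler(handler)

# Fatal: something is very wrong and needs attention as soon as possible.
logger.critical("Database unreachable, shutting down")

# Error: typically an exception that should not have happened.
try:
    int("not a number")
except ValueError:
    logger.error("Could not parse order quantity")

# Warning: something is off, but we recover on our own.
logger.warning("Call failed, retrying in 10 seconds")

# Information: neutral, useful facts about what the system is doing.
logger.info("Service received a message")

# Debug and verbose: details that only matter while tracking something down.
logger.debug("Order payload parsed")
logger.log(VERBOSE, "Entering parse step")

for line in handler.lines:
    print(line)
```

The in-memory handler is only there to make the output easy to check; a real service would more likely write to a file, the console or a log aggregator.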
The most used log level in a system is normally information. This is also one of the "neutral" log levels: what is logged is not necessarily good or bad. Logs at this level are often investigated when something went wrong silently. An example would be a message that has gone missing somewhere along the way. Here information logs can help us determine where the message was last seen. With only warning, error or fatal logs in our system, we could not trace it in this scenario.
The error log level is often the second most used. As previously mentioned, a normal use case is an exception or some other sort of error handling. Even though fewer logs are written as errors, these are the most interesting logs and likely the ones you look into in your daily work. The log levels below error are often not interesting unless you already know something is wrong. Error and fatal logs are what tell you that something is wrong in your system.
Warning and fatal are log levels in a gray area to me. This is especially true of warnings: I have never been on a team where we treated warnings differently than information logs. The cases where I have used warning are "something failed and we will retry it later" and "we received something twice but should not have" (where we ignore the second or are idempotent). Both scenarios could just as well have been information. Fatal logs are very rare. Some teams use this log level to get called immediately when it happens; others use it just as "a more severe error". The latter makes little sense, but I have often seen it used when an application fails so hard it crashes (fatally fails).
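The "something failed and we will retry it later" case could look roughly like the sketch below, again in Python. The helper `call_with_retry`, its parameters and the `flaky_call` operation are hypothetical names invented for this example. Failed attempts that will be retried are logged as warnings, and only the final failure becomes an error.

```python
import logging
import time

logger = logging.getLogger("retry_demo")  # hypothetical logger name


def call_with_retry(operation, attempts=3, delay_seconds=10):
    """Run operation, logging a warning for each failure that will be retried."""
    for attempt in range(1, attempts + 1):
        try:
            return operation()
        except Exception as exc:
            if attempt == attempts:
                # No retries left: this is now something a person should look at.
                logger.error("Call failed after %d attempts: %s", attempts, exc)
                raise
            # Something is off, but we handle it ourselves by trying again.
            logger.warning(
                "Call failed (attempt %d of %d), retrying in %s seconds: %s",
                attempt, attempts, delay_seconds, exc,
            )
            time.sleep(delay_seconds)


# Usage: an operation that fails twice and then succeeds.
state = {"calls": 0}


def flaky_call():
    state["calls"] += 1
    if state["calls"] < 3:
        raise ConnectionError("timeout")
    return "ok"


result = call_with_retry(flaky_call, attempts=3, delay_seconds=0)
```

Whether those retries deserve warning or information is exactly the gray area described above; the sketch simply picks warning.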
Debug and verbose are often only used during development or to track down bugs. They are normally turned off in the production environment as they a) log messages that are large in size, b) log an immense amount of messages and c) are not that useful in the day-to-day business.
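Turning debug logging off in production usually comes down to a minimum level on the logger. A minimal sketch with Python's `logging` module follows; the environment names, the `build_logger` helper and the logger names are assumptions for illustration, and a real setup would more likely read the level from configuration.

```python
import logging


class ListHandler(logging.Handler):
    """Collects formatted records in memory so the result is easy to inspect."""

    def __init__(self):
        super().__init__()
        self.lines = []

    def emit(self, record):
        self.lines.append(f"{record.levelname}: {record.getMessage()}")


def build_logger(environment):
    """Hypothetical helper: production ignores everything below information."""
    logger = logging.getLogger(f"shop.{environment}")
    logger.propagate = False
    if environment == "production":
        logger.setLevel(logging.INFO)
    else:
        logger.setLevel(logging.DEBUG)
    handler = ListHandler()
    logger.addHandler(handler)
    return logger, handler


dev_logger, dev_handler = build_logger("development")
prod_logger, prod_handler = build_logger("production")

# The same two calls are made in both environments.
for log in (dev_logger, prod_logger):
    log.debug("Cache lookup for key user:42")
    log.info("Service is starting up")

print(dev_handler.lines)   # contains both the debug and the info line
print(prod_handler.lines)  # contains only the info line
```

The debug call is still in the code in production; it is just dropped before it reaches any handler, which is what keeps the volume down.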
So why do we have different log levels? Because different levels of logs are used for different purposes. Some have to be looked into as soon as possible and others are only interesting if you know something is off. Having a severity on logs also tells people whether they should react to a log or not. If you use a log aggregation framework, different severities also help you find what is important and filter out what is unimportant.
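One way to picture "react to some logs, only store the others" is to route by severity. In the sketch below, again using Python's `logging` module, one handler receives everything from information and up, while a second handler only receives error and fatal logs. Both handlers are in-memory stand-ins: in a real system the first might feed a log aggregator and the second a paging or alerting tool, which is an assumption on my part and not something any particular framework prescribes.

```python
import logging


class ListHandler(logging.Handler):
    """Collects formatted records in memory so the result is easy to inspect."""

    def __init__(self):
        super().__init__()
        self.lines = []

    def emit(self, record):
        self.lines.append(f"{record.levelname}: {record.getMessage()}")


logger = logging.getLogger("payments")  # hypothetical service name
logger.setLevel(logging.INFO)
logger.propagate = False

everything = ListHandler()      # stand-in for the searchable log store
alerts = ListHandler()          # stand-in for whatever wakes someone up
alerts.setLevel(logging.ERROR)  # only error and above reach this handler

logger.addHandler(everything)
logger.addHandler(alerts)

logger.info("Payment message received")
logger.warning("Duplicate payment message ignored")
logger.error("Payment validation failed")
logger.critical("Payment database unreachable")

print(len(everything.lines))  # all four records
print(alerts.lines)           # only the error and critical records
```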
This was my post on log levels and what they mean. Let me know in the comments if I left something out.