Is printf debugging obsolete?

According to Wikipedia, debugging is a methodical process of finding and reducing the number of bugs, or defects, in a computer program or a piece of electronic hardware, thus making it behave as expected.

The most popular technique is to use a debugger. A debugger is a computer program used to test and debug other programs (the “target” program). Typically, debuggers offer functions such as running a program step by step (single-stepping or program animation), stopping (“breaking”) at some event or specified instruction by means of a breakpoint in order to examine the current state, and tracking the values of variables. Some debuggers can modify program state while it is running. It may also be possible to continue execution at a different location in the program to bypass a crash or logical error.

Using a debugger is a powerful and popular debugging method, but it is not the best option in every case. You might be triaging software that controls hardware, and hitting a breakpoint at the wrong moment might damage that hardware (I’ve burned a motor this way). Or you might be trying to understand a failure that happened while the debugger was not attached.

A very simple alternative for these cases is print debugging. Print (or trace) debugging is the act of watching (live or recorded) trace statements, or print statements, that indicate the flow of execution of a process. This is sometimes called printf debugging, due to the use of the printf statement in C. This kind of debugging was turned on by the command TRON (“TRace ON”) in the original versions of the novice-oriented BASIC programming language; TRON caused the line number of each BASIC command line to print as the program ran.

As an embedded software developer I typically combine both techniques:

  1. I analyze the logs captured during product testing in order to find the steps to reproduce the failure at my desk.
  2. Then I use a debugger to step through the failure and find the root cause.

After more than 10 years of using logs to triage defects in my software, I’ve identified two key elements that make this technique efficient:

  1. Selecting the trace statements wisely.
  2. Having a good log analysis method.

Defining the trace statements is a balancing act. On one side you want to add as few messages as possible, to reduce the performance impact and keep the size of the logs manageable. At the same time you want enough statements to understand all the potential bugs. There are several tricks to help with the selection, like separating the trace statements into channels, making the trace level configurable at run time, or generating encoded logs that are expanded before analysis. I will talk more about this in a separate post. For now I would just like to point out that it is an ongoing effort that will continue until the end of the project, so start with your best guess and then add and delete statements as needed.

Having a good log analysis method is as important as having the right trace statements. Having the information in the log will not help you unless you are able to “find the clues”. And since the defect-fixing phase of a project can last for months, it is important to do it as efficiently as possible. Especially if you still expect to have a life after work ;^).

The procedure I developed over time was:

  1. Open the log with my favourite text editor.
  2. Use its search features to locate the relevant strings.
  3. Add an extra line with my comments at each key point.
  4. Use grep to extract all my comments and create an index.
  5. Use the index to document the failure mode and navigate through the log.

The disadvantages of this process were:

  1. You need to remember and type each of the key strings you want to search for.
  2. You modify the original log by adding your comments.
  3. You have to work with several files “at the same time”. (I ended up with the log file, the index file, and a file of my common search strings.)

I finally got tired of this error-prone and labor-intensive process and started looking for a way to automate it. I did a web search and found lots of tools, but none of them satisfied my needs. The main problems with those log analyzers or log parsers were:

  • They were designed for a specific type of log (like web server logs) and were not easy to adapt to my software’s output.
  • They performed the log analysis themselves, producing a report with statistical information.

What I needed was a generic log analyzer (or log parser), but I couldn’t find one. So I finally decided to create my own log analysis tool at home as a hobby. When I started using my log analyzer at work, I noticed that it reduced my analysis time by half while increasing the accuracy of my results. So I started sharing the tool with my coworkers, who also found it useful.

That is why I’ve decided to publish my log parser tool online, hoping it can be useful to other developers who might be looking for a better log analysis option. Please feel free to learn more about my log analyzer (or log parser) and/or download it for a free 30-day trial.

This entry was posted on Friday, July 19th, 2013 at 8:58 am and is filed under LogAnalyzer. You can follow any responses to this entry through the RSS 2.0 feed. Both comments and pings are currently closed.
