There is a process that I’ve used to debug everything from 10-line toy programs to 100,000-line systems. It works as follows:
Identify that something is wrong, and what is wrong about it.
Identifying bugs is a complicated and interesting process that includes testing, validation, user studies, and much more. But I expect that if you came to this page it is because you think your code has a bug, so it’s probably already identified.
Understand the design and process of the part of the code that is involved in the buggy behavior. If you can’t explain what steps the code should be going through to create the desired behavior you also can’t fix the bug.
Locate the first mistake the code made. With rare exceptions, this is done by iterating the following steps:
Pick some point on the code path to the symptom you identified.
Describe what the code is supposed to be doing there, often by identifying what values the in-scope variables should have.
Check to see whether it’s doing what was expected at that point.
For small programs, print statements might be enough to do this. For larger programs, a visual debugger is much more efficient and useful. With increased experience, you may eventually check the code by looking at it without running it, but that’s an unreliable technique at best.
If it isn’t doing what you expect, double-check the expectation: the program could be wrong, but so could your expectations.
Based on the result, narrow the region of code that could have led to the bug and repeat. If it’s right there, look closer to the symptom; if it’s wrong there, look earlier to see where it started being wrong.
In my experience, when students fail to debug their code the most common reason is they didn’t iterate through these four steps enough times to find the first manifestation of the bug. There may be some better bug location system than this loop, but in two decades of debugging my own code and helping students debug theirs, I’ve never found it.
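The loop above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the three stage functions, the deliberately planted bug, and the helper `first_wrong_stage` are all hypothetical, and the expected values were worked out by hand for the input `"70 90 80"`. For simplicity it checks every point on the path in order instead of narrowing a region, which is practical only when the path is short.

```python
def parse(text):
    # Turn "70 90 80" into [70, 90, 80].
    return [int(tok) for tok in text.split()]

def drop_lowest(scores):
    # Deliberate bug for the demo: this drops the HIGHEST score.
    return sorted(scores)[:-1]

def average(scores):
    return sum(scores) / len(scores)

STAGES = [parse, drop_lowest, average]

# What each stage SHOULD produce for "70 90 80", computed by hand.
EXPECTED = [[70, 90, 80], [80, 90], 85.0]

def first_wrong_stage(stages, expected, value):
    """Run the stages in order and report the first one whose output
    differs from the hand-computed expectation, or None if all match."""
    for stage, want in zip(stages, expected):
        value = stage(value)
        if value != want:
            return stage.__name__, value, want
    return None

print(first_wrong_stage(STAGES, EXPECTED, "70 90 80"))
```

The visible symptom is a final average of 75.0 instead of 85.0, but the report points at `drop_lowest`, the first place where the actual value departs from the expected one. If the report and the code disagree with your intuition, the hand-computed expectation is the other thing worth re-checking.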
Change the code to remove the bug.
Sometimes this is a change to the code at the location found in step 3. But there are two common classes of bugs that require changes in other locations.
The first nonlocal case is the “missing component” bug, where the location identified in the previous step needs to do something that depends on data not available at that location. Fixing this involves figuring out what data is missing, finding when and where it was available, and designing and implementing a data path to get it to the place where it is needed. Data paths can be as simple as a single variable or as complicated as multiple new data structures and function parameters. In a few cases the changes might be so significant that it’s easier to start over with a better design.
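As a small illustration of a missing-component fix, consider the sketch below. The scenario and every name in it (`line_total`, `order_total`, `tax_percent`) are assumptions made up for this example: the per-line computation needed the tax rate, but only the caller knew it, so the fix is a new parameter that carries the value down.

```python
def line_total(price_cents, quantity, tax_percent):
    # tax_percent is the new data path: before the fix this function
    # had no way to learn the tax rate, so it could not apply it.
    return price_cents * quantity * (100 + tax_percent) // 100

def order_total(items, tax_percent):
    # The caller is where the tax rate was already available,
    # so it passes the value along to each line computation.
    return sum(line_total(price, qty, tax_percent) for price, qty in items)

# Two items at 10.00 and one at 4.00, taxed at 5 percent.
print(order_total([(1000, 2), (400, 1)], tax_percent=5))  # prints 2520
```

Here the data path is a single parameter. In larger programs the same fix may touch every function between the place the data is known and the place it is needed.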
The second nonlocal case is the “aliased meaning” bug, often caused by poor variable naming, where the same variable is used to store different content at different points in the program. This case has most of the characteristics of a missing component bug: you’ll need a new data path (often a new variable) so that both pieces of content can be carried at the same time. It additionally requires revisiting each use of the old overloaded variable and deciding which meaning was intended there.
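A minimal sketch of an aliased-meaning bug and its fix follows. Both functions are hypothetical examples written for this page: in the buggy version, `count` is asked to mean "lines seen" and "non-blank lines" at once, so one of the two meanings is always lost.

```python
def summarize_buggy(lines):
    count = 0
    for line in lines:
        count += 1              # meaning 1: lines seen so far
        if not line.strip():
            count -= 1          # meaning 2: non-blank lines only
    return f"{count} of {count} lines were non-blank"

def summarize(lines):
    # One variable per meaning; each old use of `count` was revisited
    # to decide which of the two it really intended.
    lines_seen = 0
    non_blank = 0
    for line in lines:
        lines_seen += 1
        if line.strip():
            non_blank += 1
    return f"{non_blank} of {lines_seen} lines were non-blank"

print(summarize_buggy(["a", "", "b"]))  # prints: 2 of 2 lines were non-blank
print(summarize(["a", "", "b"]))        # prints: 2 of 3 lines were non-blank
```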
When the change is more than a few trivial steps, take it slowly with multiple rounds of intermediate testing. If one part of a multi-part change doesn’t work, it can be hard to tell which part failed unless each was added and tested individually. Additionally, debugging generally involves modifying code that is no longer fresh in your mind, so it will take more thought and time to fix than it did to write the first time. That is another reason why small, careful steps are the best approach.
Decreasing the rate of bug introduction
Debugging is generally less enjoyable than other parts of software development. While it is not usually something you can fully avoid, there are practices that can significantly reduce the amount of debugging you need to do.
If you can’t do the task by hand the way the computer should, you can’t tell the computer how to do it.
If you don’t know what data you want to store or how it moves between different parts of the program, you can’t tell the computer how to store it and move it.
If you haven’t drawn a picture or made an outline or written some pseudocode, your job will be much harder, both in creating the code and in knowing where to look when debugging it.
what’s this part of your code do?
If you need to write a nontrivial bit of logic somewhere in your code,
Trying to build and test code in-place invites confusion as to whether it is what you wrote or how you used it that is causing errors.
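One way to avoid that confusion, sketched here under assumptions, is to write the logic as a standalone function and check it on its own before using it anywhere else. The function `is_leap_year` is only a stand-in for "a nontrivial bit of logic"; the checks use the Gregorian calendar rule.

```python
def is_leap_year(year):
    # Gregorian rule: divisible by 4, except century years,
    # which must also be divisible by 400.
    return year % 4 == 0 and (year % 100 != 0 or year % 400 == 0)

# Checks run on the function alone, before it is wired into a larger program.
assert is_leap_year(2024)
assert not is_leap_year(2023)
assert not is_leap_year(1900)
assert is_leap_year(2000)
```

If these checks pass and the larger program still misbehaves, the problem is more likely in how the function is being used than in the function itself.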
Debugging large code is frustrating. To avoid ever debugging large code, build it in small steps:
The larger your task is, the more important this process for writing it becomes.
One of the most time-honored, well-tested, and revered principles of software development was labeled by Fred Brooks as “plan to throw one away.”
Write your code, then delete it and start over. Your new code will be faster to write, better designed, and pass more tests than the first. It will be faster and easier to re-write it without bugs than to find and fix the bugs in the first version.
This may seem counter-intuitive. How could starting over be easier? Because you learned by trial and error while writing the first version, but on your second version you already know what you should do. No trial and error = no error = better code.
This only works if you continue on the first version until you fully understand the task. If you get stuck with “I don’t know how to begin to add feature X” then starting over is unlikely to help.