Prove the problem and prove you're making a difference.
In reverse order...
Prove you're making a difference
Quite often you'll be hitting something for hours only to find that the thing you're hitting isn't even connected to the thing that's broken. You've been editing a function which isn't called, or changing a stylesheet for the wrong part of the app. If the hour is particularly late, you'll find yourself editing entirely the wrong version of the site before you realise.
In these cases, and in all cases where a trial doesn't produce what you expect, start by checking the plug. Force it to break. Stick something in there which would absolutely fail fatally if it were run. If it doesn't fail, it hasn't been run and you have just proved you don't have the control you think you do.
The next step is some assumption reversal. You assumed the function was being run; it isn't. Take other assumptions and reverse them until you find a reversal which is true.
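One assumption worth reversing early is "the code I'm editing is the code that's loaded". A minimal sketch in Python, assuming the standard inspect module is available; where_defined is a made-up helper name and json.dumps merely stands in for whatever function you think you are changing:

```python
import inspect
import json


def where_defined(obj):
    """Return the source file and module name that obj was really loaded from."""
    path = inspect.getsourcefile(obj)       # None if no source file can be found
    module = inspect.getmodule(obj)
    return path, module.__name__ if module else None


# Assumption under test: "the function I am editing is the one the program uses".
# json.dumps is only a stand-in for the function you believe you are changing.
path, module_name = where_defined(json.dumps)
print(f"json.dumps comes from module {module_name!r}, file {path}")
```

If the path printed is not the file open in your editor, that assumption has just been reversed for you.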
How you do this depends on the language and environment, but the rule remains: just do something which is bound to fail. Such as:
noSuchFunction();
Or
die('Hello!');
Or
select invalid_field from no_such_table where syntax errors all over the place;
You get the idea. Do something which is going to cause a fatal error and if it doesn't, things aren't as you expect.
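The same trick as a rough Python sketch. Everything here is hypothetical: Tripwire, render_price and reaches_tripwire are illustrative names, and the wrapper only catches the error so the result can be shown. In real use you would simply let it blow up.

```python
class Tripwire(Exception):
    """Raised deliberately to prove that a code path is actually executed."""


def render_price(amount):
    # Temporary: delete this line once you have seen it fire.
    raise Tripwire("render_price was called")
    return f"{amount:.2f}"  # the code you were editing, unreachable for now


def reaches_tripwire(entry_point, *args):
    """Run entry_point and report whether the tripwire went off."""
    try:
        entry_point(*args)
    except Tripwire:
        return True
    return False


print(reaches_tripwire(render_price, 3))  # this path runs the edited function
print(reaches_tripwire(str, 3))           # this path never touches it
```

If the entry point you care about reports that the tripwire was not reached, your edits to render_price could never have made a difference.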
Stimulate the failure - i.e. repeat the bug. Record everything.
The rule is: find a reliable, repeatable way of reproducing the error.
Information gathering: verifying that the issue does exist.
The reasons for finding a reliable way of reproducing the error are:
- So you can look at it
- So you can focus on the cause
- So you can tell if you've fixed it
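Those three reasons can be sketched as a small reproduction harness. This is only an illustration: reproduce, failure_rate and the parse_quantity bug are invented for the example, not taken from any real project.

```python
def reproduce(action, attempts=10):
    """Run action repeatedly and record every outcome, pass or fail."""
    log = []
    for attempt in range(1, attempts + 1):
        try:
            result = action()
            log.append((attempt, "ok", repr(result)))
        except Exception as exc:
            log.append((attempt, "failed", repr(exc)))
    return log


def failure_rate(log):
    """Fraction of recorded attempts that failed."""
    failed = sum(1 for _, status, _ in log if status == "failed")
    return failed / len(log)


def parse_quantity():
    # Hypothetical bug: a thousands separator in the input is not handled.
    return int("12,000")


log = reproduce(parse_quantity, attempts=5)
for entry in log:
    print(entry)
print("failure rate:", failure_rate(log))
```

A failure rate stuck at 1.0 means you have something you can look at. Once it drops to 0.0 after a change, and goes back up when you revert that change, you can tell you've fixed it.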
Prove the problem
This is basic: replicate the problem yourself. This should be your first task whenever some new hellish task comes in. Work out how to create the problem for yourself.
The author, Agans, discourages simulating the failure. However, it is sometimes worth forcing the symptoms of the bug to occur in unnatural circumstances.
For instance, under extremely high network loads a program might slow down, but creating sufficiently high network loads might not be practicable (it might stop the debug logs running). In order to debug the program you would cause the program to think that there were high network loads. ((Expand this example)).
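One way to make a program think the network is busy is to swap out whatever it uses to measure load. A minimal sketch, assuming the program reads load through a single replaceable function; measured_network_load, choose_batch_size and the 0.9 threshold are all invented for illustration:

```python
def measured_network_load():
    """Stand-in for a real probe; assumed to return utilisation from 0.0 to 1.0."""
    return 0.10


def choose_batch_size(load_probe=measured_network_load):
    """Hypothetical code under test: send smaller batches when the network is busy."""
    load = load_probe()
    if load > 0.9:
        return 10
    return 1000


def pretend_network_is_saturated():
    """Fake probe used only while debugging the high-load code path."""
    return 0.99


print(choose_batch_size())                              # normal conditions
print(choose_batch_size(pretend_network_is_saturated))  # forced high-load path
```

The high-load branch now runs on a quiet network with the debug logs still working. The catch, and presumably the reason for Agans's caution, is that a faked cause only exercises the symptoms you already suspect.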
"But that can't happen" equates to part of my axiom of debugging: it's broken, I don't understand it and I haven't got time to understand it all...