Wednesday, April 8, 2009

Five Whys and Four Fingers Pointing Back At Me...

A bug report at work traveled a somewhat strange path as we tried to deduce the root cause of the problem. That strange path reminded me that very frequently, when I follow the "Ask Five Whys" heuristic, I discover that there are things I can change that will improve the situation.

In this particular case, a bug was found, fixed, and then reopened because it was apparently not fixed. There were then several e-mail exchanges between various people as they tried to deduce why the bug was still not fixed. The submitter was confident that the bug was still in the software, so it could not have been fixed. The fixer was confident the change had been submitted, so it must be a problem somewhere else. Others in the conversation wondered if there were additional complications which had not been considered. All of those ideas (and more) could have been correct.

In this specific case, a series of simple gaps was enough to mislead us all.

  1. A translation mistake was discovered in late March
  2. The bug report was assigned to the wrong person, but e-mail exchanges alerted the translation team that the bug existed and needed to be fixed
  3. In early April the corrected translation was added to the source master
  4. Just before the corrected translation was added, a new build was generated as part of our once-a-week schedule of builds
  5. The submitter tested the fix with the build just prior to the fix
  6. The e-mail discussion was then started trying to understand why the bug was not fixed

When I started asking "Five Whys", I thought it was obvious where the problem originated, and even how to fix it. The bug had been sent to the wrong person, and then when the bug was fixed the bug report was not updated to show which build included the fix.

However, as I stared at the problem further, I realized there was a more significant problem than I had seen initially, and that more significant problem has caused other issues as well.

Why did the fixer need to waste time guessing which build would include the fix? Couldn't a system tell the submitter when their bug fix was in a build? For example, most bug fixes will reference the bug number in their submit message. Why not pass that information automatically to the submitter, or to the bug report, so the fixer does not have to think about the number or name of the next build?
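Here is a rough sketch of what I mean, in Python. It is only an illustration under assumptions of my own: that submit messages mention bugs as "bug 1234" or "#1234", and that something hands the build a list of the submit messages it contains. The function names and the build name in the usage note are made up, not taken from any tool we use.

```python
import re

# Assumed convention (hypothetical): submit messages reference bugs
# as "bug 1234", "bug #1234", or "#1234".
BUG_REF = re.compile(r"(?:\bbug\s*#?|#)(\d+)", re.IGNORECASE)


def bugs_in_message(message):
    """Return the set of bug numbers referenced in one submit message."""
    return {int(number) for number in BUG_REF.findall(message)}


def bugs_fixed_in_build(build_name, submit_messages):
    """Map every bug number referenced by the build's submits to the build name.

    The result could be used to update each bug report, or to notify each
    submitter, with the name of the build that contains the fix.
    """
    fixed = {}
    for message in submit_messages:
        for bug in bugs_in_message(message):
            fixed[bug] = build_name
    return fixed
```

Given the submit messages that went into a weekly build, `bugs_fixed_in_build("weekly-build-15", messages)` would yield a dictionary from bug number to build name, so nobody has to guess which build to test.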

That would have helped, but it appeared that a bigger problem was that the tester did not have easy access to the list of changes which had been made in the build being tested. That list of changes was difficult to find, difficult to read (I don't find CruiseControl output especially friendly), and probably not known to the submitter at all.
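A friendlier change list does not need to be elaborate. The sketch below, again in Python, assumes the build system can supply each change as an (author, submit message) pair; that input shape and the function name are my own assumptions, not what CruiseControl provides.

```python
def format_change_list(build_name, changes):
    """Render a short plain-text summary of the changes in one build.

    `changes` is assumed to be a list of (author, submit_message) pairs.
    Only the first line of each submit message is shown.
    """
    lines = [f"Changes in {build_name}:"]
    if not changes:
        lines.append("  (no changes since the previous build)")
    for author, message in changes:
        stripped = message.strip()
        summary = stripped.splitlines()[0] if stripped else "(no message)"
        lines.append(f"  - {summary} ({author})")
    return "\n".join(lines)
```

A tester could read a summary like this next to the build itself and see at a glance whether the fix they are about to verify is actually in it.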

Make Available Information Reachable - Reduce Guessing

When information is not readily available to our very smart people, they will apply skill and judgment and make the best assumptions they can with the information they have. Making that information more readily available will allow them to do their jobs better and reduce wasted work.

The root problem seemed to be someone else's issue, until my thought processes came back to highlight that it was really my problem. I'm the manager, and ultimately it is my fault. Sometimes it becomes obvious more quickly; other times it takes a little more time...

I admit as well that making information easily reachable by those who need it, when they need it, is only part of the answer. That would have helped the tester, but did not help the submitter send the bug to the right person, nor did it help the fixer insert the right data in the bug report. There are so many "why" questions to ask, and so many ways to make small improvements that might help a little.
