Thursday, October 29, 2009

Dangerous Interruptions

Bob Sutton's blog post on reducing medical interruptions reminds me of Sunday mornings when I take my mother-in-law from her nursing home to church. I frequently interrupt or disrupt the nursing staff with my out-of-sequence, unpredictable arrival, and with my desire to get "Nana" to church on time.

Nana's trip to church starts from her room at the nursing home. I arrive at her room between 10:20 AM and 10:40 AM (depending on how late I leave home). Sunday is the only day we do this, so I tend to disrupt all sorts of people at the nursing home, including the nursing staff.

My mother-in-law is a diabetic, and church runs over the noon hour when she would normally receive her medications. The nurse would normally check her blood sugar right before lunch, then, based on the result of that test, select the proper dose of insulin, draw that dose, and administer it to Nana. Each of those steps has a potential for error, and each needs careful thought and attention to detail by the nurse.

Because I arrive as much as 90 minutes prior to lunch, and Nana will be gone for the three hours of church, the nurse is required to interrupt her current medication process, test Nana, medicate Nana, and then return to her previous task. The nurses are always very kind about handling the interruption, and they provide great care. I worry that my interruption may cause them to make unnecessary mistakes...

The study which Bob Sutton references was performed in the UCSF hospital system in San Francisco and is described in a San Francisco Chronicle article. The study was an attempt to reduce the frequency of medication errors at hospitals. The hospitals used both low-tech and high-tech solutions to reduce medication errors by nurses.

The low-tech solutions described in the article focused on reducing nursing interruptions when administering medication. The article describes "do not interrupt" sashes and vests, closing blinds to prevent distractions, and other relatively simple techniques to reduce interruptions during the crucial activity of administering medication. The article noted that the nursing teams were encouraged to develop their own solutions within their own working environment (own your process). It appears from the study that nurses administer medications (a detailed technical task) less accurately when they are interrupted than when they are undisturbed. It also appears that nurses allowed to explore improvement techniques tend to improve.

The study may not directly apply to my software development team, but I think there are several lessons I should take from the study. They are lessons others have noted, but the article serves as a good reminder.

  • Interrupting technical work (pair programming, software design, software testing, etc.) increases the chances for error. I need to interrupt my people less.
  • Allowing and encouraging people to improve their own processes and their own ways of working is likely to generate improvement. I need to find ways to acknowledge my mistakes openly, learn from those mistakes, and encourage others to do the same. A software bug is a late manifestation of a mistake. Mistakes will happen in human endeavors, and we want to learn from them, not hide them until later.
  • Fear of failure tends to hide failures, particularly in organizations with a culture of fear. Sutton's posting notes that hospital units which acknowledge and seek to reduce their drug administration errors tend to report 10x more drug administration errors than units with a more punitive attitude towards errors. In the punitive units the failures still occur, but they are discovered later, likely with more damage done or higher costs incurred. It would be a gross mistake to declare the nursing unit which reports 10x the drug administration errors a failed unit without further investigation. If the clinical results of the unit are better (fewer deaths, fewer injuries, lower costs, etc.), then the larger number is actually highlighting their good practice of learning, rather than the bad practice of medication errors. Don't worry about bug counts; worry about what bugs can tell us about how to be better.
  • "Best practices" at one location were not necessarily "best" at another location, although sharing practice-based experiences seems like a good way to learn from mistakes and thus make fewer mistakes in the future.

Friday, October 23, 2009

Learning and Responding

A mistake was made today. Code was merged from one branch to another, and the merge broke the destination branch for its intended use. The break was detected late in the working day of the team that caused it, after most of them had already left. The break highlighted all sorts of weaknesses in how I was handling things, including:

  • Why didn't I make it clear to everyone both the purpose and the target configuration of each branch? Poor communication. I had not made clear what the purpose and expected configuration were for each branch, so the team assumed that seeing the branch build in one configuration was sufficient.
  • Why didn't the person who performed the merge detect the broken build on the continuous integration server? Unclear information sources. We had configured three different continuous integration servers because we needed three different configurations. Unfortunately, I then "muddied the waters" by having one of the branches made compatible with all three configurations, and actively visible on all three servers. When the developer performed the merge, they saw that it was "green" on the screen they were watching, and thought they were done. It had gone "red" on the other two servers, and those two were the most important to my team.
  • Why wasn't the team which performed the harmful merge able to repair their damage? Unavailable spare configuration. They had no machine available to them which matched the problem configurations and could be used for diagnosis and development. Their machines were all configured for their own needs, and the break was in an area needed by other teams.
  • Why did it take half a day to recover from the damage? Inexperience with our tools. We recently switched from Perforce to Subversion and then to Git, and the transition has left us less skilled in dealing with the complexity of this type of failure.
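
The second weakness above suggests a simple rule: a merge is not done until every required configuration is green, not just the one on the screen being watched. What follows is a minimal sketch of that rule, not the tooling we actually used. The server names (`config-a`, `config-b`, `config-c`) and the "green"/"red" status strings are assumptions made for illustration, and the statuses are passed in as a plain dictionary rather than fetched from any real continuous integration server.

```python
from typing import Dict, List

# Hypothetical server names: the post only says there were three
# continuous integration servers, one per configuration.
REQUIRED_SERVERS = ["config-a", "config-b", "config-c"]


def failing_servers(statuses: Dict[str, str], required: List[str]) -> List[str]:
    """Return the required servers that are not known to be green.

    A server with no reported status counts as failing, so a missing
    result can never be mistaken for a passing one.
    """
    return [name for name in required if statuses.get(name) != "green"]


def merge_is_done(statuses: Dict[str, str], required: List[str]) -> bool:
    """A merge is done only when every required configuration is green."""
    return not failing_servers(statuses, required)


if __name__ == "__main__":
    # The situation from the post: green on the watched screen, red elsewhere.
    observed = {"config-a": "green", "config-b": "red", "config-c": "red"}
    print(merge_is_done(observed, REQUIRED_SERVERS))
    print(failing_servers(observed, REQUIRED_SERVERS))
```

Treating an unreported server as failing is a deliberate choice here: the original problem was a developer who saw one green result and never saw the other two.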

All told, the damage cost my team less than a day to recover. Because we're using a distributed version control system, they were able to continue their work locally, although they were not able to push to the central repository.

Moral of the story: Communicate clearly, listen carefully, and be willing to change as better ideas arrive.

Saturday, October 10, 2009

Ask More Questions - Get More Answers

I had been tolerating a nagging problem with my web browser on multiple machines and now have a solution, because I had the presence of mind to finally ask a question.

Firefox is my preferred web browser because it includes the Gmarks plugin. The Gmarks plugin brings my Google bookmarks into the Firefox menu. That makes portable bookmark management easier (all my bookmarks are stored at Google, visible from any web browser on the internet).

I also prefer the Foxit PDF reader to the Adobe reader. It feels faster and cleaner, and seems less likely to be attacked by malware (smaller installed base, newer code base).

Unfortunately, Firefox would report "OCX failed to load" when I tried to open a PDF file. I had found all sorts of strange alternatives for opening PDF files in Firefox. For example, sometimes I would download the PDF file, then open it in Foxit from the local machine. Other times I would copy and paste the URL from Firefox to Internet Explorer, then use Internet Explorer to open the PDF file.

All those strange alternatives (workarounds, fixes, etc.) have now stopped. I was weary of the alternatives, so I used Google to search for the error message. In classic Google search fashion, the first page had a perfect match for my needs: a post which described my problem and an easy solution to it.

The moral: Ask questions sooner, and don't be afraid of questions or their answers. I'll need to think more about why I didn't ask the question sooner...