Friday, October 1, 2010

Team Centered Bug Finding

We've agreed to deliver a "release ready" version of our internal use components every three weeks. That is a nice step from the "deliver once a year" model we've used in the past, and we're trying to make the transition to the new release cycle successfully.

As part of that transition, we've returned to a practice we had before. We hold a test day once a week for the entire development team. "Test Friday" is dedicated to testing.

Positives
  • Test day happened! It feels good to dedicate one day per week to whole team testing
  • Color printer sitting in the pairing area and pens for handwritten notes on the sheets was a great way to rapidly record bugs as they were detected, without too badly interrupting the flow of testing for entering bugs into a bug tracking database. It made a difference that the printer was within a few steps of the developer desks
  • Bug triage at the end of the testing day with the papers on tables in a large conference room seemed to work reasonably well
  • Bug pages with pictures were much quicker to process than bug pages which only contained words
  • Chance to use products "end to end" increases our exposure to customer experiences, helps us understand how the products feel, and ultimately will help us make better products
  • Native language speakers make localization testing much more effective (German and Russian native speakers in the team are a great help)
  • Having a "reference machine" with a previous version of the products was a good idea
Negatives
  • System setup too time consuming - some of us spent most of the test time configuring systems to meet the base requirements before we could install and test our own code
  • Builds had to be taken from a "temporary branch" because one of the other teams in the company had provided code which was not ready to install. Building on that temporary branch meant using a temporary machine, and the temporary machine was not configured the same as the standard build machines, so it did not sign the executables
  • Inexperienced with bug detection techniques to rapidly detect if there is a problem in "known hot spots". Log file analyzers and other bug detection oracles need to be added to the code and used in our investigations
  • The "reference machine" did not have enough configuration to use as a reference in the most important areas
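
The log file analyzers mentioned in the negatives could start very small. Here is a minimal sketch in Python, assuming plain-text logs where problems show up as lines containing words like ERROR or Traceback. The pattern list and the function name find_suspect_lines are hypothetical illustrations, not something our team has built.

```python
import re

# Hypothetical patterns; a real list would come from the team's known hot spots.
SUSPECT_PATTERNS = [r"\bERROR\b", r"\bFATAL\b", r"Traceback", r"[Ee]xception"]


def find_suspect_lines(log_text, patterns=SUSPECT_PATTERNS):
    """Return (line_number, line) pairs for lines matching any pattern."""
    combined = re.compile("|".join(patterns))
    hits = []
    for number, line in enumerate(log_text.splitlines(), start=1):
        if combined.search(line):
            hits.append((number, line.strip()))
    return hits


if __name__ == "__main__":
    # Made-up sample log text, only to show the call.
    sample = "INFO install started\nERROR could not sign executable\nINFO done\n"
    for number, line in find_suspect_lines(sample):
        print(f"{number}: {line}")
```

Even a crude scan like this gives a tester a quick yes or no on whether a log deserves a closer look, which is the kind of oracle we were missing on test day.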

Saturday, August 28, 2010

64 bit Debian Testing on a low power desktop machine

I've been experimenting with a cheap, low power, quiet computer running the 64 bit Debian testing distribution ("Squeeze"). The experience has included delights, several positives and some negatives.

Delights:

Positives:

  • No virus scanner performance penalty for my disc accesses
  • Frequent software updates keep the system running smoothly
  • Interesting new things to learn in an unfamiliar environment
  • Ample memory (4 GB), a dual core processor and hyperthreading keep the computer feeling responsive even when I run several different demanding tasks

Negatives:

  • My experimental code was not ready for 64 bit, and is still not ready for 64 bit, so it has taken some work to get the experimental code running on this new environment
  • Wireless networking has not been successful on this low end machine. Part of that is my own fault because I can only invest small amounts of time in the setup

My most sincere thanks to the Debian project team, the Google development team, the Hudson development team, and so many others. They've provided a free software stack that looks and feels smooth, interesting, and fun.

Monday, July 26, 2010

Prepared or Intentionally Unprepared for a Meeting?

I dislike disorganized, poorly structured, or meandering meetings. Poor meetings treat the participants disrespectfully by wasting time and energy.

Yet, today I consciously chose to enter a meeting unprepared. I made that decision because the preparation for this specific meeting (a technical estimation meeting between a very few well aligned, frequently interacting, highly skilled people) would have used their time even less efficiently than performing that preparation during the meeting itself.

More typically I've felt the other way, that there were plenty of more effective ways I could have prepared for those meetings than spending the time in the meeting doing the preparation. This was a special (and rare) case.

I suspect there are "investment" heuristics which can be applied to the subject, the participants, the duration, my role, and the relative business value of a meeting. I envision the heuristic would provide a suggestion of how much time I should spend on the meeting based on those attributes. Unfortunately, google search did not show me that heuristic, or even a list of attributes of meetings which would help me decide how much to invest in preparation.

Wednesday, May 5, 2010

Windows Home Server - Shock & Surprise & a happy ending

My home computers connect to an HP MediaSmart Server LX-195. That little server performs nightly backup, uploads files to Amazon S3 for backup, and has generally been a delightful little box (after overcoming the initial installation hurdles).

A week or so ago, the client computers were displaying a terrifying message that there were conflicts on the server, and backup was offline. When I examined the console, it showed a new disc added and listed the same exact disc as "removed" and "unhealthy". The disc it said had a problem was a new 2 TB Western Digital USB drive.

I was confident the drive was not dead, but something had caused Windows Home Server to decide the drive had died and then come back to life as something different.

I rebooted the server. No change in behavior. I experimented with other small (non-damaging) tests. No change in behavior. I left it alone and assumed I would need another hour or two on the phone to resolve the problem.

One day I had the ingenious idea to shut down both the server and the USB hard disc. I then switched on the USB hard disc and after it had some time to fully awaken, I switched on the Windows Home Server machine. SUCCESS!

I have no idea why the system decided the hard drive was "healthy" and "ready to add" in one list, and later in the same panel should be "removed". If I'd followed those steps (add and remove), I suspect I would have lost all the data on that hard disc.

It is hard work creating user interfaces which don't risk user data, and don't perplex the user when exceptional conditions occur.

Saturday, April 17, 2010

Delighted by Seamless Operation

I needed a particular software utility (a hard disk partition archiving program). I was in the middle of some other work on my machine running the Debian "testing" distribution. I didn't want to interrupt my other work, but I had a few minutes to download and burn the needed software to a bootable CD.

I hadn't used Debian to burn a CD before, so I worried it was going to demand I use various command line arguments with the usual reading of manuals. I was wrong.

I found the ISO image (thanks google), downloaded it with the web browser, clicked "open" from the web browser, and it offered to open it in "brasero". I inserted a CD-R in the CD drive, and brasero happily burned that CD with no complaints.

Thanks to the open source community! That was a great experience, simple, elegant, and did what I expected.

Saturday, March 13, 2010

Conversational Productivity Improvement

While sitting with a colleague at work, I was complaining that my favorite bookmark management tool (GMarks) was no longer working reliably with Firefox 3.6. He had an improvement suggestion, that I might try the same bookmark sync method he is using. I've learned that when my colleagues have found a better way to do things, I will usually benefit by trying the same technique. This was one of those cases.

A Bit of History

I started managing and synchronizing my bookmarks with the Google toolbar a while ago, then migrated to using a Google notebook during the short life of that tool, and from there had moved to using GMarks because it could read and write the bookmark definitions I had on Google notebook. My bookmarks have had a long and varied life as they traveled from place to place...

He suggested that I might try "Xmarks", since that was the bookmark sync solution he'd been using and preferred for a while. I'm a "slow study", but decided I'd try it as an experiment.

XMarks installed easily into Firefox and accepted my manually entered bookmark just fine. I didn't see any obvious options to sync my GMarks bookmarks, but was willing to experiment without that feature initially. It seemed to work smoothly and well.

Chrome and Internet Explorer


I prefer to switch between browsers frequently in hopes that will lead me to discover bugs in our software more quickly. I move between IE, Firefox, and Chrome frequently. I installed the XMarks add-on to Chrome and it offered to sync my Chrome bookmarks to XMarks. That was a great offer, since Chrome had pulled all my bookmarks from the Google bookmark location where I had stored them. I now had all my bookmarks in XMarks, without entering them myself. Thanks Chrome!

The Internet Explorer XMarks add-on was just as simple to install, and just as well behaved. Now my bookmarks are the same on any browser I use, or at least any browser I use where I can control the browser add-ons installed.

HP MediaSmart LX 195 - Victory At Last

Two months ago I purchased a new HP MediaSmart Server LX195. It is a low power computer and disc drive in a small box, with Windows Home Server, an operating system intended to help with some of the common home computing chores, like:

  • Automatic backup (both onsite and offsite)
  • Photo upload
  • Media storage
  • Media streaming

The initial purchase price was slightly over $200 for a unit with a 640 GB hard disc. That was close enough to the purchase price for a comparable sized hard disc that I was willing to experiment with the system just for the value of experimentation (and hope for easier backup).

Miserable Initial Experience


The initial installation experience was poor. The computer has no display adapter so it must boot, connect to the home network, and be reachable from within that home network without any interaction with the user. The total user interaction is through the power switch (on or off) and three front panel LEDs.

I was fascinated by the challenges hiding in making Microsoft Windows boot and configure itself on a network without user interaction. There are so many ways to configure a network, so many conditions which could fail, so many potential problems, and a general assumption among the Windows development teams that a Windows server always has a display.

The first time I switched on the power, the lights went red (disconcerting), then eventually glowed a yellow color. That was not a good sign. I pressed the power button to stop the machine and nothing seemed to happen. I pressed and held the power button for 5 seconds and the power went off (learning to control the box...)

The second time I switched on the power was similar to the first. I was short of time, so I left the box on for a period and did other things. It stayed in the same condition. I didn't have time to fight with the box, so I sent a message to the seller asking for instructions to return the device.

The seller was very considerate and offered to accept the return (impressive by itself, since this appeared to be a small shop selling through amazon.com). The "I'll accept the return" message also included a suggestion that I contact HP.

A week or so later, I had time again, so I went searching for the LX195 Windows Home Server support team on the HP web site. After some digging, I found a phone number (800-474-6836). I had initially been unwilling to spend any technical support time with the device, but decided I'd "try the experiment".

Considerate Technical Support


I assumed the technical support would be provided by someone lacking native language skills, and the support call would be a deep struggle. I was wrong. The technical support call was handled by a well spoken, considerate individual with clear communication skills, a cheery attitude, and no detectable accent in their speech. I appreciated that!

I didn't have much time that day, and we quickly exhausted that time before the problem was solved. However, the experience was so positive (compared to my expectations), that I decided I might try again another day.

Home Networking Complications


The second technical support call was just as pleasant an experience as the first call, and was much more successful at solving the root problem. The technical support person asked some questions about my home network configuration, and I offered some things I had observed. With the "back and forth" of that conversation, I think I persuaded him that I knew enough about networking to be interesting, and he persuaded me that he knew enough about the typical behaviors of the box that I should trust him.

The tech support person had me insert the IP address of the LX195 into the hosts file of the Windows computer I was using as the client and display for the little server. Apparently, Windows Server 2003 and Windows client operating systems did not like my attempt to use purely the IP address to configure the computer. I don't have a DNS server at home, I don't have a WINS server at home, and at the time I was not running my own DHCP server either, choosing to use the DHCP server provided by my Linksys wireless router.

Inserting the IP address was the "magic key". The computer was now reachable, and ready for configuration.
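
For readers who have not edited a hosts file: each entry is one line with an IP address, some whitespace, and one or more names, and on Windows the file normally lives at C:\Windows\System32\drivers\etc\hosts. The sketch below, in Python, shows how such a file maps a name to an address. The address 192.168.1.50 and the name "mediasmart" are hypothetical placeholders, not the values from my network, and parse_hosts is just an illustrative helper.

```python
def parse_hosts(text):
    """Map each host name in hosts-file text to its IP address."""
    mapping = {}
    for raw_line in text.splitlines():
        line = raw_line.split("#", 1)[0].strip()  # drop comments
        if not line:
            continue
        address, *names = line.split()
        for name in names:
            # Keep the first address seen for a name.
            mapping.setdefault(name.lower(), address)
    return mapping


if __name__ == "__main__":
    # Hypothetical address and name; use the values from your own network.
    sample = "127.0.0.1 localhost\n192.168.1.50 mediasmart  # home server\n"
    print(parse_hosts(sample).get("mediasmart"))
```

With an entry like that in place, the client can find the server by name without any DNS or WINS server on the network.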

Success At Last


Once that initial installation hurdle was overcome, the rest of the experience was great. I'm a software tester type, and I expect software to disappoint me. That's the nature of software testing. In this case, rather than disappoint me, the configuration experience delighted me. It has been a long time since I felt delight like that in working with software. Thanks Microsoft!

I configured the server to backup the computer I used for initial installation. After installing a little client software and entering a password, the backup hid itself and ran very nicely. I moved to the other 3 computers in the house and did the same thing, with the same easy experience. That was already impressive, since those computers have operating systems of widely varying ages. Two older Windows XP machines, a Windows Vista x64, and a Windows 7 x64 machine were all easily added to the backup process, and safely copied their files to the server.

The server also has places for shared folders. One of my concerns has been losing our digital photo collection, so I copied files from the client computers to the digital photos folder on the server. That was easy, and well behaved. There was also a setting to allow the server to search for photos on its own (thanks!).

The same for music, although the music is a more replaceable item, since we should have the CDs for all that music anyway.

Offsite Backup


A week or two after that positive experience, I remembered that I had seen mention of being able to use the MediaSmart to send backup data "offsite" for safety. I found the configuration panel easily, and answered a few prompts for the offsite backup process. It configured an Amazon simple storage service instance for me and started pushing my data to that offsite backup location.

Several days later, I checked the status and it was still pushing my data to Amazon.

I was curious why it was taking so long, so I checked the hard disc capacity of the server. I had already consumed about 500 GB of the 640 GB disc. Most of that was backup. Apparently my computers contained more data than I realized.

The backup process eventually completed successfully, and has been running happily since then.

Searching for a Lost File


In the midst of all this fun with the server, my son lost a homework assignment. He had forgotten to save it from within Word, and was desperately trying to find any remnants of the assignment. I opened the restore wizard and was able to comfortably navigate around the files in the backup device. Unfortunately, the file was not there, but my confidence was increased that the backup might actually be usable, not just a "write only" device like most backups I've had in the past.

Adding Storage


Since the home computers were needing more backup space than I expected, I could see that I would exhaust the disc space on the server in a week or two. That was a nice opportunity to expand the disc storage. I purchased a 2 TB USB disc drive and connected it to one of the 4 USB ports on the back of the server. The server recognized the drive, and offered me choices in how I'd like to use the drive. I used it to add additional file system space to the server, and then configured the existing folders on the server to "mirror" themselves.

The "mirror" concept seems to be a "poor man's RAID", an operating system provided option that will periodically copy the contents of selected folders from their original drive to another drive on the system.

That worked well also.
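
To make the "mirror" idea concrete, here is a minimal sketch in Python of copying a folder from one drive to another. It illustrates the concept only and is not how Windows Home Server implements it. The folder names are hypothetical, and a real mirror would run on a schedule and would also need to handle files deleted from the original, which this sketch does not.

```python
import shutil
import tempfile
from pathlib import Path


def mirror_folder(source, destination):
    """Copy everything under source into destination, overwriting older copies."""
    # dirs_exist_ok requires Python 3.8 or later.
    shutil.copytree(source, destination, dirs_exist_ok=True)


if __name__ == "__main__":
    # Temporary directories stand in for the two drives.
    with tempfile.TemporaryDirectory() as root:
        original = Path(root) / "drive1" / "Photos"
        copy = Path(root) / "drive2" / "Photos"
        original.mkdir(parents=True)
        (original / "beach.jpg").write_text("pretend photo data")
        mirror_folder(original, copy)
        print((copy / "beach.jpg").exists())
```

Running mirror_folder again after adding files to the original picks up the new files, which is the "periodic copy" behavior described above.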

Changing Passwords


One of the low points in my life is the requirement from work that I must change my password every three months. Changing my password has been a hassle on a number of computers, and I worried this would be a similar experience. I was wrong. It was an easy process. I changed my password on one of the client computers, and the Windows Home Server icon prompted a short time later that it is best if the client and server passwords match. It let me make the change, with options to force the client and server passwords to be the same (nice touch!)

Impressed


I'm thoroughly impressed with the device. The initial install experience was "rocky", but the experience with Windows Home Server has been delightful ever since that initial rough spot.

As a caveat to software creators, a poor initial install experience carries a high risk that you will lose the customer before they've even seen your product's features.

Monday, March 8, 2010

Stone Age Productivity Improvement

Sometimes we spend more effort on presentation than on content, even when the real value is in the content, not in the presentation.

Some examples:

  • Why spend time creating a PowerPoint slide deck for an informal discussion? Why not scribble notes on a page, scan the page to PDF, then annotate the hand-written PDF notes during the meeting?
  • Why spend time transcribing a white board to a computer when a digital photo of the white board captures the arcs, arrows, angles, and phrases on the board, without wasting the effort to form those into a digital document?
  • Why enter requirements into a computer system when the team is local? Use scraps of paper (3x5 cards work for me) and move them around on a table to prioritize and understand them
  • Why not perform bug triage by rapidly moving pieces of paper around on a table instead of forcing everyone in the room to focus their attention on a screen where one and only one person can make progress?
  • Why not have a face to face discussion or a phone call instead of composing an e-mail message?
  • Why not talk about an issue instead of spending time to document it?

I listened to a programmer this morning noting that he was unable to find documentation on something. The thought that kept rolling around in my head while he said that was, "what you need documented does not exist, and the waste of creating documentation for all the things in order to not be missing that one thing was just not worth it".

Saturday, February 13, 2010

Expressing My Priorities

We've had a set of tests failing on our continuous integration server for a week or more. That's a bad thing, since we tend to ignore continuous integration jobs which are not clean almost all the time. Continuous integration failures need to be infrequent and interspersed with extended periods of "clean", or at least need to be viewed as a trustworthy development information source. Otherwise, the team ignores the continuous integration server and it becomes another wasted and ignored feedback system.

The same tests work fine on the developer machine where we've run them (in a slightly different configuration). The "builder" attempts to diagnose the root cause of the failures have been unsuccessful.

I've scheduled a Monday brainstorming session with a larger group of people. We'll discuss what we know about the failures, what we know about matching successes, and then decide on a course of action.

A colleague asked why I was the one convening the meeting, instead of someone in the affected team. His logic was that several different people on the development team should have been intensely interested in the root cause of the failure, and have "run it to ground".

I think there are several reasons why individual developers are not chasing this problem:

  • The problem crosses boundaries (only visible from an installed version, only visible when other components installed, only visible from the continuous integration machine, etc.)
  • There is no clear correlation between the first failure and a commit from a developer
  • Installation related problems are more difficult and painful to diagnose. Setup and teardown time for the test is significantly more than the setup and teardown time for most of our other automated tests
  • Diagnosing this failure will reduce the energy available to work on other things, like new backlog items and new tests

I'm expressing my priorities by scheduling the meeting and bringing a group of people together to work on the problem. Since I'm a manager, my priorities have a little more weight, and I'm "throwing that weight around" a little for this case.

Eventually I'm confident others in the organization will recognize my focus on keeping the continuous integration servers "clean". Until then, I'll continue working with people to persuade them.

Sunday, January 24, 2010

Microsoft Mesh - Synchronizing Files Between Computers

I work with many different computers between home and work and have found that Microsoft Live Mesh has been a nice addition to my computers. Mesh allows me to specify one or more directories which should be synchronized between all the various computers I'm using.

Thus far, I've used it to allow me to write performance reviews on any of my typical computers, knowing that the written results will be synchronized with my other computers so I can continue writing elsewhere.

I'm now attempting to use it to synchronize my "3x5" cards electronically. I usually carry a stack of 3x5 paper cards in my pocket for idea and task capture. When those cards need to live a little longer than they would normally sit in my pocket, I'm writing them to a file (in emacs "org mode") which lets me shuffle, stack and move them almost as easily as I can move them on a table. With that file synchronized between my various computers, I hope to lose fewer cards, and have a clearer idea of priorities at any computer I use.