Thursday, February 26, 2015

A beacon of light in the shadow of failing builds

As long as I can remember, I've been using an automatic build system to regularly verify the quality of the code base I've been working on. First it was Team Foundation Server, but for the past year or so it has been JetBrains' TeamCity. In my current project we use continuous integration builds for running the 12000 .NET and JavaScript unit tests, builds for delivering NuGet packages, builds to deploy the latest version of our system on multiple virtual machines, and builds to verify that the Oracle and SQL Server databases can be upgraded from one version to the other. If one of these fails, we usually don't need a lot of effort to track down the developer that needs to fix it.

But we have another set of builds that never get the same amount of love the normal builds get: our UI automation tests. Since our system is web based, we've been investing a lot in automated end-to-end browser tests using WatiN and SpecFlow. In spite of our efforts to build a reliable test automation framework on top of that, stability remains a big issue. Sometimes the timing of such a test is the problem, sometimes it's the combination of the browser and WatiN, and sometimes it is just a temporary infrastructure problem.

Regardless, having developers actively monitor and analyze those builds is painful, to say the least. We encourage developers to either configure an email notification or install TeamCity's tray icon to get notified about failing builds. We even put up two big 40-inch TV monitors on the wall displaying the build status. Still, we regularly observe builds that have been failing for an hour or so without anybody noticing. We tried to introduce a code of conduct, talked many times to the scrum masters, attended retrospective meetings to get feedback from the teams and organized department-wide intervention meetings. It usually helps for a couple of weeks, but then the routine kicks in again.


I don't remember where, but at some point I read a post on some agile blog where they used a real-life traffic light to signal the build quality. They used red and green for the fail/pass status and yellow to signal that a failing build is being investigated. So when I started looking for something similar, I ran into the company Delcom, which ships coffee-cup-sized, LED-based USB visual indicators that you can stick to any surface. And depending on the model you buy, it even comes with a button and a buzzer.

Since we use TeamCity, I quickly found a little piece of open-source software, TeamFlash, that allows you to connect the two together. Unfortunately, it was kind of a one-time thing, and looking at the commit history, not much had happened in the past 2 years. But the code was on GitHub, so I decided to become an active contributor and submitted my first, very small pull request. Long story short, it took me several tweets, a couple of direct messages and a month of patience to get my pull request accepted. Considering this was a pretty important thing to me, and forking would not make a lot of sense, I convinced myself to reboot that project.

I named it Beacon. When you run it, it will do things like turning orange when a failing build is being investigated.
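
To make the color scheme concrete, here is a minimal Python sketch of the traffic-light rule described above: green for a passing build, red for a failing one, and orange for a failing build that somebody is already investigating. This is only an illustration, not Beacon's actual code. The names `Color`, `light_color` and `overall_color` are made up for this example, and the "worst state wins" rule for combining several builds is my own assumption.

```python
from enum import Enum


class Color(Enum):
    """Hypothetical light states; not taken from Beacon's source."""
    GREEN = "green"
    ORANGE = "orange"
    RED = "red"


def light_color(failing: bool, under_investigation: bool) -> Color:
    """Pick the color for a single build."""
    if not failing:
        return Color.GREEN
    # A failing build that somebody has picked up shows orange instead of red.
    return Color.ORANGE if under_investigation else Color.RED


def overall_color(builds) -> Color:
    """Combine (failing, under_investigation) pairs into one light.

    Assumption: the worst state wins, so a single unattended failure
    turns the whole light red.
    """
    colors = {light_color(failing, investigated) for failing, investigated in builds}
    if Color.RED in colors:
        return Color.RED
    if Color.ORANGE in colors:
        return Color.ORANGE
    return Color.GREEN
```

With this rule, one failing build that nobody has claimed is enough to turn the light red, even when every other build is green or under investigation.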


It is available as a ZIP file on GitHub or as a Chocolatey package that you can install using choco install beacon. Its functionality is a bit rudimentary right now, but I've planned several improvements for the coming weeks. Being able to use TeamCity guest accounts and fine-grained control over the colors, power levels and the flashing mode are just a few of them. If you can't wait, intermediate release candidates will be available through MyGet. And since this project is open source, feel free to provide me with ideas, feedback or even pull requests.

So what do you do to get your teams to care about your builds? Let me know by commenting below or tweeting me at @ddoomen.