It is nearly impossible to write bug-free software of any significance.
A software bug exists when a program doesn’t do what it’s intended to do. That’s different from a poor design that makes simple tasks overly difficult or a missing feature that makes it impossible for you to accomplish a task. Some bugs are simple, like an annoying visual glitch. Some are nasty, like a program that fails to save your data or, worse, corrupts it. And some are crashes or hangs: In a crash, the program exits abruptly; in a hang, it continues to occupy your screen but is completely unresponsive. You’ve probably experienced one or more of these kinds of software bugs, and cursed the nameless programmer responsible for them.
Right now, Cryptic Studios (my employer) is finishing up work on a release of its popular Neverwinter game for the PlayStation 4 game console (it’s free to play, by the way; check out playNeverwinter.com). Before release, we want to eliminate as many bugs as we possibly can. There are “showstopper” bugs that cause the program to crash or hang, or that make it impossible for players to complete sections of the game. These bugs have to be fixed. Because I’m new to the company, I’m frequently in the position of trying to understand and fix a problem I didn’t create. That process, known as debugging, entails isolating bugs and then fixing them or removing them from the software (no one intentionally practices “bugging”).
I’ve written about debugging before (“Wolf Hunting,” December 2014), covering techniques such as the use of assertions and isolating a bug with the “wolf and fence” method. But since I’ve been doing a lot of debugging lately, I wanted to revisit the subject and talk a little about how programmers try to avoid creating bugs in the first place.
It’s nearly impossible to write bug-free software of any significance (some will argue that it is impossible). The on-board software for the space shuttle Discovery came very close; the final version of its 420,000-line program reportedly had only one error. But the 250+ people maintaining the software (many of whom had the sole job of finding errors) represent a huge expense that isn’t warranted for most software. With rare exceptions, flawless software costs too much and takes too long to develop to be economically viable.
You might think that, as bugs in a piece of software are fixed, the number will gradually approach zero. But fixing a bug often involves adding code (instructions to the computer), which itself may have bugs. As I said, it’s harder than it looks to write software of any size that’s bug-free. Then again, lots of software that’s used successfully every day has bugs that don’t seriously impact its usefulness or reputation.
Creating less-buggy code starts with the notion that the simpler a piece of software is, the easier it is to make it bug-free, although even this rule has exceptions. A widely used “binary search” function in the software library that’s part of the Java programming language was discovered to have a bug after nine years (when it caused someone’s program to fail). The size of that search function? A whopping 17 lines of code! So, simple code (short enough to be viewed on a single screen, containing no complicated logic to understand, perhaps even with useful comments) is not, by itself, the answer.
A more recent innovation is the idea of test-driven software development. You start by writing software that tests the functions of the software you’re actually trying to write. For example, if you were writing a function that returned the length of a string of characters, you might start by writing a test that gives your function “0123456789” and ensures that it returns the value 10. Then you write the function and ensure it passes the test. Any time you change the software, you re-run all the tests to make sure that your change didn’t inadvertently introduce a bug.
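To make the string-length example concrete, here is a minimal sketch in Python using its standard unittest module. The name string_length is hypothetical, chosen only for illustration, and the second test (for an empty string) is an extra case I’ve assumed rather than one described above. In a real test-first workflow, the test class would be written before the function exists.

```python
import unittest


def string_length(text: str) -> int:
    """Hypothetical function under test: count the characters in text."""
    count = 0
    for _ in text:
        count += 1
    return count


class StringLengthTest(unittest.TestCase):
    """In test-driven development, these tests are written first."""

    def test_ten_digits(self):
        # The example from the column: ten characters in, 10 out.
        self.assertEqual(string_length("0123456789"), 10)

    def test_empty_string(self):
        # Assumed extra case: an empty string has length zero.
        self.assertEqual(string_length(""), 0)
```

If this were saved in a file, running python -m unittest on that file would execute both tests, and the same command can be re-run after every change.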
Any code involving a condition, like “If the user’s balance is less than the amount of the check, then…,” creates two paths through the code (one where the condition is true and one where it isn’t). Well-written tests can ensure that all the paths through a piece of code are taken at least once, reducing the likelihood that a bug is hiding in one of them.
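As a rough illustration of the two paths, the sketch below uses a hypothetical process_check function (the name, the use of whole cents and the “declined”/“paid” results are all assumptions made for the example). Each assertion at the bottom exercises one path, and a third checks the boundary where the two paths meet.

```python
from typing import Tuple


def process_check(balance: int, amount: int) -> Tuple[int, str]:
    """Hypothetical check handler; amounts are in whole cents."""
    if balance < amount:
        # Path 1: the condition is true, so the check is refused
        # and the balance is left unchanged.
        return balance, "declined"
    # Path 2: the condition is false, so the check clears.
    return balance - amount, "paid"


# One test per path through the code.
assert process_check(5_000, 7_500) == (5_000, "declined")
assert process_check(5_000, 2_000) == (3_000, "paid")

# The boundary case: a balance exactly equal to the check amount.
assert process_check(5_000, 5_000) == (0, "paid")
```

The boundary test is worth the extra line because mistakes such as writing “less than or equal to” instead of “less than” only show up when the two values are equal.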
If the software passes your tests but then fails in actual use (a bug!), you write a new test that demonstrates the bug, then fix the original software so the new test passes. Having a set of tests that gives you confidence that older code continues to work as expected as you add new code (for new features or to correct problems) is a real help.
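Here is a small sketch of that cycle, built around a hypothetical word_count function invented for this example. The first version passed its only test, failed in use on text containing a double space, and was then fixed after a new test reproduced the failure.

```python
def word_count(text: str) -> int:
    """Hypothetical function: count whitespace-separated words."""
    # Original (buggy) version: return len(text.split(" "))
    # It passed the first test below, but reported 3 words for
    # "hello  world" because the double space produced an empty "word".
    return len(text.split())


def test_single_spaces():
    # The original test, which the buggy version also passed.
    assert word_count("hello world") == 2


def test_double_space_regression():
    # New test written to demonstrate the bug found in actual use.
    assert word_count("hello  world") == 2


# Re-run every test, old and new, after each change.
test_single_spaces()
test_double_space_regression()
```

Keeping the new test in the suite permanently means that if a later change reintroduces the same mistake, the tests will catch it before a user does.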
Of course, writing tests takes time and effort, and the tests have to be meaningful. They also have to be easy to run; in fact, the best approach is to run them all the time.
I’ll talk more about “continuous integration” next month.
Is there a bug in my column? Let me know at mduffy@northbaybiz.com.