How to make ANY code in ANY system unit-test-friendly

[I added this example in a later post]

There are lots of pieces of code that are embedded in places that make it very hard to test.  Sometimes these bits are essential to the correct operation of your program and could have complex state machines, timeout conditions, error modes, and who knows what else.  However, unfortunately, they are used in some subtle context such as a complex UI, an asynchronous callback, or other complex system.  This makes it very hard to test them because you might have to induce the appropriate failures in system objects to do so.  As a consequence these systems are often not very well tested, and if you bring up the lack of testing you are not likely to get a positive response.

It doesn’t have to be this way.

I offer below a simple recipe to allow any code, however complex, however awkwardly inserted into a larger system, to be tested for algorithmic correctness with unit tests. 

Step 1:

Take all the code that you want to test and pull it out from the system in which it is being used so that it is in separate source files.  You can build these into a .lib (C/C++) or a .dll (C#/VB/etc.) it doesn’t matter which.  Do this in the simplest way possible and just replace the occurrences of the code in the original context with simple function calls to essentially the same code.  This is just an “extract function” refactor which is always possible.

Step 2:

In the new library code, remove all uses of ambient authority and replace them with a capability that does exactly the same thing.  More specifically, every place you see a call to the operating system replace it with a call to a method on an abstract class that takes the necessary parameters.  If the calls always happen in some fixed patterns you can simplify the interface so that instead of being fully general like the OS it just does the patterns you need with the arguments you need. Simplifying is actually better and will make the next steps easier.

If you don’t want to add virtual function calls you can do the exact same thing with a generic or a template class using the capability as a template parameter.

If it makes sense to do so you can use more than one abstract class or template to group related things together.

Use the existing code to create one implementation of the abstract class that just does the same calls as before.

This step is also a mechanical process and the code should be working just as well as it ever did when you’re done.  And since most systems use only very few OS features in any testable chunk the abstract should stay relatively small.

Step 3:

Take the implementation of the abstract class and pull it out of the new library and back into the original code base.  Now the new library has no dependencies left.  Everything it needs from the outside world is provided to it on a silver platter and it now knows nothing of its context.  Again everything should still work.

Step 4:

Create a unit test that drives the new library by providing a mock version of the abstract class.  You can now fake any OS condition, timeouts, synchronization, file system, network, anything.  Even a system that uses complicated semaphores and/or internal state can be driven to all the hard-to-reach error conditions with relative ease.  You should be able to reach every basic block of the code under test with unit tests.

In future, you can actually repeat these steps using the same “authority free” library merging in as many components as is reasonable so you don’t get a proliferation of testable libraries.

Step 5:

Use your code in the complex environment with confidence!  Enjoy all the extra free time you will have now that you’re more productive and don’t have bizarre bugs to chase in production.