Struggling with unit tests (what happened next)

This is going to be a fairly uninteresting and hard to follow post, but its here for the record, I’ll try to distill something better out of it. This was a sort of as-it-happened log of my initial attempt following on from my plan earlier today. [Also I fail at Live Writer, and I overwrote this post, instead of creating a new post. And then undo that. D'oh.]

During refactoring for testability of key logic, block coverage sometimes changes a little before you write the test!

Often the change is small, and often it is initially in the wrong direction.

Here I picked what I thought the most fragile, testable, and long-living piece of logic would be, and decided to initially pull it out to be a static function.
Before refactor: 87.78% coverage of project not covered
After refactor, before writing a test: 88% of project not covered [net -2%]
After writing a test: 83.34% of project not covered [net +3%]

After that, I felt like it was time to see if refactoring for testability introduced or exposed any 'smell' elsewhere.

But... refactoring itself can expose bugs in my understanding of the code.
Before refactoring I had believed I was generating batches of 100 events where each batch had a unique ID. After, I realized I was actually generating batches of 100 events, where each batch had the SAME id as each OTHER batch in this set of generated batches.
So what was this batch ID actually used for? I mean clearly it's not useful for identifying or locating a single batch...
Perhaps Batch ID isn't really used for ANYTHING. It's generated, and persisted in our database, but never ever read again. Really??
That can't be right can it? And I thought our database had a unique constraint on batch ids...???? In fact it's a primary key right?
Um.. yes. It's a primary key.
But yes, are we really generate batches with duplicate IDs? What the hell?
I told all my teammates I had found a bug!
My teammates politely told me that no, that scenario had been tested.
At which point I realized that I had not found a bug... in fact I had *introduced* a new bug while refactoring for testability.
I had been believing the ID generation function was a 1:1 function, butactually it was not a function at all - in the mathematical sense.
It is a GUID concatenating function that always generates unique values (A mathematical 'function' should have at most one single well-defined output value for any input. Like SQRT.)

And thus sanity was restored.
I fixed the new bug I had introduced.
This also led me to write a few more tests.

The new tests provide no additional code coverage at all... but they verify the bugfix, and they help formalize the notion of what a batch ID is!
Our coverage takes a net small step backwards during the final bugfix. 83.54% uncovered, or 16.46% covered if you prefer.

Finally it was again time to take stock of my actual change and see whether the new code was overall feeling 'better' or 'worse'.

Moved: some code intimately related to BillingBatch class was moved over to the BillingBatch class. That's probably a win.
Separated: a glue 'data conversion' function became separated from the task of labelling records with batch IDs and became more of a pure and simple data conversion function. Separation of concerns. I call that a win.
Inlined: a rather trivial function that was called in two places since it is now called only in one place.
Overall feeling better or worse: just slightly better.

However...
suddenly I see things more clearly!
One of my other classes is basically just a factory, that turns configuration, into dependencies. It's not a 'mainloop' at all.
Refactoring commences, and suddenly, I've eliminated a bunch of blocks from my codebase and code coverage is still at... 16.4%. What the heck.

But in the process I've now got a BillingOperations class with injectable dependencies that can itself be tested with mocks. Progress, sort of!

So since yesterday I’ve gone from having one BillingMainLoop class, which had a mixture of responsibilities around knowing data persistence details, knowing how to
iterate through events and group them into batches, and knowing how to take the app configuration and instantiate/configure all sorts of necessary dependencies, to having a much more specialized division of responsibilities:

-Batching class – knows how to group events into batches, but not what to do with batches
-BillingOperations – has a bunch of dependencies for getting events [input], and writing batches [output]. Delegates batching work to the Batching class.
-BillingOperationsFactory – creates BillingOperations given the app configuration.

I am not a general fan of factory classes (surely a method would do the job?), but this iteration of the design does feel like it’s making it easier to think about how to test the code.