Bad code starts out as good code that grows in bad ways.
Let’s think about that one. When I originally wrote the line, I wrote “typically starts out…”. But I’m backing off from that. I will take the more optimistic approach. Code starts out good. Only after the addition of capabilities and features and whatever else does the pressure of complexity foul it up. (Even by personal experience I’m being kind; better to err in that direction.)
But when do I refactor?
Easy one, huh? Refactoring is one of those activities that’s consistently needed, possibly somewhat risky (often in horribly subtle ways) and nearly always something you wish you’d done yesterday, if not the day before.
There are two approaches here. The one that really serves the best is to refactor constantly. Every damn time. If you touch code at all, do a little cleaning, do a little segregation of responsibilities, do a little bit of semantic naming and make that naming consistent.
But, as we know, it’s not always possible. Sometimes you’re faced with a chunk of legacy code that you think works and you have to add something. And the code clearly needs at least a little bit of refactoring. But where to start? How do you avoid introducing errors, whether they be big, bad ones that make it all fall over or worse, nice subtle ones that corrupt the universe just a little bit at time until some time in the future when someone looks at things and says “This can’t be right!!! How long has this been going on?!?”
And this is often the place where we get stuck. The little angel of optimism on one shoulder says, “with a series of well understood transformations, we can produce knowably equivalent code that’s clearer and then make the changes we need”. Of course, the angel of caution and pragmatism sits on the other shoulder just as eloquently, saying “we really have no clue as to what’s going on here, but we can add the little bit of functionality we need in as isolated a way as possible and live to code another day! After all, we won’t know any less about how any of this works”.
Because of the fact that in each individual case the second option appears to be locally more prudent (and ‘locally’ is typically all we have) it’s what we do. And we accept it because the increase in uncertainty at each juncture is bounded, lulling us into a (perhaps) false sense of security. We have limited the blast radius (we think). And while it may not all be OK, if it is worse, it will only be by a little.
And this is not a criticism of behavior. But it does point up a fault in the process. Specifically, it points up the lack of adequate tests, specifically tests that can spot regressions. (Are we talking unit tests or are we talking integration tests? Well yes, we are. Ideally, of course, unit tests being intrinsically simpler we’d like to capture possible regressions at that level, but it’s not always possible. And, in any event, it’s a subject for a different post.)
The important thing, though, is that without adequate testing in place, you’re flying blind. But with adequate testing in place you can refactor. Mercilessly!
And that keeps good code from going bad. Don’t DEVO.