There are only two hard problems in CS:
- Naming things
- Cache invalidation
- Off by one errors
We’ve gone to great lengths to mitigate the effect of the last of these. Rarely do we use index-based, perhaps nested, loops to traverse the elements of an array. More often we’ll use a foreach construct or a comprehension or a map or whatever. It’s an improvement: it’s less error-prone and it reflects the semantics of what we’re trying to accomplish.
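To make that concrete, here is a tiny Python sketch. The list and the variable names are invented purely for illustration. Both versions compute the same thing, but only one of them gives us loop bounds to get wrong.

```python
# Hypothetical data, made up for this example.
prices = [3, 5, 8, 13]

# Index-based traversal: the start and end bounds are ours to get wrong.
doubled_by_index = []
for i in range(len(prices)):
    doubled_by_index.append(prices[i] * 2)

# Comprehension: there is no index, so there is no off-by-one to make.
doubled = [price * 2 for price in prices]
```

The second form also reads as what we mean: “every price, doubled.”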
Cache invalidation in our increasingly massively concurrent distributed age remains a problem – and a significant one. We tend either to err on the side of synchronization (old-style) or to accept eventual consistency: the idea that it can be worthwhile to trade the pure assurance of linearized operations for a little bit of timeline uncertainty and gain much speed (and, at least sometimes, simplicity) in return.
And then, perhaps the most fundamental scourge of them all: Naming things!
Yes, naming things is hard. And yes, there exist approaches like point-free style and the use of long sequences of anonymous callbacks to avoid doing it. There are a lot of instances where that approach makes sense: intermediate values, classic continuation-passing-style sequences and so on. And we’ve all (well, most of us) lived through various naming conventions that encode all sorts of information about an entity in its name; Hungarian without any particular Magyar influence. And don’t get me started about pattern-based naming conventions; ‘FixtureEntityFactoryManagerFactoryFactory’ may provide a lot of information but it’s unreadable – especially if it shows up more than once within a single field of vision.
But it’s still important to name things. Both values and functions. (No, not every single one. Of course not. But often more than we typically do.)
But how should we name things? Ah. It’s time to be controversial. While using a consistent naming format (‘getThingFromSomewhere’, ‘setThingToSomewhere’) is good from one perspective (leveraging an IDE, for example) it can be hard to read from the standpoint of having everything look alike. It helps when the semantics stand out. Objects have behavior associated with them; they are not just bags into which we put data bits (well sometimes they are, but bear with me). Naming behaviorally can greatly aid in understanding the intentions of a piece of code; one is typically much less likely to get lost in a landscape with distinguishable features! Consistency is more likely to be a hindrance than a help!! Exclamation points can be annoying!!!
(This is not to say that consistency with regard to how a piece of data functions is a bad thing; calling the primary key of a table called whatever “whatever_id” is valuable to be sure. But calling the data items in the table ‘whatever_this’, ‘whatever_that’ and ‘whatever_the_other_thing’ is probably visual pollution.)
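Here is one way the difference between uniform and behavioral naming might look in Python. The `Account` class and its methods are hypothetical, made up for this sketch and not taken from any real library.

```python
class Account:
    """Hypothetical example class, invented for illustration."""

    def __init__(self, balance=0):
        self.balance = balance

    # Uniform accessor style: every call site looks alike.
    def set_balance(self, balance):
        self.balance = balance

    # Behavioral names: the call site says what is happening.
    def deposit(self, amount):
        self.balance += amount

    def withdraw(self, amount):
        if amount > self.balance:
            raise ValueError("insufficient funds")
        self.balance -= amount


account = Account(100)
account.set_balance(account.balance - 30)  # what was the intent here?
account.withdraw(30)                       # the intent is in the name
```

Both calls change the balance by the same amount, but only the second tells the reader why.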
So how should we name things? Name things for what they are. Name pure functions/methods for what they return. Name impure functions/methods for the side effect they have on the environment. Named things are worthwhile because they encapsulate semantics. Named things can exist in the (more limited) solution domain as opposed to the programming domain (which, by being more general, is more complex). It’s the power of abstraction – which really makes it all work. There’s a reason we don’t write so much assembler these days.
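A minimal Python sketch of the pure/impure rule, with function names and values that are assumptions chosen only for the example:

```python
def total_with_tax(subtotal, tax_rate):
    # Pure: named for the value it returns.
    return subtotal * (1 + tax_rate)


def append_to_audit_log(audit_log, entry):
    # Impure: named for its side effect on the list passed in.
    audit_log.append(entry)


audit_log = []
total = total_with_tax(100, 0.25)
append_to_audit_log(audit_log, f"charged {total}")
```

Read aloud, the last two lines say what happens in the solution domain: work out the total with tax, then record it in the audit log.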
So names are important. Make them descriptive. Make them convey actual information about the domain they’re serving. Make them distinguishable.
And use them.