Immutability, Purity and All Kinds of Stuff FTW (2017-07-14)

Immutability

I’m going to show you a little bit of code (don’t worry about the language, it might not exist)… Then I’m going to ask you a question, OK? Cool!

    x = 1
    y = 2
    ...
    ...
    ...

(Yes, I’m not going to show you what’s there. This is my game and I can do what I want.)

    if (x != 1) {
        destroyWorld();
    }

And yes, it’s all within the same scope. There are no branches or returns or exceptions thrown; there’s nothing that will keep the conditional from being executed.

So I ask a question made famous by an ex-cowboy actor on the streets of a fictional San Francisco a bit over forty years ago:

“Do ya feel lucky, punk?”

Well, not so much. Certainly not in the languages we use or in the way we use them. At this point in the code:

    x

could be anything at all.

And that’s the thing: In an immutable or, at least, a single-assignment language, you wouldn’t have to feel lucky. You’d know that the ‘destroyWorld’ function was not about to be called. You’d know that the value 1 was bound to the name ‘x’ and that was that.

Hell, I feel safer already.
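Python, for one, won’t stop you from rebinding a plain name, but a frozen dataclass gets you a rough approximation of the idea. This is only a sketch: the `Bindings` class and its fields are made up for illustration, and Python will still let you go around the lock if you really try.

```python
from dataclasses import dataclass, FrozenInstanceError


@dataclass(frozen=True)
class Bindings:
    """Hypothetical holder for the two names from the snippet above."""
    x: int
    y: int


env = Bindings(x=1, y=2)

# ... whatever happens in the elided middle of the program ...
try:
    env.x = 42  # an ordinary attempt to rebind is rejected
except FrozenInstanceError:
    rebind_blocked = True
else:
    rebind_blocked = False

# So this check can be trusted without reading the code in between.
world_is_safe = (env.x == 1)
```

Whatever sits in the middle, ordinary assignment can’t move `x` out from under the conditional.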

“But,” you ask, “if I can’t change anything, how do I, like, change anything?”

Well, you don’t. When you need one that’s different, you make a new one.

“Wouldn’t that be wasteful?”

Well, maybe, but not necessarily. When you make a new one, you don’t have to make a whole new one. You take the changes, say ‘and everything else is like this’ and point to the old one.

“But what if the old one ch…. OH…..”

Yup. You’re starting to get it.
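Here’s a minimal sketch of that “changes plus a pointer to the old one” trick in Python. The `Version` class and its method names are invented for this example; real persistent data structures are a good deal cleverer, but the idea is the same.

```python
from types import MappingProxyType


class Version:
    """One immutable version: a few changes plus a pointer to the old one."""

    __slots__ = ("_changes", "_parent")

    def __init__(self, changes, parent=None):
        # object.__setattr__ is used because normal assignment is blocked below.
        object.__setattr__(self, "_changes", MappingProxyType(dict(changes)))
        object.__setattr__(self, "_parent", parent)

    def __setattr__(self, name, value):
        raise AttributeError("Version objects cannot be modified")

    def get(self, key):
        # Look in our own changes first, then walk back through older versions.
        node = self
        while node is not None:
            if key in node._changes:
                return node._changes[key]
            node = node._parent
        raise KeyError(key)

    def with_changes(self, **changes):
        # Nothing is copied: the new version just points at this one.
        return type(self)(changes, parent=self)


old = Version({"x": 1, "y": 2})
new = old.with_changes(y=3)
```

Pointing at `old` is only safe because `old` can never change, which is exactly the “OH” moment above.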

Purity

Functions, functions, functions. Everywhere we look, there’s a function. Well, kinda.

When we think of functions from mathematics, they have this nice property: Every time you call the same function on the same arguments, you get the same answer. So 1 + 1 is always 2. That’s the way functions are defined; they depend only upon their arguments. It’s the property of being referentially transparent. Same input? Same output. Always.

In programming languages, however, things are not that simple. The value returned by a function might not be completely determined by its input. Class methods, for example, have an additional scope that might contribute to the value they return: the values of class variables. Instance methods have the values of class variables and instance variables. And closures have whatever free variables are available in an enclosing scope, either implicitly (in, say, JavaScript or Python) or explicitly (in, say, PHP). And sometimes the source is even more external – like the state of the universe!

    time()

anyone?
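To make those hidden inputs concrete, here’s a small Python sketch. Every name in it (`Greeter`, `make_scaler`, `seconds_since`) is made up for the example; the point is only that the same arguments can produce different answers.

```python
import time


def add(a, b):
    """Depends only on its arguments."""
    return a + b


class Greeter:
    greeting = "Hello"  # class variable: a hidden input to greet()

    def __init__(self, name):
        self.name = name  # instance variable: another hidden input

    def greet(self, punctuation):
        return f"{self.greeting}, {self.name}{punctuation}"


def make_scaler():
    rate = 2  # free variable captured by both closures below

    def scale(n):
        return n * rate

    def set_rate(new_rate):
        nonlocal rate
        rate = new_rate

    return scale, set_rate


def seconds_since(start):
    """Depends on the clock, i.e. the state of the universe."""
    return time.time() - start


ada = Greeter("Ada")
first_greeting = ada.greet("!")
Greeter.greeting = "Hi"           # nobody touched the arguments...
second_greeting = ada.greet("!")  # ...but the answer changed

scale, set_rate = make_scaler()
first_scaled = scale(10)
set_rate(3)
second_scaled = scale(10)
```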

Of course, it gets even more complicated than that. Functions can have side effects. As opposed to just returning a value like nice tidy mathematical functions, they can go behind your back and change the state of the universe or at least some more limited enclosing scope. So you call

    grabMyPhoneFromTheOtherRoom()

and you go back in there later and discover that My Mother the Car, dubbed into Basque, is playing on the TV, the guitar has been retuned to an open D# and all the books on the shelves have been rearranged, sorted by the second letter in the authors’ names.

When a function behaves like a mathematical function, taking arguments and returning a value and not messing with anything else, we call it a pure function. Pure functions are cool! They’re easy to reason about! They’re easy to test! You throw arguments at them and they give you an answer – and you can check that answer! And if you ask them again, you’ll get the same answer again. And they won’t leave a mess. No muss, no fuss.
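A quick sketch of what that looks like in practice, in Python. The function `total_cents` and the numbers are invented for illustration; amounts are integer cents so the arithmetic is exact.

```python
def total_cents(prices_cents, tax_percent):
    """Pure: reads only its arguments, returns a value, mutates nothing."""
    subtotal = sum(prices_cents)
    return subtotal * (100 + tax_percent) // 100


prices = [1000, 550]
first = total_cents(prices, 20)
second = total_cents(prices, 20)  # ask again, get the same answer again
```

The whole test is “throw arguments in, compare the answer”. There’s no setup, no teardown, and nothing to clean up afterwards.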

“So we don’t want side effects, right?”

Well, it’s not quite that simple. We want side effects a lot of the time – you know, output! Persistence. Stuff like that. We like programs that do things. But it’s exactly that doing of things that makes code harder to reason about, harder to test. Why? Because we often have to do a lot of work to build up an environment in which those side effects can take place. How many times have we written the “create a bunch of objects…call some methods on the objects to get them in the right starting state…then call the method we want to test…then call a bunch of methods and collect their results to see if what we wanted to happen to the state of the objects involved actually did happen to the state of the objects involved” test? And then we throw it all away and start over!

The worst part is that we do all this by writing (expressly non-trivial) code – WHICH WE KIND OF DON’T FULLY TRUST IN THE FIRST PLACE!!!

So, while we can’t get rid of side effects (well, we can but it’s a considerable undertaking), we can limit them and segregate them. We don’t have to smear that uncertainty about a program’s overall state throughout all the code.
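One common way to do that segregating is to keep the decisions in pure functions and push the actual effects out to a thin outer layer. Here’s a rough Python sketch; `summarize` and `emit` are hypothetical names, and an in-memory buffer stands in for a real file or stdout.

```python
import io


def summarize(readings):
    """Pure core: turns numbers into report lines and touches nothing else."""
    if not readings:
        return ["no readings"]
    return [
        f"count={len(readings)}",
        f"min={min(readings)}",
        f"max={max(readings)}",
    ]


def emit(lines, out):
    """Thin shell: the only place that writes to the outside world."""
    for line in lines:
        out.write(line + "\n")


buffer = io.StringIO()  # stands in for a file or stdout
emit(summarize([3, 1, 2]), buffer)
```

All the interesting logic lives in `summarize`, which can be tested with plain values. The part that does things is small enough to check by eye.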

Most side-effecting code is really quite reasonable. Aside from the aforementioned changes to the outside world, we often induce state changes in objects because:

  • We can’t just create a new one because the object is in scope to other code.
  • The change is dependent upon a lot of properties that make up the object’s current state – and who wants a dozen variables in a parameter list?

Even some of these cases can be resolved with design. Wide and flat can be your worst enemy.
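One possible reading of “resolved with design” (and it’s only my guess at a sketch): group related properties into small immutable pieces, so a change means building one new small piece and reusing the rest. The `Customer` and `Address` classes here are invented for the example.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Address:
    city: str
    postcode: str


@dataclass(frozen=True)
class Customer:
    name: str
    address: Address  # nested, rather than city/postcode flattened onto Customer


def relocate(customer, city, postcode):
    """Return a new Customer; the original is untouched and its name is reused."""
    return replace(customer, address=Address(city, postcode))


before = Customer("Ada", Address("London", "N1"))
after = relocate(before, "Paris", "75001")
```

The function takes one object instead of a dozen loose parameters, and any other code still holding `before` sees exactly what it saw before.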

In general though, one of the great advantages of pure functions is that they’re easier to test. And the tests themselves are much more likely to be testing the right things the right way.

Types anyone?

Types are cool. Typing is cool, too.

Wait? What?

“We prefer dynamically-typed languages because we were damaged by our experience with Java.” (This effect is considerably more acute among those who used Java before the introduction of generics.) “Types restrict you too much.” “I don’t like programming in B&D languages.” “Statically-typed languages are not as expressive.”

But whether we’re looking at going in a statically-typed direction (unlikely) or a gradually-typed direction (could be) or a predicate-based system (see http://learnyousomeerlang.com/dialyzer or https://clojure.org/guides/spec), it’s all about being able to provide more information about a program within a program. It also gives us a way to reject programs that are not coherent at a point before run time (which is usually a better time for it to fail). And it saves us from having to write a whole tier of tests we’d otherwise have to write.
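As a small taste of the gradual route, here’s a Python sketch using type hints. The names `UserId` and `load_name` are made up; the hints do nothing at run time, so the “reject it early” part assumes you run a static checker such as mypy over the code.

```python
from typing import Dict, NewType

# A distinct name for "an int that identifies a user", as far as a checker is concerned.
UserId = NewType("UserId", int)


def load_name(user_id: UserId, names: Dict[UserId, str]) -> str:
    """The signature says what goes in and what comes out."""
    return names.get(user_id, "unknown")


names = {UserId(1): "Ada"}
found = load_name(UserId(1), names)

# A checker would be expected to flag a call like load_name("1", names)
# before the program ever runs; Python itself would happily run it.
```

That one signature replaces a handful of “what if somebody passes a string?” tests.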

And with a good type system, you might even get more information about the pieces of data you’re working on.

So…

All right. We’re not going to change our stacks radically any time soon – nor should we (necessarily). But we could use some techniques that are available, no matter what the stack looks like, to make things a little bit more reliable, a little bit easier (and hopefully quicker) to work on, and certainly easier to reason about. We can do it a lot or do it a little; anything should make us better. And that’s what matters.
