Functional Programming Trick: How to Mutate State While Preserving Referential Transparency | by Ivelina Yordanova | May, 2022

Your code should read like a story

Photo by Roman Kraft on Unsplash

The thing you’ll read about functional programming (FP) in most places is “Mutating state and side effects are not allowed“. When you read that over and over again you are left with the impression that FP is quite limiting and not really applicable in all cases. Most real world applications involve mutations in some shape or form and often you need IO or other operations which classify as side effects in FP.

However, there is a secret to this that you need to dig deeper to find…state changes are actually allowed…shocking, I know! It requires you to think a bit more about the design and you need to meet certain criteria but it is possible! Let’s see how it can be done.

First, let’s make a subtle but very important distinction between effects and side effects. An IO operation, for example, can be used in a pure program to have an effect on the outside world without breaking referential transparency. This can be achieved by designing your program so that it reads like a story. What do I mean by that? In a story, you have actors, each with their own set of lines and movements to execute, no 2 actors have overlapping purposes, none is redundant and no single actor plays multiple roles.

In a program classes are your actors and methods are the transcript for their role. As you design your application you should consider all the roles in play and all the words they are going to use. As you write you are slowly creating your own domain-specific language (DSL) and even a non-programmer should be able to get what the code does on a high level. I know many developers take pride in coming up with solutions that are practically unreadable but do the job in one line, but this is as dangerous as it is unnecessary. It’s a cool mental exercise, a brain tease, but unless you are forced to optimize given limited resources (whether that’s memory, CPU or other specific limitation) it’s better to choose readability, testability and maintainability over that.

Let’s take a look at a very simple example of how to turn a non-pure function with side effects into something acceptable in functional programming by pushing the side effect to the outer layer and making everything else pure and using more domain-specific language.

fun printHousePriceStats(houses: List<House>) {
val prices = houses.map{ it.price }
val msg = "Average price: ${prices.average()}, standard deviation: ${prices.standardDeviation()}"
println(msg)
}

This is an over-simplified example where I attempted to very obviously stick all the parts I need. So, when writing or refactoring functions with effects in your code you should always think of them as 3 separate parts:

  • a function providing input to the pure core function
  • a pure core function
  • a function that outputs the result of your

In our amazing example this will translate to:

fun getPrices(houses: List<House>): List<Double> {
return houses.map{ it.price }
}
fun getStatsMessage(housePrices: List<Double>) {
return "Average price: ${prices.average()}, standard deviation: ${prices.standardDeviation()}"
}
fun printHousePriceStats(houses: List<House>) {
println(getStatsMessage(getPrices(houses)))
}

These are all one-liners but the concept is the same as if you imagine having complex computations for getPrices and getStatsMessage . This way, you can test the 2 individually and push the printing as further to the top as possible, leaving nothing in the printHousePricesStats function to test (you wouldn’t test a standard language library function).

To claim your program is fully functional though you need one more step — wrap the final outer layer with effects into an object returning an executable instead of actually executing the action.

interface IO { 
fun run(message: String)
}

Then, printHousePriceStats to:

fun printHousePriceStats(houses: List<House>): IO {
return object : IO {
override fun run(): = println(getStatsMessage(getPrices(houses)))
}
}

For this super simple example this certainly is overkill but for a complex case where you want to be inline with all FP principles but still have the effects and test it all — you can go even a few steps further by adding classes that would execute map, flatMap and other required operations on the IO if you need or want to go wild on the monad train.

A more useful concept of referential transparency actually allows you to use mutating state locally inside an expression as long as you can guarantee that no other part of the program can “see” it.

Some examples of algorithms where the local mutation is a must are the matrix transposition, insertion sort, quick sort or just reversing the order of an array in place.

If that mutation is only contained within the scope of that one function without exposing any details about the implementation to the caller, then it’s considered pure. Still not sure about this? Let’s go to the definition of RT (referential transparency) and purity:

An expression is referentially transparent if all its occurrences in a program can be replaced with its evaluated result without changing the meaning of the program.

A function is pure if it is referentially transparent for all referentially transparent arguments.

It’s not possible for any caller to know that reverseArray here is using another function that mutates its state. Even though swap isn’t pure, reverseArray remains referentially transparent because the rest of the program can never actually get a hold of the mutable array.

fun reverseArray(arr: IntArray): IntArray {
val n = arr.size
val revArr = arr.copyOf(n)

fun swap(x: Int, y: Int) {
val tmp = revArr[x]
revArr[x] = revArr[y]
revArr[y] = tmp
}

for (i in 0..(n/2-1)) {
swap(i, n - i - 1)
}

return revArr
}

Any function can use side effects internally and still maintain a pure external interface to its callers. We can make the argument for this and prove it verbally but that’s a very error-prone approach — what if someone new comes along and changes the code in a way that exposes the revArr array?

To enforce those rules, ideally, we would make it obvious for other developers… and ourselves when a change is breaking those. We need to use the compiler’s help for this by designing our program in a way that it will not even compile if the FP principles are broken.

For this purpose, we can go back to the story analogy. Let’s say we are sorting out some IDs of type Int and the array of ids is actually a collection of products, so we can wrap up the reversing function in an object ProductIdReverser and the function itself can accept a ProductIds type or Array<ProductId> Depending on what’s more fitting into the overall design. As you probably see, we are attempting to build a language around the primitive and built-in types in a way that would make our program not only easy to read but hard to break since a lot of bugs could be caught at compile time.

This could be a great first step if you are attempting to migrate a legacy code into a more functional one — start by extracting out methods, variables and defining types. In some cases, there’s an argument even for defining simple wrappers around Strings. Imagine taking on a program where you deal with multiple entities, each having ID, for example, machines on-site moving products. There you’ll have to use IDs for the machines, site locations, and products and you need all those to send off the right command for the right machine. In this case, you might have something like:

fun doX(machineId: String, locationstartId: String, productId: String, locationEndId: String) {
// build and send off the command
}

At first sight, that might seem sensible and OK, it will do the job. However, this is a nightmare bug waiting to happen. Even if this is well documented, the caller will always only see the 4 strings expected and it’s incredibly easy to pass them accidentally in the wrong order. Even if that’s tested, imagine having hundreds of places in the code where you have something similar…mistakes will certainly be made. There are few ways to avoid this but none is more robust than having a type hierarchy for your IDs:

sealed class MachineId(id: String) {
class MovingMachineId(id: String): MachineId(id)
class PackingMachineId(id: String): MachineId(id)
class LoadingMachineId(id: String): MachineId(id)
}

sealed class ProductId(id: String) {
class SmallProductId(id: String): ProductId(id)
class MediumProductId(id: String): ProductId(id)
class BigProductId(id: String): ProductId(id)
}

sealed class LocationId(id: String) {
class StationLocationId(id: String): LocationId(id)
class DockLocationId(id: String): LocationId(id)
class PathLocationId(id: String): LocationId(id)
}

This, of course, will not be fitting for all cases but could very well save you a lot of headaches and…a lot of time in terms of testing, debugging and onboarding people.

There is a great book on “Functional Programming in Kotlin” from Marco Vermeulen and others. The examples are in Kotlin but it goes into a lot of details about theory in general so if you are not a Kotlin expert or even a fan you can still get it all and learn a lot.

Thanks for reading and happy refactoring 🙂

Leave a Comment