Exploring the lambda calculus

February 27, 2021

The lambda calculus (or λ-calculus) is a formal system with fascinating ties to modern computation, particularly functional programming. Let’s take a look at how it works and why it’s so interesting.

In mathematical logic, a formal system is a set of symbols and rules for manipulating them, such that a symbolic expression can be transformed according to various “legal moves” in a way that preserves logical equivalence. In the lambda calculus, those symbols and rules amount to a tiny language for building and invoking functions. (There’s no connection to the standard sense of “calculus”—the lambda calculus uses “calculus” for “formal system.”) Its syntax can express:

  • Variables, like x.
  • Function definitions, like λx.y: a function (denoted by the λ symbol) that takes a variable x and returns a variable y. (In JavaScript, the equivalent would be x => y.)
  • Function invocations, represented by adjacency: yz invokes the function y with the argument z. You can use parentheses to specify the order here: λx.yz is equivalent to x => y(z) in JavaScript, while (λx.y)z is equivalent to (x => y)(z).

And that’s the whole syntax! From there, the core transformation is the intimidatingly named but fundamentally simple process of β-reduction, whereby functions (when invoked) replace variables with the values passed in as arguments. (λx.x)y, for example, reduces to y, as the x variable in λx.x is bound to the value y and the function body returns that same value. In the λ-calculus, all functions take single arguments: to handle multiple arguments, functions must adopt a curried signature, returning additional functions with closures over previous arguments. λx.λy.xy, the equivalent in JS of x => y => x(y), lets you deal with multiple arguments x and y—you just have to provide them one at a time, like ((λx.λy.xy)a)b, which reduces to the function invocation ab.

While the process follows strict rules (in the style of a computer program), you can reduce λ-calculus expressions by hand, just as you can evaluate mathematical functions either by hand or with a calculator. The lambda calculus is a computational system, but it was invented before computers and doesn’t rely on them to work.

The lambda calculus is, in fact, a Turing-complete system: a universal model of computation that can express anything computable. It was introduced in 1936 by the logician Alonzo Church, in a paper proving the unsolvability of a famous problem in mathematical logic. Mere months later, Alan Turing—without learning of Church’s work—proved the same conclusion using a radically different and ultimately more famous model of computation: the Turing machine, which manipulates symbols on a tape. (Turing was initially frustrated to have been beaten to the proof by Church, but later intrigued by the fact that such different systems yielded the same result. Together, the two introduced the Church–Turing thesis, which states that all computable expressions can be computed with a Turing machine.)

Turing’s machine, initially a thought experiment, would go on to inspire modern computer hardware: its tape prefigured the stateful memory systems of modern computers, storing intermediate values as a process runs. Church’s system would go on to inspire the Lisp family of programming languages—in particular, Scheme, which implements the lambda calculus as a subset of its syntax.

While all λ-calculus expressions correspond to valid expressions in Scheme, this is true of many languages—including JavaScript, as we see above. But Scheme is an unusually close cousin, going as far as to use the keyword lambda to denote unnamed functions. Scheme is actually a much friendlier context in which to explore the lambda calculus, since it lets us run code. We’ll switch over to Scheme for the remainder of the post, limiting ourselves to the strict subset of language features that map to the lambda calculus: unary function creation and invocation. 1

Here you might wonder: how on earth can a language be Turing-complete without things like booleans and if-statements? For that matter, numbers? How can we use the λ-calculus to get anything done?

As we’ll see, we can build all the language features we need out of nothing but this radically tiny core.

Church numerals

Section 2.1.3 of Structure and Interpretation of Computer Programs introduces the idea of representing numbers with functions. Here are a few such numbers, called Church numerals:

(define (zero f)
    (lambda (x) x)) ; f is never invoked

(define (one f)
    (lambda (x)
        (f x))) ; One invocation of f

(define (two f)
    (lambda (x)
        (f (f x)))) ; Two invocations

Note how this works: a Church numeral is a function with the signature f -> x -> any, where f is an arbitrary function, x is an arbitrary initial argument, and any is the return value of calling f with x zero or more times.

And note that the zero-ness or one-ness or two-ness has nothing to do with the values of f, x, or the return value. In a sense, Church numerals are adverbs instead of nouns: not one but once. Do something (f) with an initial value (x) once. In the lambda calculus, where the only real primitives—functions—are verbs, we’ll see this “adverbs instead of nouns” pattern come up repeatedly.

For convenience, let’s add a helper to map these Church numerals back to integers. This’ll run right in the Scheme REPL:

(define (to-number n)
    ((n (lambda (x) (+ x 1))) 0))

(to-number zero) ; => 0
(to-number one) ; => 1
(to-number two) ; => 2

You’ll note that the arbitrary function f here is an incrementer, and the initial x is 0—it’s easy to get the wrong impression here and think that the system only works because f and x involve numbers themselves. That’s not the case (remember: the number-ness is in n), and we can show that by imagining alternative helpers like:

(define (to-asterisks n)
    ((n (lambda (x) (string-append x "*"))) ""))

(to-asterisks zero) ; => ""
(to-asterisks one) ; => "*"
(to-asterisks two) ; => "**"

(Useful in its own right, but to-number is much more intuitive—we’ll use it to check our work as we proceed.)

OK, so we can squint at Church numerals and think of them as numbers. But how do we use them? How do we do arithmetic?

Here’s another example out of SICP:

(define (increment n)
    (lambda (f)
        (lambda (x)
            (f ((n f) x)))))

The function increment expects a Church numeral (n) and returns a new one (note the familiar signature (lambda (f) (lambda (x) ...))). It takes the number n (in adverbial form, “do something n times”), and then does that same thing once more. To pull that off, it first builds whatever the return value would have been without any addition: ((n f) x), which fully satisfies the signature of the original numeral n. Then it pipes that value one more time through f. If n calls f (say) three times, (increment n) makes it four. Let’s confirm that it does what we expect:

(define three (increment two))
(to-number three) ; => 3
(to-number (increment three)) ; => 4

With that example in mind, let’s tackle a problem from SICP: implementing a generic add function. Because we’re dealing with unary functions, we’ll take the two input numbers as separate arguments, for a signature like num -> num -> num, which expands to num -> num -> f -> x -> any. Without further ado, here it is, with n and m as the two numbers to be added:

(define (add n)
    (lambda (m)
        (lambda (f)
            (lambda (x)
                ((m f) ((n f) x))))))

OK, this is getting dense! What’s going on here?

  • As before, we start by evaluating ((n f) x): the original (pre-addition) return value from n
  • But where increment passed that value into f (for exactly one more invocation), here we pass it into (m f)
  • (m f) returns a function that calls f m times. Note that it still needs an x value, though—and that’s exactly the value provided by ((n f) x)
  • So, we invoke f n times as we evaluate ((n f) x), then call it m more times as we pass that result into (m f)

Let’s double-check our work:

(to-number ((add two) three)) ; => 5

We’re doing arithmetic!

This is a wildly unintuitive way of thinking about numbers, impractical on its face (though I did make an eslint plugin if you want to enforce it within your grateful team). The point isn’t that it’s a superior or even adequate representation. The point is that, incredibly, it works: a first step toward building a Turing-complete programming language out of nothing but functions.


Let’s go beyond SICP and implement some more arithmetic. Multiplication is a natural next step. While addition boils down to “invoke f n times and then m more times,” multiplication boils down to “invoke f n times, then take that whole process and repeat it m times.” Since our whole notion of numbering is based on repeated function calls, we can easily achieve that second-order repetition:

(define (multiply n)
    (lambda (m)
        (lambda (f)
            (lambda (x)
                ((n (m f)) x)))))

(to-number ((multiply two) three)) ; => 6


  • (m f) returns a function that’ll invoke f m times.
  • (n (m f)) returns another function that’ll invoke (m f) n times.
  • ((n (m f)) x) kicks off the second function, invoking f n * m times.

A fun observation: all we’re doing here is composing n and m, in that the result of (m f) is passed straight into n. So, we can simplify multiplication with a canonical tool for composition: the B combinator in combinatory logic, a related model in early computer science. (While the lambda calculus shows that we can represent all computation using an arbitrary set of anonymous functions, combinatory logic goes further by showing that we can do so using a very small set of “combinators”: higher-order functions that reference only their arguments and previously defined combinators.) Sure enough, the B combinator works too:

(define (B x)
    (lambda (y)
        (lambda (z)
            (x (y z)))))

(to-number ((B two) three)) ; => 6

Note that, while the multiply definition has four levels of nesting, B has only three: it accounts for n, m, and f, but where is the x for an initial value? As it happens, the two representations are equivalent, with the latter written in the implicit “point-free” style. Look at just the return values of the two:

; Return value of `multiply`:
(lambda (x)
    ((n (m f)) x))

; Return value of `B`:
(n (m f))

Given what we know about Church numerals, we know that (n (m f)), for two numerals n and m and a function f, will return a function that expects an initial argument x and invokes f n * m times. That is, the signature actually matches the multiply version by expecting a single x argument—it’s just that it does so implicitly, while multiply does so explicitly. The point-free style can make for obscure code, but it can also put useful emphasis on core logic, as it does here by revealing multiply (written specifically for Church numerals) to be the generic B combinator (completely agnostic as to inputs and outputs).


Now, a tough one: subtraction. In a world where counting takes place through repeated function calls, addition is straightforward, but subtraction is challenging: we need to somehow execute f fewer times.

I find this easiest to reason about by starting with the simpler problem of decrementing, then building up from there. When decrementing—rather, when reversing a process of incrementing—we don’t need that much state: just the current value (after incrementing) and the previous one (before incrementing). That’s easy to express in pairs of the shape (current, previous).

So, let’s define an inner function that builds successive pairs by 1.) calling f with the first entry, and 2.) setting the second entry to the first. Assuming a hypothetical initial value of 0, and an f that simply increments, that looks like: (0, 0) -> (1, 0) -> (2, 1) -> etc.

We’ll want to land on something that conforms to the lambda calculus, but first let’s write this in vanilla Scheme, using the standard abstraction for pairing: cons. Below, pairwise is a helper for performing these (1, 0) -> (2, 1) transformations. It takes an f that operates on and returns arbitrary data, and returns a wrapped version that operates on and returns pairs. Note that pairwise only runs f on the first item in the pair it returns, such that first entry is “new” and the second “old”—and note that it only reads the first item out of the pair it receives, such that (if executed multiple times) the previous “new” becomes the next “old,” and the previous “old” is discarded.

When we first looked at Church numerals, we saw ((n f) x) invoke the function f n times with an initial value of x. Now, ((n (pairwise f)) initial-pair) runs f n times via the pairwise wrapper (adapting inputs and outputs to the pairwise data shape), and the initial value isn’t x but rather x paired with itself, just like how our incrementing sequence above begins with (0, 0). The resulting value is a pair whose first entry exactly matches ((n f) x) and whose second entry represents calling f n - 1 times. That second entry, then, is our answer to the decrementing problem, so decrement simply returns it.

(define (pairwise f)
    (lambda (pair)
        (cons (f (car pair))
              (car pair))))

(define (decrement n)
    ; Note familiar signature: returns a new Church numeral
    (lambda (f)
        (lambda (x)
            (let ((initial-pair (cons x x)))
                (let ((final-pair ((n (pairwise f)) initial-pair)))
                    (cdr final-pair))))))

Let’s check our work:

(to-number (decrement two)) ; => 1
(to-number (decrement ((add two) three))) ; => 4

At this point, we’re already most of the way to subtraction! But we’ve cut a couple corners: this implementation leans on native Scheme language features like cons and car, which aren’t part of the lambda calculus. Let’s take a sec to make this LC-compliant.

We’ll need a new abstraction for pairing, and we can find one in combinatory logic. First, let’s build a purely functional version of cons:

(define (to-pair x)
    (lambda (y)
        (lambda (accessor)
            ((accessor x) y))))

This to-pair implementation does nothing but capture closures over x and y and invoke the accessor callback—it’s not yet obvious that x and y are really a pair. To make this work, we’ll need counterparts to car and cdr:

(define (first pair)
    (pair (lambda (x)
            (lambda (y) x))))

(define (second pair)
    (pair (lambda (x)
            (lambda (y) y))))

(define pair ((to-pair 1) 2))
(first pair) ; => 1
(second pair) ; => 2

Note that both of these pass in very similar accessor callbacks, but first returns the first argument (x) and second the second (y). There’s an interesting inversion of complexity here, where to-pair is responsible for almost no logic and first/second are responsible for much more of it. Pairs as represented by to-pair are in a real sense data structures, but they’re data structures that push complexity to the consumer.

That’s all well and good, but where’s the combinatory logic? It turns out we can do this same work with two of the most fundamental combinators, K and I.

(define (K x)
    (lambda (y) x))

(define (I x) x)

K is very obviously similar to first—all that’s missing is the wrapper that calls a pair with K as the accessor. For second, it would be trivial to write an equivalent standalone accessor… but we don’t actually need to. (Recall that combinatory logic is interested in finding the minimal set of combinators to represent arbitrary computation.) Instead, we can build second out of K and the extremely fundamental I combinator: the “identity” function, which returns its argument unchanged.

(define (second pair)
    (pair (K I)))

What’s going on here? Let’s walk through the substitution logic:

  • K returns the first of its two arguments.
  • I returns its argument, like a ball bouncing off a surface.
  • (K I) returns a lambda that still expects a second argument (per K’s signature)—but this lambda will return I no matter what, again per K’s signature (((K I) x) => I).
  • But what if (K I) were called with two arguments? (((K I) x) y) reduces to (I y)—which reduces, of course, to y! So (K I) turns out to give us second (albeit in an indirect way)—which is important enough that (K I) is a named combinator in its own right, the KI combinator.

Now we have a model of pairing on the basis of just a few extremely simple functions:

(define (K x)
    (lambda (y) x))

(define (I x) x)

(define (to-pair x)
    (lambda (y)
        (lambda (accessor)
            ((accessor x) y))))

(define (first pair)
    (pair K))

(define (second pair)
    (pair (K I)))

Let’s hook these back up to an LC-compliant decrement. First, we’ll need to swap in to-pair, first, and second, which also means moving from cons’s binary signature ((cons x y)) to to-pair’s unary signature (((to-pair x) y)). In addition, let’s remove the let bindings, which aren’t part of the lambda calculus. Here’s what we’re left with:

(define (pairwise f)
    (lambda (pair)
        ((to-pair (f (first pair)))
                  (first pair))))

(define (decrement n)
    (lambda (f)
        (lambda (x)
            (second ((n (pairwise f)) ((to-pair x) x))))))

(to-number (decrement two)) ; => 1
(to-number (decrement ((add two) three))) ; => 4

We’re in the home stretch—let’s wrap this up and implement subtraction. Now that we have a decrement helper, the problem is dramatically less complicated: rather than trying to somehow invoke a function fewer times, we can go back to invoking decrement once for each unit we want to subtract. Below,

  • (m decrement) returns a function that calls decrement m times.
  • ((m decrement) n) starts from the initial value n and decrements m times, passing each result into the next decrement call.

And that’s it!

(define (subtract n)
    (lambda (m)
        ((m decrement) n)))

(to-number ((subtract three) two)) ; => 1
(to-number ((subtract ((add three) three)) one)) ; => 5

Note that minus is quite computationally expensive: to calculate ((minus ten) nine), we first build up from 0 to 10, then decrement to 9. Then we build up again from 0 to 9, decrement to 8, and so on all the way down to 1. A more efficient version would track intermediate values in a single Lisp-style list, but let’s leave that as an exercise for the reader and look at a very different part of computation: booleans.

Church booleans

The starting place for Church booleans is thinking about how booleans are used: to choose between branching code paths. A ternary, for example, begins with a boolean and returns one path or the other:

someCondition ?
    truthyReturnValue :

Church booleans are, of course, functions—but they have a similar signature. A boolean bool looks like ((bool on-true) on-false), and it simply returns one of its two arguments: the first to signify true, the second to signify false. Just like our data structure for pairs turns the data “inside out” by pushing complexity to accessor functions, these booleans are tiny structures that push complexity into the code that manipulates them.

Now, right away the signature here looks familiar. It turns out that the K/“first” combinator is precisely equivalent to true, and KI/“second” to false. As before, we can verify this with a simple helper:

(define true K)
(define false (K I))

(define (to-boolean bool)
    ((bool "true") "false"))

(to-boolean true) ; => "true"
(to-boolean false) ; => "false"

The reappearance of our pairing helpers is interesting, but here’s another surprising reappearance: recall that the Church numeral for zero is (lambda (f) (lambda (x) x)), which we now recognize as KI. Just like zero is falsey in just about every programming language, the Church encodings for zero and false are the exact same function!

But, while our helper has told us these really are booleans, we haven’t seen them behave that way. As with Church numerals and arithmetic, a more meaningful test is real-world usage: boolean logic. So, let’s implement negation, and/or operations, and equality checks.

First up, negation: we want to go from true to false and vice versa. Our true returns its first argument, and false its second, so negation entails somehow reversing that. Well, we can’t modify the internals of the original booleans—but we can call them with the on-true and on-false arguments reversed. Like this:

(define (not-verbose bool)
    (lambda (on-true) ; Standard argument order...
        (lambda (on-false) ; ...Still standard argument order...
            ((bool on-false) on-true)))) ; ...Reverse order!

As it happens, that’s another of the standard combinators: C, aka “flip,” which takes a function, takes two args, and invokes the function with those arguments reversed.

(define (C f)
    (lambda (x)
        (lambda (y)
            ((f y) x))))

So, just like we used K for true and (K I) for false, we can use C for negation.

(define not C)
(to-boolean (not true)) ; => "false"

How about and? Well, let’s assume x is true. If so, we know it’ll return its first argument. We also know that the broad problem “are both these booleans true” is now simplified to “is y also true,” such that the value of y is the value of the whole expression. So, we can pass y into x as the first argument—the one returned if x is true.

And if x is false? Well, then we no longer care about y—we have enough information to already know that the whole and expression is already false. So, we can use what we know (the value of x) and return that as the second argument. We’ve shown that, no matter the value of x, this approach works with all possible values of y—and so this incredibly minimal function is enough to implement and:

(define (and x)
    (lambda (y)
        ((x y) x)))

(to-boolean ((and true) true)) ; => "true"
(to-boolean ((and true) false)) ; => "false"
(to-boolean ((and false) true)) ; => "false"
(to-boolean ((and false) false)) ; => "false"

or is similar: Assume x is true. That’s already enough for or to return true, so in that case, x can return itself as the truthy first argument. If x is false, the only thing left to learn is whether y is true, so we can return y.

(define (or x)
    (lambda (y)
        ((x x) y)))

(to-boolean ((or true) true)) ; => "true"
(to-boolean ((or true) false)) ; => "true"
(to-boolean ((or false) true)) ; => "true"
(to-boolean ((or false) false)) ; => "false"

Finally, here’s a boolean equality check. Like and and or, it makes the most sense if we start by considering what happens if x is true. In that case, as with and, the broad question “are these two values equal” turns into the narrower question “is y also true”—so, we can simply return y.

If x is false, the answer to the whole expression is whether y is also false. If y is false (matching x), equals should return true; if y is true (not matching x), equals should return false. Thus, the return value (when x is false) is the negation of y. So, we can just return (not y)!

(define (equals x)
    (lambda (y)
        ((x y) (not y))))

These boolean operators are some of my favorite functions in computer science—to my eyes, the very definition of elegance. They laconically prefer to return their arguments rather than building new values, with something of the magician’s flair. They perform a seemingly finicky, edge-case-y set of tasks perfectly, with incredible brevity and with a guarantee of correctness provided not by intricate code but by logic.

More broadly, the lambda calculus strikes me as a sort of Rorschach test for functional programming. If you find functional patterns indirect, nonobvious, or show-off-y, the lambda calculus may strike you as the very embodiment of impractical abstraction, the worst excesses you’ve ever seen. (So many cryptic single-letter variable names!) But if, like me, you find something fascinating in the way functional programming nudges code closer to the boundary of logic, the lambda calculus is a pure distillation of that approach. It’s a compelling proof that the composition of pure, generic functions can account for a remarkable variety of problems. With its inside-out data structures, its adverbs as nouns, its profound blurring of code and data, it’s a mind-bending challenge to our intuitions about programming.

We tend to think of functional programming as very high-level, abstracted away from computer fundamentals, beautiful but occasionally impractical, up in the clouds. (The abstraction stack might look something like: Turing machines -> physical computer hardware -> the 1s and 0s of binary memory -> assembly language -> low-level languages like C -> garbage-collected languages -> functional programming.) With their higher-order functions, point-free style, and generic data, these venerable code snippets feel very abstract and somehow modern, especially with the popularity of functional patterns in the last ~10 years. So it’s fascinating to start with the impression that functional programming is on top of the abstraction stack, on the roof of computer science—and then find these techniques built into the foundation, counterparts to the Turing machine.

The way a tiny handful of functions assemble each other into pairs, into numbers, into booleans, into arithmetic and logic—it has something of the magic of DNA. Not ACGT, but SKI.

  1. We’ll also cheat a little by using a language feature that’s not part of the lambda calculus: global variable naming. This is strictly for convenience and readability; in a pure λ-calculus context, we could inline any reference to these names with the corresponding value, so we’re not gaining new capabilities here.