More on Lists

Comprehensions

We know you are already familiar with the following kind of notation for designating sets:

{ x + 1 | x ∈ Primes and φx }

This kind of notation is called a set comprehension. Here Primes is assumed to be some, presumably larger set. φ expresses some condition that members of Primes might conceivably fail to satisfy.

Some of the functional programming languages permit you to specify data structures in this same way. Kapulet comes closest, in that it also has set comprehension notation. In Kaupulet one writes:

{ x + 1 | x from Primes, φ x }

The changes are only that we write x from Primes, with Primes being an expression that evaluates to a set, and we separate that clause from the test clause with a comma. φ x can be any expression that evaluates to a boolean. Moreover, such clauses can (syntactically) come in any order, and there can be any number of them, though the above is the most useful pattern, because the x introduced by the first clause is then bound (over the specified domain) when evaluating the clause φ x. But you can also write:

{ 1 | 'true }

which evaluates to { 1 }, or:

{ 1 | 'false }

which evaluates to the empty set { }. What if you have multiple from clauses? This is possible, and iterates over the cross-product of the two sets you're drawing from. So:

{ 10*x + y | x from {1, 2, 3}, y from {4, 5} }

evaluates to the set { 14, 15, 24, 25, 34, 35 }.

Haskell doesn't have set literals like Kapulet does, but it also allows this kind of notation with lists, that is, it has list comprehensions. (And so does Kapulet.) Thus in Haskell you can write:

[ 10*x + y | x <- [1, 2, 3], y <- [4, 5] ]

and that evaluates to [14, 15, 24, 25, 34, 35]. Notice that Haskell's syntax differs slightly. Changing the order of the from/<- clauses changes the order in which the elements will be added to the result list:

[ 10*x + y | y <- [4, 5], x <- [1, 2, 3] ]

That evaluates to [14, 24, 34, 15, 25, 35].

You can also mix in test or "guard" clauses:

[ 10*x + y | y <- [4, 5], odd y, x <- [1, 2, 3] ]

evaluates to [15, 25, 35].

Haskell also has an extension that permits you to iterate over multiple lists in parallel rather than to iterate over their cross-product. If you type :set -XParallelListComp in the ghci interpreter, that will enable this extension, and then:

[ 10*x + y | y <- [4, 5, 6] | x <- [1, 2, 3] ]

will evaluate to [14, 25, 36]. If the lists are of unequal length, Haskell stops when it exhausts the shortest. These behaviors are similar to the map2 function you defined in the week 1 homework. That also took an argument from each of several sequences in parallel. (The corresponding functions in Haskell are called zip and zipWith.)

OCaml permits lists comprehensions as an extension, and so too does Scheme, but these are a bit harder to use.

All of these things can also be expressed in these languages without using the comprehension syntax. For example, this list comprehension (in Kapulet syntax):

[ 10*x | x from [1, 2, 3, 4, 5] ]

can be expressed as:

map (lambda x. 10*x) [1, 2, 3, 4, 5]

and this:

[ 10*x | x from [1, 2, 3, 4, 5], odd? x ]

can be expressed as:

map (lambda x. 10*x) $ filter odd? [1, 2, 3, 4, 5]

(We explained the $ notation in week 1's advanced notes. This is equivalent to map (lambda x. 10*x) (filter odd? [1, 2, 3, 4, 5]).)

Iterating over the cross-product of several lists is a bit harder. Consider:

[ 10*x + y | y from [4, 5, 6], y < 6, x from [1, 2, 3] ]

To translate that, first let's handle the iteration over the final list, that x is drawn from:

map (lambda x. 10*x + y) [1, 2, 3]

This looks like what we had before, except that now we have this free variable y in our lambda expression. Perhaps we can bind that variable inside a larger lambda expression, and then map that larger lambda expression over the (filtered) list that y is drawn from:

let
  f match lambda y. map (lambda x. 10*x + y) [1, 2, 3]
in map f $ filter (lambda y. y < 6) [4, 5, 6]

This gives us nearly what we want. It evaluates to:

[[14, 24, 34], [15, 25, 35]]

Why? Because the filter expression at the end is restricting the domain that y ranges over to [4, 5]. Over this domain we are selecting a value to bind y to, and then evaluating the map expression inside f with y so bound. With y bound to 4, we get the result [14, 24, 34]. With y bound to 5, we get the result [15, 25, 35]. These two results, in order, are the elements that make up the sequence which is the result of the final, outermost map expression.

One last twist is that our original list comprehension gives us a "flatter" result. In both Kapulet (and Haskell, modulo a few syntax adjustments), the list comprehension:

[ 10*x + y | y from [4, 5, 6], y < 6, x from [1, 2, 3] ]

evaluates to:

[14, 24, 34, 15, 25, 35]

We can turn the preceding result into this result with the Kapulet function join (Haskell calls it concat or Control.Monad.join):

join [[14, 24, 34], [15, 25, 35]]

evaluates to the "flatter" list displayed above.

By the way, this join operation only affects a single layer of [ ]s. This:

join [ [[10,20], [30], []],  [[40], [50,60]] ]

evaluates to:

[[10,20], [30], [], [40], [50,60]]

not to:

[10, 20, 30, 40, 50, 60]

To get the latter, you'd need to apply join twice.

Tails

For the Lambda Calculus, we've proposed to encode lists in terms of higher-order functions that perform right-folds on (what we intuitively regard as) the real list. Thus, the list we'd write in Kapulet or Haskell as:

[a, b, c]

for some expressions a, b, and c, would be encoded in the Lambda Calculus as:

\f z. f a (f b (f c z))

With that choice of encoding, it's not difficult to write a head function. You did this for one of the week 2 homeworks. However, it is more challenging to write a tail function. Here is the intuitive idea behind one way we could do this. Our "starting value" --- what gets bound to z in the above lambda expression --- will be a pair of two values. I'll write it as ([], err) for the moment, while we fix our intuitions, rather than using the more verbose Lambda Calculus representation of pairs and []. The err will be whatever we decide should be the tail of an empty list. Perhaps it should be [], but I'll just leave it as err for this exercise. Now, when we combine the rightmost element of the list with this, by evaluating f c ([], err), we want the result to be ([c], []). That is, we throw away the second member of the pair, copy the first member over into the second slot, and cons the c onto the first member in the first slot. At the next stage, the result will be ([b, c], [c]). And at the final stage, the result will be ([a, b, c], [b, c]). Now we just have to extract the second member of this pair, and that will be the tail of our list.

If you've followed that intuitive presentation, then here is how you can write it in the lambda evaluator:

let empty = \f z. z in                   ; as before
let cons = \d ds. \f z. f d (ds f z) in  ; as before
let pair = \x y. \f. f x y in
let snd = \x y. y in
let shift = \h p. p (\x y. pair (cons h x) x) in
let tail = \xs. (xs shift (pair empty err)) snd in
...

Here shift is our fold-function, and takes two arguments, the current list element h and the pair p that we've built up from the starting value by folding over the more rightward portion of the list, if any. The shift function binds the two members of the pair to x and y, disregarding the second. It returns a new pair whose first member is cons h x and whose second member is x. Our starting value is pair empty err. And at the end of our fold we're left with a pair, and want to extract its second member; that's why tail is of the form \xs. (xs shift ...) snd.

Try it out in the lambda evaluator. After the code above, you can write:

...
let abc = cons a (cons b (cons c empty)) in  ; encoding of [a, b, c]
tail abc

and the result will be \f z. f b (f c z), our encoding of [b, c].

A possible refinement on our right-fold encoding of lists would be to keep track of the tails as the lists are constructed, and feed them to the folding functions f as an additional argument. We've been encoding [a, b, c] like this:

\f z. f a (f b (f c z))

or in other words:

\f z. f a MORE

where MORE is the result of folding f and z over [a, b, c]'s tail. Instead, we might encode [a, b, c] like this:

\g z. g a [b,c] (g b [c] (g c [] z))

or in other words:

\g z. g a [b,c] MORE

The difference is that this new encoding uses a fold function g expecting three arguments, and the encoding not only passes the current head as a first argument to that function (as before), but also passes the current tail as a new middle argument to that function. As before, we continue to pass the result of the fold applied to the tail as the last argument.

With this encoding scheme, each list would be represented by a somewhat more complex function than before. On the other hand, it would now become as easy to query the list's tail as it is to query its head. Before the refinement, querying the tail of a list required us to build up the tail afresh each time we wanted to extract it. Whereas this refinement saves a copy of the tail for direct access, as a new middle argument to the fold function, when the list is first constructed.

(Oleg discusses making a parallel refinement in the encoding of numbers here.)

Other Encoding Strategies for Lists

If you're keeping track, we've now seen three different encoding strategies for lists: in terms of their right-folds, their left-folds, and the somewhat more complex right-fold encoding just discussed.

Let's consider one or two more, that take different, apparently simpler strategies, avoiding the use of folds in the encoding.

Isn't the simplest approach just to represent a non-empty list as an ordered pair of its head and its tail? In the first week, we mentioned that the fancier functional programming languages like OCaml and Haskell sharply distinguish between lists and tuples. Lists had to be type-homogenous, and their type was insensitive to their length. Tuples on the other hand could be (though needn't be) type-heterogenous, and their own types were determined by the number (and order) of the types of their elements. But we don't have types in the Lambda Calculus. (At least, we don't have a variety of types.) So maybe in this context we can identify lists with certain tuples, without getting in trouble.

(In fact, as we discuss elsewhere, this is how Scheme also implements its lists.)

We'll still need to choose some lambda term to encode the empty list. It's not straightaway obvious what that should be. What else will need for our lists? Well, let's think about what we're going to do with them. One thing we did with the lists again and again in the functional programming languages was pattern-match on them. What that amounted to was: we checked if the list was empty, and if so we took one program branch, and if not we bound some variables to the list's head and to its tail, and then took a second program branch. We already know how to take an ordered pair apart in that way. What we don't have yet is is a way to tell whether an arbitrary list is empty.

One idea would be to represent the non-empty lists as a triple instead of a pair, with their first element being the boolean false, announcing that they aren't empty, then their second element being their head, and their third element being their tail. Then the empty list could be a triple whose first element is true. It's not obvious yet what its second and third elements should be.

The conventional implementation of this idea does essentially this, only instead of triples, they use pairs, whose first element is the same boolean flag just described, and whose second element in the case of non-empty lists are another pair holding the list's head and its tail. So the list [a, b, c] would be encoded as:

(false, (a, [b,c]))

where (x, y) represents the Lambda Calculus encoding of an ordered tuple (we're already familiar with this, \f. f x y), and [b,c] represents the tail of [a, b, c], encoded by this same scheme, namely:

(false, (b, [c]))

Here [c] represents the tail of [b, c] encoded by this same scheme, namely:

(false, (c, []))

Here [] represents the empty list, which we'll encode as:

(true, ???)

What should stand in for ??? No particular choice seems forced here. One strategy would be to go ahead and build your family of list operations, and see whether any particular choice for that slot in the empty list made some of the other operations easier to define or more elegant. Here's an example. We shouldn't expect the result of querying the head of [] to be meaningful; but what about querying its tail? Perhaps that should also be meaningless. Or you could make a case that [] is most naturally thought to be its own tail. If we went that way, it would be nice to let ??? in our encoding of [] be some value, such that, when we tried to extract the empty list's tail using the same operation that worked straightforwardly and naturally for non-empty lists, we got back the empty list itself. In fact it is possible to do this. (But it requires a fixed-point combinator, which we won't be discussing until next week.)

For the time being, let's not worry about what stands in for ??? in our encoding of []. Let's get clear instead on what our other basic list operations should look like.

Building a list from element x and an existing list xs isn't hard: we just build a pair whose first element is false and whose second element is the pair (x, xs):

cons ≡ \x xs. \f. f false (\g. g x xs)

Determining whether a list is empty is just a matter of extracting the first element of the outer pair:

empty? ≡ \xs. xs (\a b. a)

Given a non-empty list, extracting its head is just a matter of extracting the first element of the inner pair that is its second element:

head ≡ \xs. xs (\a b. b (\c d. c))

and so on.

Now think about how you'd define recursive operations like length or map, or any of the other operations we've recently been expressing as special cases of folds. It's not at all obvious how to do this. With a list like [a, b, c], my implementation of the length function could see that the list isn't empty, and so the length has to be at least 1, and it's easy enough to extract the tail [b, c] of this list. But then I am going to want to have the length function call itself recursively on that tail. And we don't know yet how to do that. We don't know how to get the effect of letrec inside the Lambda Calculus. We don't know yet how to have any lambda terms that refer to themselves.

This is in fact a formidable obstacle. The present encoding of lists makes some things easier than our (original) right-fold encoding of lists: it's easier to extract the tail, plus the whole system just seems simpler. But the compensating disadvantage is that we don't know how to perform recursive operations on the lists so encoded. At least, not until we work out a general strategy for expressing letrec in the Lambda Calculus. With the list encodings we looked at earlier, that "baked" the fold operation into the list's very construction, we didn't need any such general-purpose letrec. The natural recursive operations we wanted to perform on lists were already in our reach.

Let's consider one more encoding strategy for lists. This will have the same serious shortcoming as the simple encoding we just considered: we won't be able to do recursive operations with it until we have a general-purpose letrec. But in other respects it may be improvement on that encoding. That encoding might seem a bit ad hoc. Plus there's that matter of the ??? in our construction of [], where we don't know what it should be. If we proceed a bit differently, it will be easier to see some systematic rationale for our choices.

We've already seen some enumerations. These are "data structures" that consist of a fixed, finite number of discrete values. Such as true and false. Sometimes enumerations are understood to have a meaningful intrinsic order, but that's not important for our purposes here.

We've already seen how to encode data structures like this. We encode true as:

\y n. y

The idea is that the data structure is represented as a function. We pass that function several arguments. One of the arguments, bound to y, will capture what we want if we've got one variant (the true variant) of the data structure, and the other argument, bound to n, will capture what we want if we've got the other variant (the false variant). Our encoding of true just chooses the argument y that's understood to be associated with it, and returns that one, discarding the n.

We could easily extend this strategy to data structures that have more than two variants. For instance, suppose I want a system of colors, where the choices are only red, green, or blue. I might then encode red in the Lambda Calculus as:

\r g b. r

The idea is that if some program has ahold of some as-yet-unknown rgb-color, it can pass that color three arguments, and if the color is red, the result will be the first of those passed arguments. Of course the order of which argument goes with red, which with green, and which with blue, is entirely conventional --- just as it was with the booleans. We only need to stay consistent.

Another idea we've seen is to have data structures that have parameters. They are a kind of container that can hold one (or more) other values. Lists are a more complex example of this than I want to focus on yet. The simpler examples are just our ordered pairs, and triples, and so on. An ordered triple (a, b, c) is encoded as:

\f. f a b c

The idea here is that if some program has ahold of some as-yet-unknown triple, it can pass that triple a single function argument f, and the triple will apply that function argument to its members.

This represents a second kind of building block for data structures. We can combine these two building blocks. We can have data structures that have one or more variants, each of which variant has 0 or more parameters.

The booleans represent the case of two variants, each having 0 parameters or members. The rgb-colors we just sketched give us a third variant, but still each variant has 0 parameters or members. The triples have only a single variant, but in this case each instance of that variant has 3 parameters. We can combine these building blocks in more complex ways. For instance, let's define a different kind of color data structure. This one will have four variants: cyan, magenta, yellow, and gray. Each of the first three variants takes 0 parameters. But the gray variant takes two parameters. (Suppose one is a natural number representing amount of brightness, with 0 being black, and the other is a natural number representing amount of glossiness. I realize these aren't the most realistic examples. Play along for pedagogy.) Then we'd encode cyan like this:

\c m y g. c

Similarly for magenta and yellow. We'd encode an instance of gray whose two parameters are a and b like this:

\c m y g. g a b

The idea here is that if some program has ahold of some as-yet-unknown cmyg-color, it can pass that color four arguments, the last of which must be a curried function accepting (at least) two arguments. In the case that the color is an instance of gray, it will apply that function argument to its brightness and its glossiness parameters.

I hope this all seems natural and systematic. When we get to discussing types, you'll see that the strategy deployed here has great generality. (You may even see the encoding strategy discussed above, in terms of pairs of pairs, as an approximate implementation of it.)

Now, what about lists? A list is basically just like our cmyg-colors, with just a few slight changes. First, with lists there is only one variant that has no parameters, namely []. So that takes the place of one of cyan, magenta, and yellow, and we don't need the other two. Second, with lists the remaining variant also takes two parameters, but the first parameter needn't necessarily be a number. We haven't said what the heads of lists can be. Or rather, we haven't done anything to preclude you from consing any lambda term onto an existing list. Third, the second parameter of a non-empty list is also understood to not be a number, but rather to be another list --- perhaps another non-empty list, but eventually it would have to be the empty list. (Else what you've got is not understood to be an encoded list at all, but rather just some possibly garbled lambda term that only resembles a list. Consider Scheme's notion of "improper lists," that we discuss elsewhere.)

That last modification would be like, if in our encoding of cmyg-colors, the second parameter of the gray variant weren't a number, but rather another cmyg-color. Perhaps people sometimes paint multiple layers of gray paint (of possibly varying brightness) on top of already-painted walls. The base coat is always cyan, magenta, or yellow. On top of that is some finite number of different coats of gray. Each layer of gray paint keeps track of its own brightness, and what layer of paint lies directly beneath it. That would be a structure more like what we have with lists.

Here are some proposed list definitions based on these ideas:

[]     ≡ \f n. n
[b]    ≡ \f n. f b []   ≡ \f n. f b (\f n. n)
[a, b] ≡ \f n. f a [b]  ≡ \f n. f a (\f n. f b (\f n. n))
cons   ≡ \x xs. \f n. f x xs
empty? ≡ \xs. xs (\y ys. false) true
tail   ≡ \xs. xs (\y ys. ys) err

Or perhaps we should make tail return [] when applied to [], rather than err.

As we said, with this encoding system too, we'd still need some general strategy for expressing letrec, before we'd be able to define functions like length and map and so on.