topics/week1.mdwn

   1 These notes will recapitulate, make more precise, and to some degree expand what we did in the last hour of our first meeting, leading up to the definitions of the `factorial` and `length` functions.
   2
   3 ### Getting started ###
   4
   5 We begin with a decidable fragment of arithmetic. Our language has some **literal values**:
   6
   7     0, 1, 2, 3, ...
   8
   9 In fact we could get by with just the literal `0` and the `succ` function, but we will make things a bit more convenient by allowing literal expressions of any natural number. We won't worry about numbers being too big for our finite computers to handle.
  10
  11 We also have some predefined functions:
  12
  13     succ, +, *, pred, -
  14
  15 Again, we might be able to get by with just `succ`, and define the others in terms of it, but we'll be a bit more relaxed. Since we want to stick with natural numbers, not the whole range of integers, we'll make `pred 0` just be `0`, and `2 - 4` also be `0`.
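
As a rough illustration, here is how the truncated `pred` and `-` just described might look in Python (a stand-in, not the made-up language of these notes). The name `monus` is only an illustrative label for truncated subtraction; it is not part of the language being defined here.

```python
def succ(n):
    return n + 1

def pred(n):
    # Stay inside the natural numbers: pred 0 is just 0.
    return max(n - 1, 0)

def monus(m, n):
    # Truncated subtraction: 2 - 4 is 0, not -2.
    return max(m - n, 0)

print(pred(0), pred(3), monus(2, 4), monus(4, 2))  # prints: 0 2 0 2
```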
  16
  17 Here's another set of functions:
  18
  19     ==, <, >, <=, >=, !=
  20
  21 `==` is just what we non-programmers normally express by `=`. It's a relation that holds or not between two values. Here we'll treat it as a function that takes two values as arguments and returns a **boolean** value, that is, a truth-value, as a result. The reason for using the doubled `=` symbol is that the single `=` symbol tends to get used in lots of different roles in programming, so we reserve `==` to express this meaning. I will deliberately try to minimize the uses of single `=` in this made-up language (but not eliminate it entirely), to reduce ambiguity and confusion. The `==` relation---or as we're treating it here, the `==` *function* that returns a boolean value---can at least take two numbers as arguments. Probably it makes sense for it to take other kinds of values as arguments, too. For example, it should operate on two truth-values as well. Maybe we'd want it to operate on a number and a truth-value, too, and always return `'false` in that case? What about operating on two functions? Here we encounter the difficulty that the computer can't in general *decide* when two functions are equivalent. Let's not try to sort this all out just yet. We'll suppose that `==` can at least take two numbers as arguments, or two truth-values.
  22
  23 As mentioned in class, we represent the truth-values like this:
  24
  25     'true, 'false
  26
  27 These are instances of a broader class of literal values that I called **symbolic atoms**. We'll return to them shortly. The reason we write them with an initial `'` will also be explained shortly. For now, it's enough to note that the expression:
  28
  29     1 + 2 == 3
  30
  31 evaluates to `'true`, and the expression:
  32
  33     1 + 0 == 3
  34
  35 evaluates to `'false`. Something else that evaluates to `'false` is the simple expression:
  36
  37     'false
  38
  39 That is, literal values are a limiting case of expressions: they evaluate to just themselves. More complex expressions like `1 + 0` don't evaluate to themselves, but rather evaluate down to literal values.
  40
  41 The functions `succ` and `pred` come before their arguments, like this:
  42
  43     succ 1
  44
  45 On the other hand, the functions `+`, `*`, `-`, `==`, and so on come in between their arguments, like this:
  46
  47     x < y
  48
  49 Functions of this latter sort are said to have an "infix" syntax. This is just a convenience for how we write them. Our language will have to keep rigorous track of which functions have infix syntax and which don't, but we'll just rely on context and our brains to make sense of this for now. Functions with the ordinary, non-infix syntax can take two arguments, as well. If we had defined the less-than relation (boolean function) in that style, we'd write it like this instead:
  50
  51     lessthan? (x, y)
  52
  53 or perhaps like this:
  54
  55     lessthan? x y
  56
  57 We'll get more acquainted with the difference between these next week. For now, I'll just stick to the first form.
  58
  59 Another set of operations we have are:
  60
  61     and, or, not
  62
  63 The first two of these are infix functions that expect two boolean arguments, and give a boolean result. The third is a function that expects only one boolean argument. Our earlier function `!=` means "doesn't equal", and:
  64
  65     x != y
  66
  67 will be just another way to write:
  68
  69     not (x == y)
  70
  71 You see that you can use parentheses in the standard way. By the way, `<=` means &le; or "less than or equal to", and `>=` means &ge;, just in case you haven't seen them written this way before.
  72
  73 I've started throwing in some **variables**. We'll say variables are any expression that's written with an initial lower-case letter, then is followed by a sequence of zero or more upper- or lower-case letters, or numerals, or underscores (`_`). Then at the end you can optionally have a `?` or `!` or a sequence of `'`s, understood as "primes." Hence, all of these are legal variables:
  74
  75     x
  76     x1
  77     x_not_y
  78     xUBERANT
  79     x'
  80     x''
  81     x?
  82     xs
  83
  84 We'll follow a *convention* of using variables with short names and a final `s` to represent collections like sequences (to be discussed below). But this is just a convention to help us remember what we're up to, not a strict rule of the language. We'll also follow a convention of only using variables ending in `?` to represent functions that return a boolean value. Thus, for example, `zero?` will be a function that expects a single number argument and returns a boolean corresponding to whether that number is `0`. `odd?` will be a function that expects a single number argument and returns a boolean corresponding to whether that number is odd. Above, I suggested we might use `lessthan?` to represent a function that expects *two* number arguments, and again returns a boolean result.
  85
  86 We also conventionally reserve variables ending in `!` for a different special class of functions, that we will explain later in the course.
  87
  88 In fact you can think of `succ` and `pred` and `not` and the rest as also being variables; it's just that these variables have been pre-defined in our language to be bound to functions we agreed upon in advance. You can even think of `==` and `<` as being variables, too, bound to other functions. But I haven't given you parsing rules yet which would make them legal variables, because they don't start with a lower-case letter. We can make the parsing rules more liberal later.
  89
  90 Only a few simple expressions in our language aren't variables. These include the literal values, and also **keywords** like `let` and `case` and so on that we'll discuss below. You can't use `let` as a variable, else the syntax of our language would become too hard to mechanically parse. (And probably too hard for our meager brains to parse, too.)
  91
  92 The rule for symbolic atoms is that a single quote `'` followed by any single word that could be a legal variable expresses such an atom, a different atom for each different expression.
  93 Thus `'false` is a symbolic atom, but so too are `'x` and `'succ`. For the time being, I'll restrict myself to only talking about the symbolic atoms `'true` and `'false`. These constitute a special subclass of symbolic atoms that we call the **booleans** or truth-values. Nothing deep hangs on them being a subclass of a larger type in this way; it just seems elegant. Some other languages make booleans their own special type, not a subclass of another type. Others make them a subclass of the numbers (yuck). We will think of them as a subclass of the symbolic atoms.
  94
  95 Note that when writing a symbolic atom there is no closing `'`, just a `'` at the beginning. That's enough to make the whole word, up to the next space (or whatever) count as expressing a symbolic atom. We use the initial `'` to make it easy for us to have a rich set of symbolic atoms, as well as a rich set of variables, without getting them mixed up. Variables never begin with `'`; symbolic atoms always do.
  96
  97 We call these things symbolic *atoms* because they aren't collections. Thus numbers are also atoms, but not symbolic ones. And functions are also atoms, but again, not symbolic ones.
  98
  99 Functions are another class of values we'll have in our language. They aren't "literal" values, though. Numbers and symbolic atoms are simple expressions in the language that evaluate to themselves. That's what we mean by calling them "literals." Functions aren't expressions in the language at all; they have to be generated from the evaluation of more complex expressions.
 100
 101 (By the way, I really am serious in thinking of *the numbers themselves* as being expressions in this language; rather than some "numerals" that aren't themselves numbers. We'll talk about this down the road. For now, don't worry about it too much.)
 102
 103 I said we wanted to be starting with a fragment of arithmetic, so we'll keep the function values off-stage for the moment, and also all the symbolic atoms except for `'true` and `'false`. So we've got numbers, truth-values, and some functions and relations (that is, boolean functions) defined on them. We also help ourselves to a notion of bounded quantification, as in &forall;`x < M.` &phi;, where `M` and &phi; are (simple or complex) expressions that evaluate to a number and a boolean, respectively. We limit ourselves to *bounded* quantification so that the fragment we're dealing with can be "effectively" or mechanically decided. (As we extend the language, we will lose that property, but it will be a topic for later discussion exactly when that happens.)
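
To see why bounded quantification stays mechanically decidable, here is a minimal sketch in Python (used only as a stand-in for our made-up language). The helper name `forall_below` is hypothetical, and Python's `True`/`False` play the role of `'true`/`'false`. Because only finitely many values lie below the bound, the check always finishes, provided the test applied to each value does.

```python
def forall_below(bound, phi):
    # Checks phi(x) for every natural number x < bound.
    # There are only finitely many such x, so this terminates
    # whenever phi itself does.
    return all(phi(x) for x in range(bound))

print(forall_below(10, lambda x: x + 0 == x))  # True
print(forall_below(10, lambda x: x < 5))       # False
```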
 104
 105 As I mentioned in class, I will sometimes write &forall; x : &psi; . &phi; in my informal metalanguage, where the &psi; clause represents the quantifier's *restrictor*. Other people write this like `[`&forall; x : &psi; `]` &phi;, or in various other ways. My notation is meant to parallel the notation some linguists (for example, Heim &amp; Kratzer) use in writing &lambda; x : &psi; . &phi;, where the &psi;  clause restricts the range of arguments over which the function designated by the &lambda;-expression is defined. Later we will see the colon used in a somewhat similar (but also somewhat different) way in our programming languages. But that's foreshadowing.
 106
 107
 108 ### Let and lambda ###
 109
 110 So we have bounded quantification as in &forall; `x < 10.` &phi;. Obviously we could also make sense of &forall; `x == 5.` &phi; in just the same way. This would evaluate &phi; but with the variable `x` now bound to the value `5`, ignoring whatever it may be bound to in broader contexts. I will express this idea in a more perspicuous vocabulary, like this: `let x be 5 in` &phi;. (I say `be` rather than `=` because, as I mentioned before, it's too easy for the `=` sign to get used for too many subtly different jobs.)
 111
 112 As one of you was quick to notice in class, when I shift to the `let`-vocabulary, I no longer restrict myself to just the case where &phi; evaluates to a boolean. I also permit myself expressions like this:
 113
 114     let x be 5 in x + 1
 115
 116 which evaluates to `6`. That's right. I am moving beyond the &forall; `x==5.` &phi; idea when I do this. But the rules for how to interpret this are just a straightforward generalization of our existing understanding for how to interpret bound variables. So there's nothing fundamentally novel here.
 117
 118 We can have multiple `let`-expressions embedded, as in:
 119
 120     let y be (let x be 5 in x + 1) in 2 * y
 121
 122     let x be 5 in let y be x + 1 in 2 * y
 123
 124 both of which evaluate to `12`. When we have a stack of `let`-expressions as in the second example, I will write it like this:
 125
 126     let
 127       x be 5;
 128       y be x + 1
 129     in 2 * y
 130
 131 It's okay to also write it all inline, like so: `let x be 5; y be x + 1 in 2 * y`. The `;` represents that we have a couple of `let`-bindings coming in sequence. The earlier bindings in the sequence are considered to be in effect for the later right-hand expressions in the sequence. Thus in:
 132
 133     let x be 0 in (let x be 5; y be x + 1 in 2 * y)
 134
 135 the `x + 1` that is evaluated to give the value that `y` gets bound to uses the (more local) binding of `x` to `5`, not the (previous, less local) binding of `x` to `0`. By the way, the parentheses in that displayed expression were just to focus your attention. It would have parsed and meant the same without them.
 136
 137 Now we can allow ourselves to introduce &lambda;-expressions in the following way. If a &lambda;-expression is applied to an argument, as in: `(`&lambda; `x.` &phi;`) M`, for any (simple or complex) expressions &phi; and `M`, this means the same as: `let x be M in` &phi;. That is, the argument `M` to the &lambda;-expression provides (when evaluated) a value for the variable `x` to be bound to, and then the result of the whole thing is whatever &phi; evaluates to, under that binding to `x`.
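
The equivalence between `let` and an applied &lambda;-expression can be tried out in any language with anonymous functions. Here is a sketch in Python (a stand-in, not our made-up language), where each `let x be M in` &phi; is written as a `lambda` applied immediately to `M`. The names `result1` through `result4` are only labels for this illustration.

```python
# let x be 5 in x + 1
result1 = (lambda x: x + 1)(5)

# let y be (let x be 5 in x + 1) in 2 * y
result2 = (lambda y: 2 * y)((lambda x: x + 1)(5))

# let x be 5 in let y be x + 1 in 2 * y
result3 = (lambda x: (lambda y: 2 * y)(x + 1))(5)

# let x be 0 in (let x be 5; y be x + 1 in 2 * y)
# The inner binding of x shadows the outer one.
result4 = (lambda x: (lambda x: (lambda y: 2 * y)(x + 1))(5))(0)

print(result1, result2, result3, result4)  # prints: 6 12 12 12
```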
 138
 139 If we restricted ourselves to only that usage of &lambda;-expressions, that is when they were applied to all the arguments they're expecting, then we wouldn't have moved very far from the decidable fragment of arithmetic we began with.
 140
 141 However, it's tempting to help ourselves to the notion of (at least partly) *unapplied* &lambda;-expressions, too. If I can make sense of what:
 142
 143 `(`&lambda; `x. x + 1) 5`
 144
 145 means, then I can make sense of what:
 146
 147 `(`&lambda; `x. x + 1)`
 148
 149 means, too. It's just *the function* that waits for an argument and then returns the result of `x + 1` with `x` bound to that argument.
 150
 151 This does take us beyond our (first-order) fragment of arithmetic, at least if we allow the bodies and arguments of &lambda;-expressions to be any expressible value, including other &lambda;-expressions. But we're having too much fun, so why should we hold back?
 152
 153 So now we have a new kind of value our language can work with, alongside numbers and booleans. We now have function values, too. We can bind these function values to variables just like other values:
 154
 155 `let id be` &lambda; `x. x; y be id 5 in y`
 156
 157 evaluates to `5`. In reaching that result, the variable `id` was temporarily bound to the identity function, that expects an argument, binds it to the variable `x`, and then returns the result of evaluating `x` under that binding.
 158
 159 This is what is going on, behind the scenes, with all the expressions like `succ` and `+` that I said could really be understood as variables. They have just been pre-bound to certain agreed-upon functions rather than others.
 160
 161
 162 ### Containers ###
 163
 164 So far, we've only been talking about *atomic* values. Our language will also have some *container* values, that have other values as members. One example is **ordered sequences**, like:
 165
 166     [10, 20, 30]
 167
 168 This is a sequence of length 3. It's the result of *cons*ing the value `10` onto the front of the shorter, length-2 sequence `[20, 30]`. In this made-up language, we'll represent the sequence-consing operation like this:
 169
 170     10 & [20, 30]
 171
 172 If you want to know why we call it "cons", that's because this is what the operation is called in Scheme, and they call it that as shorthand for "constructing" the longer list (they call it a "list" rather than a "sequence") out of the components `10` and `[20, 30]`. The name is a bit unfortunate, though, because other structured values besides lists also get "constructed", but we don't say "cons" about them. Still, this is the tradition. Let's just take "cons" to be a nonsense label with an interesting back-history.
 173
 174 The sequence `[20, 30]` in turn is the result of:
 175
 176     20 & [30]
 177
 178 and the sequence `[30]` is the result of consing `30` onto the empty sequence `[]`. Note that the sequence `[30]` is not the same as the number `30`. The former is a container value, with one element. The latter is an atomic value, and as such won't have any elements. If you try to do this:
 179
 180     [30] + 1
 181
 182 it won't work. We haven't discussed what happens with illegal expressions like that, or like `'true + 1`. For the time being, I'll just say these "don't work", or that they "crash". We'll discuss the variety of ways these illegalities might be handled later.
 183
 184 Also, if you try to do this:
 185
 186     20 & 30
 187
 188 it won't work. The consing operator `&` always requires a container (here, a sequence) on its right-hand side. And `30` is not a container.
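
Here is a rough model of sequence-consing in Python (a stand-in, not our made-up language), with Python lists playing the role of sequences. The function name `cons` and the choice to signal the "crash" with a `TypeError` are assumptions made for this sketch.

```python
def cons(head, tail):
    # Sketch of `head & tail` for sequences, modeled with Python lists.
    # The right-hand side must itself be a sequence.
    if not isinstance(tail, list):
        raise TypeError("right-hand side of & must be a sequence")
    return [head] + tail

print(cons(10, [20, 30]))                # [10, 20, 30]
print(cons(10, cons(20, cons(30, []))))  # [10, 20, 30]
print([30] == 30)                        # False: a one-element sequence is not a number

try:
    cons(20, 30)
except TypeError as err:
    print("crash:", err)
```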
 189
 190 We've said that:
 191
 192     [10, 20, 30]
 193
 194 is the same as:
 195
 196     10 & (20 & (30 & []))
 197
 198 and the latter can also be written without the parentheses. Our language knows that `&` should always be understood as "implicitly associating to the right", that is, that:
 199
 200     10 & 20 & 30 & []
 201
 202 should be interpreted like the expression displayed before. Other operators like `-` should be understood as "implicitly associating to the left." If we write:
 203
 204     30 - 2 - 1
 205
 206 we presumably want it to be understood as:
 207
 208     (30 - 2) - 1
 209
 210 not as:
 211
 212     30 - (2 - 1)
 213
 214 Other operators don't implicitly associate at all. For example, you may understand the expression:
 215
 216     10 < x < 20
 217
 218 because we have familiar conventions about what it means. But what it means is not:
 219
 220     (10 < x) < 20
 221
 222 The result of the parenthesized expression is either `'true` or `'false`, assuming `x` evaluates to a number. But `'true < 20` doesn't mean anything, much less what we expect `10 < x < 20` to mean. So `<` doesn't implicitly associate to the left. Neither does it implicitly associate to the right. If you want expressions like `10 < x < 20` to be meaningful, they will need their own special rules.
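
Python happens to illustrate both points, so here is a short sketch in it (a stand-in, not our made-up language). Subtraction associates to the left, and chained comparisons get their own special rule. Python is also one of the languages that treats booleans as numbers, which is the only reason the explicitly grouped comparison below evaluates at all; in our language `'true < 20` would not mean anything. The variable names are only labels for this illustration.

```python
from functools import reduce

# Left association: 30 - 2 - 1 is (30 - 2) - 1.
left = reduce(lambda acc, n: acc - n, [2, 1], 30)
print(left, (30 - 2) - 1, 30 - (2 - 1))  # prints: 27 27 29

# Python gives chained comparisons a special rule of their own:
# 10 < x < 20 means (10 < x) and (x < 20), not (10 < x) < 20.
x = 5
chained = 10 < x < 20
grouped = (10 < x) < 20  # Python booleans are numbers, so this is False < 20
print(chained, grouped)  # prints: False True
```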
 223
 224 Sequences are containers that keep track of the order of their elements, and also those elements' multiplicity (how many times each one appears). Other containers might also keep track of these things, and more structural properties too, or they might keep track of less. Let's say we also have **set containers** too, like this:
 225
 226     {10, 20, 30}
 227
 228 Whereas the sequences `[10, 20, 10]`, `[10, 20]`, and `[20, 10]` are three different sequences, `{10, 20, 10}`, `{10, 20}`, and `{20, 10}` would just be different ways of expressing a single set.
 229
 230 We can let the `&` operator do extra-duty, and express the "consing" relation for sets, too:
 231
 232     10 & {20}
 233
 234 evaluates to `{10, 20}`, and so too does:
 235
 236     10 & {10, 20}
 237
 238 As I mentioned in class, we'll let `&&` express the operation by which two sequences are appended or concatenated to each other:
 239
 240     [10, 20] && [30, 40, 50]
 241
 242 evaluates to `[10, 20, 30, 40, 50]`. For sets, we'll let `and` and `or` and `-` do extra duty, and express set intersection, set union, and set subtraction, when their arguments are sets. If the arguments of `and` and `or` are booleans, on the other hand, or the arguments of `-` are numbers, then they express the functions we were understanding them to express before.
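
The contrast between sequences and sets can be sketched in Python (a stand-in, not our made-up language), using Python lists for sequences and Python sets for sets. The helper name `cons_set` is hypothetical. Note that Python's own `&` operator means set intersection, not consing, so the sketch uses named methods to avoid confusion.

```python
# Sequences (Python lists) track order and multiplicity; sets track neither.
print([10, 20, 10] == [10, 20])              # False
print([10, 20] == [20, 10])                  # False
print({10, 20, 10} == {10, 20} == {20, 10})  # True

def cons_set(x, s):
    # Sketch of `x & s` when s is a set.
    return {x} | s

print(cons_set(10, {20}) == {10, 20})        # True
print(cons_set(10, {10, 20}) == {10, 20})    # True

# && on sequences is concatenation.
print([10, 20] + [30, 40, 50])               # [10, 20, 30, 40, 50]

# and / or / - on sets: intersection, union, subtraction.
a, b = {10, 20, 30}, {20, 30, 40}
print(a.intersection(b) == {20, 30})         # True
print(a.union(b) == {10, 20, 30, 40})        # True
print(a.difference(b) == {10})               # True
```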
 243
 244 In addition to sequences, there's another kind of expression that might initially be confused with them. We might call these **tuples** or **multivalues**. They are written surrounded by parentheses rather than square brackets. Here's an example:
 245
 246 `(0, 'true,` &lambda;`x. x)`
 247
 248 That's a multivalue or tuple with 3 elements (also called a "triple").
 249
 250 In the programming languages and other formal systems we'll be looking at, tuples and sequences are usually understood and handled differently. This is because we apply different assumptions to them. In the case of sequences, it's assumed that they will have homogeneously-typed elements, and that their length will be irrelevant to their own type. So you can have the sequence:
 251
 252     [20, 30]
 253
 254 and the sequence:
 255
 256     [30]
 257
 258 and even the sequence:
 259
 260     []
 261
 262 and these will all be of the same type, namely a sequence of numbers. You can have sequences with other types of elements, too, for example a sequence of booleans:
 263
 264     ['true, 'false, 'true]
 265
 266 or a sequence of sequences of numbers:
 267
 268     [[10, 20], [], [30]]
 269
 270 An excellent question that came up in class is "How do we tell whether `[]` expresses the empty sequence of numbers or the empty sequence of something else?" We will discuss that question in later weeks. It's central to some of the developments we'll be exploring. For now, just put that question on a mental shelf and assume that somehow this just works out right.
 271
 272 Now whereas sequences expect homogeneously-typed elements, and their length is irrelevant to their own type, multivalues or tuples are the opposite in both respects. They may have elements of heterogeneous type, as our example:
 273
 274 `(0, 'true,` &lambda;`x. x)`
 275
 276 did. They need not, but they may. Also, the type of a multivalue or tuple does depend on its length, and moreover on the specific types of each of its elements. A tuple of length 2 (also called a "pair") whose first element is a number and second element is a boolean is a different type of thing than a tuple whose first element is a boolean and whose second element is a number. Most functions expecting the first as an argument will "crash" if you give them the second instead.
 277
 278 Earlier I said that we can call these things "multivalues or tuples". Here I'll make a technical comment, that in fact I'll understand these slightly differently. Really I'll understand the bare expression `(10, x)` to express a multivalue, and to express a tuple proper, you'll have to write `Pair (10, x)` or something like that. The difference between these is that only the tuple proper is a single value that can be bound to a single variable. The multivalue isn't a single value at all, but rather a plurality of values. This is a bit subtle, and other languages we're looking at this term don't always make this distinction. But the result is that they have to say complicated things elsewhere. If we permit ourselves this fine distinction here, many other things downstream will go more smoothly than they do in the languages that don't make it. Ours is just a made-up language, but I've thought this through carefully, so humor me. We haven't yet introduced the apparatus to make sense of expressions like `Pair (10, x)`, so for the time being I'll just restrict myself to multivalues, not to tuples proper. The result will be that while we can say:
 279
 280     let x be [10, 20] in ...
 281
 282 that is, sequences are first-class values in our language, we can't say:
 283
 284     let x be (10, 'true) in ...
 285
 286 or even:
 287
 288     let x be (10, 20) in ...
 289
 290 However, intuitively it ought to make sense to say:
 291
 292     let (x, y) be (10, 'true) in ...
 293
 294 That should just bind the variable `x` to the value `10` and the variable `y` to the value `'true`, and go on to evaluate the rest of the expression with those bindings in place. In this particular example, we could equally have said:
 295
 296     let x be 10; y be 'true in ...
 297
 298 but in other examples it will be substantially more convenient to be able to bind `x` and `y` simultaneously. Here's an example:
 299
 300 `let`
 301 &nbsp;&nbsp;`f be` &lambda; `x. (x, 2*x);`
 302 &nbsp;&nbsp;`(x, y) be f 10`
 303 `in [x, y]`
 304
 305 which evaluates to `[10, 20]`. Note that we have the function `f` returning two values, rather than just one, just by having its body evaluate to a multivalue rather than to a single value.
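
For comparison, here is roughly the same thing in Python (a stand-in, not our made-up language). One caveat: Python does not draw the multivalue/tuple distinction made above. What `f` returns there is a single tuple value that could also be bound to one variable, so this sketch only mirrors the simultaneous binding of `x` and `y`.

```python
def f(x):
    # The body evaluates to two values at once.
    return x, 2 * x

x, y = f(10)   # binds x and y simultaneously
print([x, y])  # [10, 20]
```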
 306
 307 It's a little bit awkward to say `let (x, y) be ...`, so I propose we instead always say `let (x, y) match ...`. (This will be even more natural as we continue generalizing what we've done here, as we will in the next section.) For consistency, we'll say `match` instead of `be` in all cases, so that we write even this:
 308
 309     let
 310       x match 10
 311     in ...
 312
 313 rather than:
 314
 315     let
 316       x be 10
 317     in ...
 318
 319
 320
 321 ### Patterns ###
 322
 323 What we just introduced is what's known in programming circles as a "pattern". Patterns can look superficially like expressions, but the context in which they appear determines that they are interpreted as patterns not as expressions. The left-hand sides of the binding lists of a `let`-expression are always patterns. Simple variables are patterns. Interestingly, literal values are also patterns. So you can say things like this:
 324
 325     let
 326       0 match 0;
 327       [] match [];
 328       'true match 'true
 329     in ...
 330
 331 (`[]` is also a literal value, like `0` and `'true`.) This isn't very useful in this example, but it will enable us to do interesting things later. So variables are patterns and literal values are patterns. Also, a multivalue of any pattern is a pattern. (Strictly speaking, it's only a multipattern, but I won't fuss about this here.) That's why we can have `(x, y)` on the left-hand side of a `let`-binding: it's a pattern, just like `x` is. Notice that `(x, 10)` is also a pattern. So we can say this:
 332
 333     let
 334       (x, 10) match (2, 10)
 335     in x
 336
 337 which evaluates to `2`. What if you did, instead:
 338
 339     let
 340       (x, 10) match (2, 100)
 341     in x
 342
 343 or, more perversely:
 344
 345     let
 346       (x, 10) match 2
 347     in x
 348
 349 Those will be pattern-matching failures. The pattern has to "fit" the value it's being matched against, and that requires having the same structure, and also having the same literal values in whatever positions the pattern specifies literal values. A pattern-matching failure in a `let`-expression makes the whole expression "crash." Shortly though we'll consider `case`-expressions, which can recover from pattern-match failures in a useful way.
 350
 351 We can also allow ourselves some other kinds of complex patterns. For example, if `p` and `ps` are two patterns, then `p & ps` will also be a pattern, that can match non-empty sequences and sets. When this pattern is matched against a non-empty sequence, we take the first value in the sequence and match it against the pattern `p`; we take the rest of the sequence and match it against the pattern `ps`. (If either of those results in a pattern-matching failure, then `p & ps` fails to match too.) For example:
 352
 353     let
 354       x & xs match [10, 20, 30]
 355     in (x, xs)
 356
 357 evaluates to the multivalue `(10, [20, 30])`.
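
Python's starred assignment behaves much like the `x & xs` pattern on sequences, so it can serve as a quick sketch (Python is a stand-in here, not our made-up language). The names `y` and `ys` in the failing case are only for illustration.

```python
x, *xs = [10, 20, 30]
print((x, xs))  # (10, [20, 30])

# Matching against an empty sequence fails: there is no first element.
try:
    y, *ys = []
except ValueError as err:
    print("pattern-matching failure:", err)
```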
 358
 359 When the pattern `p & ps` is matched against a non-empty set, we just arbitrarily choose one value in the set and match it against the pattern `p`; and match the rest of the set, with that value removed, against the pattern `ps`. You cannot control what order the values are chosen in. Thus:
 360
 361     let
 362       x & xs match {10, 20, 30}
 363     in (x, xs)
 364
 365 might evaluate to `(20, {10, 30})` or to `(30, {10, 20})` or to `(10, {30, 20})`, or to one of these on Mondays and another on Tuesdays, and never to the third. You cannot control it or predict it. It's good style to only pattern match against sets when the final result will be the same no matter in what order the values are selected from the set.
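
A sketch of the same idea in Python (a stand-in, not our made-up language): `set.pop()` removes and returns an arbitrary element, so which value comes out first should not be relied on. The helper name `split_set` is hypothetical. The checks at the end hold whichever element is chosen, which is the style recommended above.

```python
def split_set(s):
    # Sketch of matching `x & xs` against a non-empty set.
    rest = set(s)   # copy, so the original set is left alone
    x = rest.pop()  # removes and returns an arbitrary element
    return x, rest

x, xs = split_set({10, 20, 30})

# These come out the same no matter which element was selected:
print(x in {10, 20, 30})           # True
print(len(xs))                     # 2
print((xs | {x}) == {10, 20, 30})  # True
```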
 366
 367 A question that came up in class was whether `x + y` could also be a pattern. In this language (and most languages), no. The difference between `x & xs` and `x + y` is that `&` is a *constructor* whereas `+` is a *function*. We will be talking about this more in later weeks. For now, just take it that `&` is special. Not every way of forming a complex expression corresponds to a way of forming a complex pattern.
 368
 369 Since as we said, `x & xs` is a pattern, we can let `x1 & x2 & xs` be a pattern as well, the same as `x1 & (x2 & xs)`. And since when we're dealing with expressions, we said that:
 370
 371     [x1, x2]
 372
 373 is the same as:
 374
 375     x1 & x2 & []
 376
 377 we might as well allow this for patterns, too, so that:
 378
 379     [x1, x2]
 380
 381 is a pattern, meaning the same as `x1 & x2 & []`. Note that while `x & xs` matches *any* non-empty sequence, of length one or more, `[x1, x2]` only matches sequences of length exactly two.
 382
 383 For the time being, these are the only patterns we'll allow. But since the definition of patterns is recursive, this permits very complex patterns. What would this evaluate to?
 384
 385     let
 386       ([xs, ys], [z & zs, ws]) match ([[], [1]], [[10, 20, 30], [0]])
 387     in z & ys
 388
 389 Also, we will permit complex patterns in &lambda;-expressions, too. So you can write:
 390
 391 &lambda;`(x, y).` &phi;
 392
 393 as well as:
 394
 395 &lambda;`x.` &phi;
 396
 397 You can even write:
 398
 399 &lambda; `[x, 10].` &phi;
 400
 401 just be sure to always supply that function with arguments that are two-element sequences whose second element is `10`. If you don't, you will have a pattern-matching failure and the interpretation of your expression will "crash".
 402
 403 Thus, you can now do things like this:
 404
 405 `let`
 406 &nbsp;&nbsp;`f match` &lambda;`(x, y). (x, x + y, x + 2*y, x + 3*y);`
 407 &nbsp;&nbsp;`(a, b, c, d) match f (10, 1)`
 408 `in (b, d)`
 409
 410 which evaluates `f (10, 1)` to `(10, 11, 12, 13)`, which it matches against the complex pattern `(a, b, c, d)`, binding all four of the contained variables, and then evaluates `(b, d)` under those bindings, giving us the result `(11, 13)`.
 411
 412 Notice that in the preceding expression, the variables `a` and `c` were never used. So the values they're bound to are ignored or discarded. We're allowed to do that, but there's also a special syntax to indicate that this is what we're up to. This uses the special pattern `_`:
 413
 414 `let`
 415 &nbsp;&nbsp;`f match` &lambda;`(x, y). (x, x + y, x + 2*y, x + 3*y);`
 416 &nbsp;&nbsp;`(_, b, _, d) match f (10, 1)`
 417 `in (b, d)`
 418
 419 The role of `_` here is just to occupy a slot in the complex pattern `(_, b, _, d)`, to make it a multivalue of four values, rather than one of only two.
 420
 421 One last wrinkle. What if you tried to make a pattern like this: `[x, x]`, where some variable occurs multiple times? This is known as a "non-linear pattern". Some languages permit these (and require that the values being bound against `x` in the two positions be equal). Many languages don't permit it. Let's agree not to do this.
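
To pull the rules of this section together, here is a small pattern matcher sketched in Python (a stand-in, not our made-up language, and not how any real implementation has to work). Everything named here is an assumption of the sketch: `Var` stands for a variable pattern, `Cons` for `p & ps`, `WILD` for `_`, Python tuples for multivalue patterns, and Python lists for patterns like `[x1, x2]`. Anything else is treated as a literal. A failed match returns `None`, and a non-linear pattern is rejected with an error if it is noticed while matching.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Var:
    name: str

@dataclass(frozen=True)
class Cons:
    head: object
    tail: object

WILD = object()  # stands in for the pattern _


def match(pattern, value):
    """Return a dict of variable bindings, or None on a match failure."""
    if pattern is WILD:
        return {}
    if isinstance(pattern, Var):
        return {pattern.name: value}
    if isinstance(pattern, tuple):
        # Multivalue pattern: same length, and each position must match.
        if not isinstance(value, tuple) or len(pattern) != len(value):
            return None
        return match_all(pattern, value)
    if isinstance(pattern, list):
        # [p1, p2] is shorthand for p1 & p2 & []; it fixes the length.
        if not isinstance(value, list) or len(pattern) != len(value):
            return None
        return match_all(pattern, value)
    if isinstance(pattern, Cons):
        # p & ps matches any non-empty sequence.
        if not isinstance(value, list) or not value:
            return None
        return match_all((pattern.head, pattern.tail), (value[0], value[1:]))
    # Anything else is treated as a literal pattern.
    return {} if pattern == value else None


def match_all(patterns, values):
    bindings = {}
    for p, v in zip(patterns, values):
        sub = match(p, v)
        if sub is None:
            return None
        if bindings.keys() & sub.keys():
            raise ValueError("non-linear pattern: a variable occurs twice")
        bindings.update(sub)
    return bindings


print(match((Var("x"), 10), (2, 10)))                  # {'x': 2}
print(match((Var("x"), 10), (2, 100)))                 # None
print(match(Cons(Var("x"), Var("xs")), [10, 20, 30]))  # {'x': 10, 'xs': [20, 30]}
print(match((WILD, Var("b"), WILD, Var("d")), (10, 11, 12, 13)))  # {'b': 11, 'd': 13}
```

This sketch covers sequences only; matching `p & ps` against sets is left out.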
 422
 423
 424 ### Case and if ... then ... else ... ###
 425
 426 In class we introduced this form of complex expression:
 427
 428 `if` &phi; `then` &psi; `else` &chi;
 429
 430 Here &phi; should evaluate to a boolean, and &psi; and &chi; should evaluate to values of the same type. The result of the whole expression will be the result of &psi; if &phi; evaluates to `'true`, and otherwise the result of &chi;.
 431
 432 We said that that could be taken as shorthand for the following `case`-expression:
 433
 434 `case` &phi; `of`
 435 &nbsp;&nbsp;`'true then` &psi;`;`
 436 &nbsp;&nbsp;`'false then` &chi;
 437 `end`
 438
 439 The `case`-expression has a list of patterns and expressions. Its initial expression &phi; is evaluated and then attempted to be matched against each of the patterns in turn. When we reach a pattern that can be matched---that doesn't result in a match-failure---then we evaluate the expression after the `then`, using any variable bindings in effect from the immediately preceding match. (Any match that fails has no effect on future variable bindings. In this example, there are no variables in our patterns, so it's irrelevant.) What that right-hand expression evaluates to becomes the result of the whole `case`-expression. We don't attempt to do any further pattern-matching after finding a pattern that succeeds.
 440
 441 If a `case`-expression gets to the end of its list of patterns, and *none* of them have matched its initial expression, the result is a pattern-matching failure. So it's good style to always include a final pattern that's guaranteed to match anything. You could use a simple variable for this, or the special pattern `_`:
 442
 443     case 4 of
 444       1 then 'true;
 445       2 then 'true;
 446       x then 'false
 447     end
 448
 449     case 4 of
 450       1 then 'true;
 451       2 then 'true;
 452       _ then 'false
 453     end
 454
 455 will both evaluate to `'false`, without any pattern-matching failure.
 456
There's a superficial similarity between the `let`-constructions and the `case`-constructions. Each has a list whose left-hand sides are patterns and right-hand sides are expressions. Each also has an additional expression that stands out in a special position: in `let`-expressions at the end, in `case`-expressions at the beginning. But the relations of these different elements to each other are different. In `let`-expressions, the right-hand sides of the list supply the values that get bound to the variables in the patterns on the left-hand sides. Also, each pattern in the list will get matched, unless there's a pattern-match failure before we get to it. In `case`-expressions, on the other hand, it's the initial expression that supplies the value (or multivalue) that we attempt to match against each pattern, and we stop as soon as we reach a pattern that we can successfully match against. The variables in that pattern are then bound when evaluating the corresponding right-hand side expression.
 458
 459
 460 ### Recursive let ###
 461
 462 Given all these tools, we're (almost) in a position to define functions like the `factorial` and `length` functions we defined in class.
 463
 464 Here's an attempt to define the `factorial` function:
 465
 466 `let`
 467 &nbsp;&nbsp;`factorial match` &lambda; `n. if n == 0 then 1 else n * factorial (n-1)`
 468 `in factorial`
 469
 470 or, using `case`:
 471
 472 `let`
 473 &nbsp;&nbsp;`factorial match` &lambda; `n. case n of 0 then 1; _ then n * factorial (n - 1) end`
 474 `in factorial`
 475
 476 But there's a problem here. What value does `factorial` have when evaluating the subexpression `factorial (n - 1)`?
 477
 478 As we said in class, the natural precedent for this with non-function variables would go something like this:
 479
 480     let
 481       x match 0;
 482       y match x + 1;
 483       x match x + 1;
 484       z match 2 * x
 485     in (y, z)
 486
 487 We'd expect this to evaluate to `(1, 2)`, and indeed it does. That's because the `x` in the `x + 1` on the right-hand side of the third binding (`x match x + 1`) is evaluated under the scope of the first binding, of `x` to `0`.
 488
 489 We should expect the `factorial` variable in the right-hand side of our attempted definition to behave the same way. It will evaluate to whatever value it has before reaching this `let`-expression. We actually haven't said what is the result of trying to evaluate unbound variables, as in:
 490
 491     let
 492       x match y + 0
 493     in x
 494
 495 Let's agree not to do that. We can consider such expressions only under the implied understanding that they are parts of larger expressions that assign a value to `y`, as for example in:
 496
 497     let
 498       y match 1
 499     in let
 500       x match y + 0
 501     in x
 502
 503 Hence, let's understand our attempted definition of `factorial` to be part of such a larger expression:
 504
 505 `let`
 506 &nbsp;&nbsp;`factorial match` &lambda; `n. n`
 507 `in let`
 508 &nbsp;&nbsp;`factorial match` &lambda; `n. case n of 0 then 1; _ then n * factorial (n - 1) end`
 509 `in factorial 4`
 510
 511 This would evaluate to what `4 * factorial 3` does, but with the `factorial` in the expression bound to the identity function &lambda; `n. n`. In other words, we'd get the result `12`, not the correct answer `24`.
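
One way to see why the answer is `12` is to treat each `let` as a function application, so that a right-hand side can only see bindings made outside it. The Python sketch below does this explicitly. The helper name `make_attempt` is made up for the illustration, and Python is used here only as an analogy for our language.

```python
# Outer binding: factorial is the identity function, λn. n.
identity = lambda n: n

def make_attempt(factorial):
    # Inside here, `factorial` is whatever was bound OUTSIDE the inner `let`
    # (the identity function), not the function we are in the middle of defining.
    return lambda n: 1 if n == 0 else n * factorial(n - 1)

attempt = make_attempt(identity)

result = attempt(4)  # computed as 4 * identity(3)
print(result)  # 12
```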
 512
For the time being, we will fix this problem by just introducing a special new construction `letrec` that works the way we want. Now in:
 514
 515 `let`
 516 &nbsp;&nbsp;`factorial match` &lambda; `n. n`
 517 `in letrec`
 518 &nbsp;&nbsp;`factorial match` &lambda; `n. case n of 0 then 1; _ then n * factorial (n - 1) end`
 519 `in factorial 4`
 520
 521 the initial binding of `factorial` to the identity function gets ignored, and the `factorial` in the right-hand side of our definition is interpreted to mean the very same function that we are hereby binding to `factorial`. Exactly how this works is a deep and exciting topic, that we will be looking at very closely in a few weeks. For the time being, let's just accept that `letrec` does what we intuitively want when defining functions recursively.
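
Many programming languages make the recursive reading the default for function definitions. In Python, for instance, `def` behaves much like our `letrec`: the name `factorial` inside the body is looked up when the function is called, and by then it refers to the function itself. A minimal sketch, offered as an analogy only:

```python
# Earlier binding of the name, like `factorial match λn. n`.
factorial = lambda n: n

# Rebinding with `def`: the `factorial` in the body now refers to
# this very function, so the earlier binding is ignored.
def factorial(n):
    return 1 if n == 0 else n * factorial(n - 1)

result = factorial(4)
print(result)  # 24
```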
 522
**It's important to make sure you say `letrec` when that's what you want.** You may not *always* want `letrec`, though, if you're ever re-using variables (or doing other things) that rely on the bindings occurring in a specified order. With `letrec`, all the bindings in the construction happen simultaneously. This is why you can say, as Jim did in class:
 524
 525 `letrec`
&nbsp;&nbsp;`even? match` &lambda; `n. case n of 0 then 'true; _ then odd? (n-1) end;`
 527 &nbsp;&nbsp;`odd? match` &lambda; `n. case n of 0 then 'false; _ then even? (n-1) end`
 528 `in (even?, odd?)`
 529
 530 Here neither the `even?` nor the `odd?` pattern is matched before the other. They, and also the `odd?` and the `even?` variables in their right-hand side expressions, are all bound at once.
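
The same mutually recursive pair can be sketched in Python, where each function may mention the other because names are only looked up when the functions are called. The names `is_even` and `is_odd` are stand-ins, since Python doesn't allow `?` in a name, and the sketch assumes its argument is a natural number.

```python
def is_even(n):
    # case n of 0 then 'true; _ then odd? (n-1) end
    return True if n == 0 else is_odd(n - 1)

def is_odd(n):
    # case n of 0 then 'false; _ then even? (n-1) end
    return False if n == 0 else is_even(n - 1)

print(is_even(10), is_odd(10))  # True False
```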
 531
 532 As we said, this is deep and exciting, and it will make your head spin before we're done examining it. But let's trust `letrec` to do its job, for now.
 533
 534
 535 ### Comparing recursive-style and iterative-style definitions ###
 536
 537 Finally, we're in a position to revisit the two definitions of `length` that Jim presented in class. Here is the first:
 538
 539 `letrec`
 540 &nbsp;&nbsp;`length match` &lambda; `xs. case xs of [] then 0; _ & ys then 1 + length ys end`
 541 `in length`
 542
This function accepts a sequence `xs`, and if it's empty returns `0`; otherwise it says that the length is `1` plus the length of the remainder you get when you take away the first element. In programming circles, this remainder is commonly called the sequence's "tail" (and the first element is its "head").
 544
 545 Thus if we evaluated `length [10, 20, 30]`, that would give the same result as `1 + length [20, 30]`, which would give the same result as `1 + (1 + length [30])`, which would give the same result as `1 + (1 + (1 + length []))`. But `length []` is `0`, so our original expression evaluates to `1 + (1 + (1 + 0))`, or `3`.
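
Here is a rough Python counterpart of this first definition, for comparison. It uses Python lists in place of our sequences and an unpacking assignment in place of the `_ & ys` pattern. It is a sketch of the same recursion, not a translation of our syntax.

```python
def length(xs):
    if xs == []:         # pattern []
        return 0
    _, *ys = xs          # pattern _ & ys: ignore the head, name the tail ys
    return 1 + length(ys)

print(length([10, 20, 30]))  # 3
```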
 546
 547 Here's another way to define the `length` function:
 548
 549 `letrec`
 550 &nbsp;&nbsp;`aux match` &lambda; `(n, xs). case xs of [] then n; _ & ys then aux (n + 1, ys) end`
 551 `in` &lambda; `xs. aux (0, xs)`
 552
 553 This may be a bit confusing. What we have here is a helper function `aux` (for "auxiliary") that accepts *two* arguments, the first being a counter of how long we've counted in the sequence so far, and the second argument being how much more of the sequence we have to inspect. If the sequence we have to inspect is empty, then we're finished and we can just return our counter. (Note that we don't return `0`.) If not, then we add `1` to the counter, and proceed to inspect the tail of the sequence, ignoring the sequence's first element. After the `in`, we can't just return the `aux` function, because it expects two arguments, whereas `length` should just be a function of a single argument, the sequence whose length we're inquiring about. What we do instead is return a &lambda;-generated function, that expects a single sequence argument `xs`, and then returns the result of calling `aux` with that sequence together with an initial counter of `0`.
 554
So for example, if we evaluated `length [10, 20, 30]`, that would give the same result as `aux (0, [10, 20, 30])`, which would give the same result as `aux (1, [20, 30])`, which would give the same result as `aux (2, [30])`, which would give the same result as `aux (3, [])`, which would give `3`. (This should make it clear why when `aux` is called with the empty sequence, it returns the result `n` rather than `0`.)
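
The sequence of calls to `aux` can be made visible with a small Python sketch of the second definition. The name `length_traced` and the `calls` list are additions made purely for illustration: they record the arguments of each call so that the trace can be inspected afterwards.

```python
def length_traced(xs):
    calls = []  # arguments of every call to aux, in order (illustration only)

    def aux(n, rest):
        calls.append((n, rest))
        if rest == []:       # pattern []: return the counter, not 0
            return n
        _, *ys = rest        # pattern _ & ys
        return aux(n + 1, ys)

    return aux(0, xs), calls

result, calls = length_traced([10, 20, 30])
print(result)  # 3
for call in calls:
    print(call)
# (0, [10, 20, 30])
# (1, [20, 30])
# (2, [30])
# (3, [])
```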
 556
 557 Programmers will sometimes define functions in the second style because it can be evaluated more efficiently than the first style. You don't need to worry about things like efficiency in this seminar. But you should become acquainted with, and comfortable with, both styles of recursive definition.
 558
It may be helpful to contrast these recursive-style definitions with the way one would more naturally define the `length` function in an imperative language. This uses some constructs we haven't explained yet, but I trust their meaning will be intuitively clear enough.
 560
 561 `let`
 562 &nbsp;&nbsp;`empty? match` &lambda; `xs.` *this definition left as an exercise*;
 563 &nbsp;&nbsp;`tail match` &lambda; `xs.` *this definition left as an exercise*;
 564 &nbsp;&nbsp;`length match` &lambda; `xs. let`
 565 &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;`n := 0;`
 566 &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;`while not (empty? xs) do`
 567 &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;`n := n + 1;`
 568 &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;`xs := tail xs`
 569 &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;`end`
 570 &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;`in n`
 571 `in length`
 572
 573 Here there is no recursion. Rather what happens is that we *initialize* the variable `n` with the value `0`, and then so long as our sequence variable `xs` is non-empty, we *increment* that variable `n`, and *overwrite* the variable `xs` with the tail of the sequence that it is then bound to, and repeat in a loop (the `while ... do ... end` construction). This is similar to what happens in our second definition of `length`, using `aux`, but here it happens using *mutation* or *overwriting* the values of variables, and a special looping construction, whereas in the preceding definitions we achieved the same effect instead with recursion.
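
In an actual imperative language such as Python, the loop looks much the same. In the sketch below, Python's built-in list comparison and slicing stand in for `empty?` and `tail`. Those are assumptions of this illustration, not solutions to the exercises above.

```python
def length(xs):
    n = 0                  # n := 0
    while xs != []:        # while not (empty? xs) do
        n = n + 1          # n := n + 1
        xs = xs[1:]        # xs := tail xs
    return n               # in n

original = [10, 20, 30]
print(length(original))  # 3
print(original)          # [10, 20, 30]
```

Note that only the function's own variable `xs` is overwritten. The list passed in by the caller is left as it was.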
 574
 575 We will be looking closely at mutation later in the term. For the time being, our focus will instead be on the recursive and *immutable* style of doing things---meaning no variables get overwritten.
 576
 577 It's helpful to observe that in expressions like:
 578
 579     let
 580       x match 0;
 581       y match x + 1;
 582       x match x + 1;
 583       z match 2 * x
 584     in (y, z)
 585
the variable `x` has not been *overwritten* (mutated). Rather, we have *two* variables `x`, and it's just that the second one is *hiding* the first so long as its scope is in effect. Once its scope expires, the original variable `x` is still in place, with its original binding. A different example should help clarify this. What do you think this:
 587
 588     let
 589       x match 0;
 590       (y, z) match let
 591                      x match x + 1
 592                    in (x, 2*x)
 593     in ([y, z], x)
 594
 595 evaluates to? Well, consider the right-hand side of the second binding:
 596
 597                    let
 598                      x match x + 1
 599                    in (x, 2*x)
 600
This expression evaluates to `(1, 2)`, because it uses the outer binding of `x` to `0` for the right-hand side of its own binding `x match x + 1`. That gives us a new binding of `x` to `1`, which is in place when we evaluate `(x, 2*x)`. That's why the whole thing evaluates to `(1, 2)`. So now returning to the outer expression, `y` gets bound to `1` and `z` to `2`. But now what is `x` bound to in the final line, `([y, z], x)`? The binding of `x` to `1` was in place only until we got to `(x, 2*x)`. After that its scope expired, and the original binding of `x` to `0` reappears. So the final line evaluates to `([1, 2], 0)`.
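
The same hiding-without-overwriting can be seen in Python, where a function parameter named `x` is a new variable that hides an outer `x` only inside the function body. The function name `inner` is made up for this sketch, which models the inner `let` as a function call.

```python
x = 0                      # outer binding: x match 0

def inner(x):
    # This parameter `x` is a new variable that hides the outer `x` in here.
    return (x, 2 * x)

# The argument x + 1 is evaluated with the OUTER x (which is 0),
# just like the right-hand side of the inner `x match x + 1`.
y, z = inner(x + 1)

result = ([y, z], x)       # the outer x is still 0
print(result)  # ([1, 2], 0)
```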
 602
 603 This is very like what happens in ordinary predicate logic if you say:
 604
 605 &exist; `x. F x and (` &forall; `x. G x ) and H x`
 606
 607 The `x` in `F x` and in `H x` are governed by the outermost quantifier, and only the `x` in `G x` is governed by the inner quantifier.
 608
 609 ### That's enough ###
 610
 611 This was a lot of material, and you may need to read it carefully and think about it, but none of it should seem profoundly different from things you're already accustomed to doing. What we worked our way up to was just the kind of recursive definitions of `factorial` and `length` that you volunteered in class, before learning any programming.
 612
 613 You have all the materials you need now to do this week's [[assignment|/exercises/assignment1]]. Some of you may find it easy. Many of you will not. But if you understand what we've done here, and give it your time and attention, we believe you can do it.
 614
 615 There are also some [[advanced notes|week1 advanced notes]] extending this week's material.
 616
 617 ### Summary ###
 618
 619 Here is the hierarchy of **values** that we've talked about so far.
 620
 621 *   Multivalues
 622 *   Singular values, including:
 623     *   Atoms, including:
 624         *   Numbers: these are among the **literals**
 625         *   Symbolic atoms: these are also among the **literals**, and include:
 626             *   Booleans (or truth-values)
 627         *   Functions: these are not literals, but instead have to be generated by evaluating complex expressions
 628     *   Containers, including:
 629         *   the **literal containers** `[]` and `{}`
 630         *   Non-empty sequences, built using `&`
 631         *   Non-empty sets, built using `&`
 632         *   Tuples proper and other containers, to be introduced later
 633
 634 We've also talked about a variety of **expressions** in our language, that evaluate down to various values (if their evaluation doesn't "crash" or otherwise go awry). These include:
 635
 636 *   All of the literal atoms and literal containers
 637 *   Variables
 638 *   Complex expressions that apply `&` or some variable understood to be bound to a function to some arguments
 639 *   Various other complex expressions involving the keywords &lambda; or `let` or `letrec` or `case`
 640
The special syntax `[10, 20, 30]` is just shorthand for the more official syntax using `&` and `[]`, and likewise for `{10, 20, 30}`. The `if ... then ... else ...` syntax is just shorthand for a `case`-construction using the literal patterns `'true` and `'false`.
 642
 643 We also talked about **patterns**. These aren't themselves expressions, but form part of some larger expressions.
 644