From: Jim <jim.pryor@nyu.edu>
Date: Sun, 1 Feb 2015 07:37:56 +0000 (-0500)
Subject: update week1 notes
X-Git-Url: http://lambda.jimpryor.net/git/gitweb.cgi?p=lambda.git;a=commitdiff_plain;h=2e3825511fb9fb97a8a298b43ee1f2f24ec4017b

update week1 notes
---

diff --git a/week1.mdwn b/week1.mdwn
index c9aee3bc..1f500367 100644
--- a/week1.mdwn
+++ b/week1.mdwn
@@ -153,7 +153,7 @@ So now we have a new kind of value our language can work with, alongside numbers
 
 `let id be` &lambda; `x. x; y be id 5 in y`
 
-will evaluate to `5`. In reaching that result, the variable `id` was temporarily bound to the identity function, that expects an argument, binds it to the variable `x`, and then returns the result of evaluating `x` under that binding.
+evaluates to `5`. In reaching that result, the variable `id` was temporarily bound to the identity function, which expects an argument, binds it to the variable `x`, and then returns the result of evaluating `x` under that binding.
 
 This is what is going on, behind the scenes, with all the expressions like `succ` and `+` that I said could really be understood as variables. They have just been pre-bound to certain agreed-upon functions rather than others.
 
@@ -230,7 +230,7 @@ We can let the `&` operator do extra-duty, and express the "consing" relation fo
 
     10 & {20}
 
-would evaluate to `{10, 20}`, and so too would:
+evaluates to `{10, 20}`, and so too does:
 
     10 & {10, 20}
 
@@ -238,7 +238,7 @@ As I mentioned in class, we'll let `&&` express the operation by which two seque
 
     [10, 20] && [30, 40, 50]
 
-will evaluate to `[10, 20, 30, 40, 50]`. For sets, we'll let `and` and `or` and `-` do extra duty, and express set intersection, set union, and set subtraction, when their arguments are sets. If the arguments of `and` and `or` are booleans, on the other hand, or the arguments of `-` are numbers, then they express the functions we were understanding them to express before.
+evaluates to `[10, 20, 30, 40, 50]`. For sets, we'll let `and` and `or` and `-` do extra duty, and express set intersection, set union, and set subtraction, when their arguments are sets. If the arguments of `and` and `or` are booleans, on the other hand, or the arguments of `-` are numbers, then they express the functions we were understanding them to express before.
 
 In addition to sequences, there's another kind of expression that might initially be confused with them. We might call these **tuples** or **multivalues**. They are written surrounded by parentheses rather than square brackets. Here's an example:
 
@@ -301,7 +301,7 @@ but in other examples it will be substantially more convenient to be able to bin
 &nbsp;&nbsp;`(x, y) be f 10`  
 `in [x, y]`
 
-which will evaluate to `[10, 20]`. Note that we have the function `f` returning two values, rather than just one, just by having its body evaluate to a multivalue rather than to a single value.
+which evaluates to `[10, 20]`. Note that we have the function `f` returning two values, rather than just one, just by having its body evaluate to a multivalue rather than to a single value.
 
 It's a little bit awkward to say `let (x, y) be ...`, so I propose we instead always say `let (x, y) match ...`. (This will be even more natural as we continue generalizing what we've done here, as we will in the next section.) For consistency, we'll say `match` instead of `be` in all cases, so that we write even this:
 
@@ -327,13 +327,13 @@ What we just introduced is what's known in programming circles as a "pattern". P
       'true match 'true
     in ...
 
-(`[]` is also a literal value, like `0` and `'true`.) This isn't very useful in this example, but it will enable us to do interesting things later. So variables are patterns and literal values are patterns. Also, a multivalue of any pattern is a pattern. That's why we can have `(x, y)` on the left-hand side of a `let`-binding: it's a pattern, just like `x` is. Notice that `(x, 10)` is also a pattern. So we can say this:
+(`[]` is also a literal value, like `0` and `'true`.) This isn't very useful in this example, but it will enable us to do interesting things later. So variables are patterns and literal values are patterns. Also, a multivalue of any pattern is a pattern. (Strictly speaking, it's only a multipattern, but I won't fuss about this here.) That's why we can have `(x, y)` on the left-hand side of a `let`-binding: it's a pattern, just like `x` is. Notice that `(x, 10)` is also a pattern. So we can say this:
 
     let
       (x, 10) match (2, 10)
     in x
 
-which will evaluate to `2`. What if you did, instead:
+which evaluates to `2`. What if you did, instead:
 
     let
       (x, 10) match (2, 100)
@@ -353,7 +353,7 @@ We can also allow ourselves some other kinds of complex patterns. For example, i
       x & xs match [10, 20, 30]
     in (x, xs)
 
-will evaluate to the multivalue `(10, [20, 30])`.
+evaluates to the multivalue `(10, [20, 30])`.
 
 When the pattern `p & ps` is matched against a non-empty set, we just arbitrarily choose one value in the set match it against the pattern `p`; and match the rest of the set, with that value removed, against the pattern `ps`. You cannot control what order the values are chosen in. Thus:
 
@@ -406,7 +406,7 @@ Thus, you can now do things like this:
 &nbsp;&nbsp;`(a, b, c, d) match f (10, 1)`  
 `in (b, d)`
 
-which will evaluate `f (10, 1)` to `(10, 11, 12, 13)`, which it will match against the complex pattern `(a, b, c, d)`, binding all four of the contained variables, and then evaluate `(b, d)` under those bindings, giving us the result `(11, 13)`.
+which evaluates `f (10, 1)` to `(10, 11, 12, 13)`, which it will match against the complex pattern `(a, b, c, d)`, binding all four of the contained variables, and then evaluate `(b, d)` under those bindings, giving us the result `(11, 13)`.
 
 Notice that in the preceding expression, the variables `a` and `c` were never used. We're allowed to do that, but there's also a special syntax to indicate that we want to throw away a value like this. We use the special pattern `_`:
 
@@ -421,14 +421,137 @@ One last wrinkle. What if you tried to make a pattern like this: `[x, x]`, where
 
 ### Case and if/then/else ###
 
-*More coming*
+In class we introduced this form of complex expression:
+
+`if` &phi; `then` &psi; `else` &chi;
+
+Here &phi; should evaluate to a boolean, and &psi; and &chi; should evaluate to values of the same type. The whole expression evaluates to the result of &psi; if &phi; evaluates to `'true`, and otherwise to the result of &chi;.
+
+We said that that could be taken as shorthand for the following `case`-expression:
+
+`case` &phi; `of`  
+&nbsp;&nbsp;`'true then` &psi;  
+&nbsp;&nbsp;`'false then` &chi;  
+`end`
+
+The `case`-expression has a list of patterns and expressions. Its initial expression &phi; is evaluated, and we then attempt to match its value against each of the patterns in turn. When we reach a pattern that can be matched---that doesn't result in a match-failure---then we evaluate the expression after the `then`, using the variable bindings established by that successful match. (Any match that fails has no effect on future variable bindings.) That is the result of the whole `case`-expression; we don't attempt to do any further pattern-matching after finding a pattern that succeeds.
+
+If a `case`-expression gets to the end of its list of patterns, and *none* of them have matched its initial expression, the result is a pattern-matching failure. So it's good style to always include a final pattern that's guaranteed to match anything. You could use a simple variable for this, or the special pattern `_`:
+
+    case 4 of
+      1 then 'true;
+      2 then 'true;
+      x then 'false
+    end
+
+    case 4 of
+      1 then 'true;
+      2 then 'true;
+      _ then 'false
+    end
+
+Both of these evaluate to `'false`, without any pattern-matching failure.
+
+There's a superficial similarity between the `let`-constructions and the `case`-constructions. Each has a list whose left-hand sides are patterns and right-hand sides are expressions. Each also has an additional expression that stands out in a special position: in `let`-expressions at the end, in `case`-expressions at the beginning. But the relations of these elements to each other are different. In `let`-expressions, the right-hand sides of the list supply the values that get bound to the variables in the patterns on the left-hand sides. Also, each pattern in the list will get matched, unless there's a pattern-match failure before we get to it. In `case`-expressions, on the other hand, it's the initial expression that supplies the value (or multivalue) that we attempt to match against each pattern, and we stop as soon as we reach a pattern that we can successfully match against. The variables in that pattern are then bound when evaluating the corresponding right-hand side expression.
+
 
 ### Recursive let ###
 
-*More coming*
+Given all these tools, we're (almost) in a position to define functions like the `factorial` and `length` functions we defined in class.
+
+Here's an attempt to define the `factorial` function:
+
+`let`  
+&nbsp;&nbsp;`factorial match` &lambda; `n. if n == 0 then 1 else n * factorial (n-1)`  
+`in factorial`
+
+or, using `case`:
+
+`let`  
+&nbsp;&nbsp;`factorial match` &lambda; `n. case n of 0 then 1; _ then n * factorial (n - 1) end`  
+`in factorial`
+
+But there's a problem here. What value does `factorial` have when evaluating the expression `factorial (n - 1)`?
+
+As we said in class, the natural precedent for this with non-function variables would go something like this:
+
+    let
+      x match 0;
+      y match x + 1;
+      x match x + 1;
+      z match 2 * x
+    in (y, z)
+
+We'd expect this to evaluate to `(1, 2)`, and indeed it does. That's because the `x` in the `x + 1` on the right-hand side of the third binding (`x match x + 1`) is evaluated under the scope of the first binding, of `x` to `0`.
+
+We should expect the `factorial` variable in the right-hand side of our attempted definition to behave the same way. It will evaluate to whatever value it has before reaching this `let`-expression. We actually haven't said what the result is of trying to evaluate unbound variables, as in:
+
+    let
+      x match y + 0
+    in x
+
+Let's agree not to do that. We can consider such expressions only under the implied understanding that they are parts of larger expressions that assign a value to `y`, as for example in:
+
+    let
+      y match 1
+    in let
+      x match y + 0
+    in x
+
+Hence, let's understand our attempted definition of `factorial` to be part of such a larger expression:
+
+`let`  
+&nbsp;&nbsp;`factorial match` &lambda; `n. n`  
+`in let`  
+&nbsp;&nbsp;`factorial match` &lambda; `n. case n of 0 then 1; _ then n * factorial (n - 1) end`  
+`in factorial 4`
+
+This would evaluate to what `4 * factorial 3` does, but with the `factorial` in the expression bound to the identity function &lambda; `n. n`. In other words, we'd get the result `4 * 3`, that is `12`, not the correct answer `24`.
+
+For the time being, we will fix this problem by just introducing a special new construction `letrec` that works the way we want. Now in:
+
+`let`  
+&nbsp;&nbsp;`factorial match` &lambda; `n. n`  
+`in letrec`  
+&nbsp;&nbsp;`factorial match` &lambda; `n. case n of 0 then 1; _ then n * factorial (n - 1) end`  
+`in factorial 4`
+
+the initial binding of `factorial` gets ignored, and the `factorial` in the right-hand side of our definition is interpreted to mean the very same function that we are hereby binding to `factorial`. Exactly how this works is a deep and exciting topic that we will be looking at very closely in a few weeks. For the time being, let's just accept that `letrec` does what we intuitively want when defining functions recursively.
+
+**It's important to make sure you say `letrec` when that's what you want.** You may not *always* want `letrec`, though, if you're ever re-using variables (or doing other things) that rely on the bindings occurring in a specified order. With `letrec`, all the bindings in the construction happen simultaneously. This is why you can say, as Jim did in class:
+
+`letrec`  
+&nbsp;&nbsp;`even? match` &lambda; `n. case n of 0 then 'true; _ then odd? (n-1) end;`  
+&nbsp;&nbsp;`odd? match` &lambda; `n. case n of 0 then 'false; _ then even? (n-1) end`  
+`in (even?, odd?)`
+
+Here neither the `even?` nor the `odd?` pattern is matched before the other. They, and also the `odd?` and the `even?` variables in their right-hand side expressions, are all bound at once.
+
+As we said, this is deep and exciting, and it will make your head spin before we're done examining it. But let's trust `letrec` to do its job, for now.
+
 
 ### Comparing recursive-style and iterative-style definitions ###
 
-*More coming*
+Finally, we're in a position to revisit the two definitions of `length` that Jim presented in class. Here is the first:
+
+`letrec`  
+&nbsp;&nbsp;`length match` &lambda; `xs. case xs of [] then 0; _ & ys then 1 + length ys end`  
+`in length`
+
+This function accepts a sequence `xs`, and if it's empty returns `0`; otherwise it says that the length is `1` plus the length of the remainder you get when you take away the first element. In programming circles, this remainder is commonly called the sequence's "tail" (and the first element is its "head").
+
+Thus if we evaluated `length [10, 20, 30]`, that would give the same result as `1 + length [20, 30]`, which would give the same result as `1 + (1 + length [30])`, which would give the same result as `1 + (1 + (1 + length []))`. But `length []` is `0`, so our original expression evaluates to `1 + (1 + (1 + 0))`, or `3`.
+
+Here's another way to define the `length` function:
+
+`letrec`  
+&nbsp;&nbsp;`aux match` &lambda; `(n, xs). case xs of [] then n; _ & ys then aux (n + 1, ys) end`  
+`in` &lambda; `xs. aux (0, xs)`
+
+This may be a bit confusing. What we have here is a helper function `aux` (for "auxiliary") that accepts *two* arguments, the first being a counter of how many elements of the sequence we've counted so far, and the second being how much more of the sequence we have to inspect. If the sequence we have to inspect is empty, then we're finished and we can just return our counter. (Note that we don't return `0`.) If not, then we add `1` to the counter, and proceed to inspect the tail of the sequence, ignoring the sequence's first element. After the `in`, we can't just return the `aux` function, because it expects two arguments, whereas `length` should just be a function of a single argument, the sequence whose length we're inquiring about. What we do instead is return a &lambda;-generated function that expects a single sequence argument `xs`, and then returns the result of calling `aux` with that sequence together with an initial counter of `0`.
+
+So for example, if we evaluated `length [10, 20, 30]`, that would give the same result as `aux (0, [10, 20, 30])`, which would give the same result as `aux (1, [20, 30])`, which would give the same result as `aux (2, [30])`, which would give the same result as `aux (3, [])`, which would give `3`. (This should make it clear why when `aux` is called with the empty sequence, it returns the result `n` rather than `0`.)
+
+Programmers will sometimes define functions in the second style because it can be evaluated more efficiently than the first style. You don't need to worry about things like efficiency in this seminar. But you should become acquainted with, and comfortable with, both styles of recursive definition.