topics/week1_advanced_notes.mdwn

   1 These are some advanced notes extending the material presented this week. Don't worry about mastering this stuff now if you're already feeling saturated. But you will need to learn these things in the near future, so tackle them as soon as you're ready.
   2
   3 [[!toc]]
   4
   5 ### More on multivalues ###
   6
   7 A multivalue is a series of *zero or more* values. They are the result of evaluating expressions that consist of *zero or more* expressions, separated by commas, and enclosed in parentheses. So these expressions evaluate to multivalues:
   8
   9 `(10, x + 2, x > 0)`
  10 `(0,` &lambda;`x. x)`
  11
  12 but so too do these:
  13
  14     (10)
  15     ()
  16
  17 When there's only a single expression, the surrounding parentheses are optional---though you may want them anyway for grouping, as in:
  18
  19     30 - (2 - 1)
  20
  21 In order to supply the (length-one) multivalue `1` as the second argument to the `30 - ...` expression, you have to surround `2 - 1` by parentheses. If you just wrote:
  22
  23     30 - 2 - 1
  24
  25 it would be understood instead as:
  26
  27     (30 - 2) - 1
  28
  29 because the operator `-` implicitly associates to the left.
  30
  31 So officially I count ordinary single values as a special case of multivalue. Sometimes, though, I suspect I may say "multivalue" when what I mean is "multivalue that isn't a single value." If you catch me doing it, feel welcome to correct me.
  32
  33 We'll talk more about the length-zero multivalues later in the seminar.
  34
  35
  36 ### Comments ###
  37
  38 Programming languages use lots of conventions for comments. Some languages have two kinds of comments: one sort starts at some special character or characters and extends to the end of the line, the other sort starts with some special opening characters and continues on, possibly for many lines, until reaching the special closing characters. Many Scheme implementations actually have *three* different syntaxes for comments.
  39
  40 In our toy language, we will use `#` until the end of the line for comments. This is also the convention in "shell scripts" and Python and a number of other languages.
  41
  42 In Scheme, one of the syntaxes for comments is that `;` until the end of the line is a comment.
  43
  44 In Haskell, there are two syntaxes for comments. One goes from `--` until the end of the line. The other starts with `{-` and continues until a matching `-}` is reached.
  45
  46 In OCaml, there are no until-the-end-of-the-line comments. The only comments start with `(*` and continue until a matching `*)` is reached.
  47
  48 I agree it's annoying that these conventions are so diverse. There are plenty other commenting conventions out there, too.
  49
  50
  51 ### Matching function values ###
  52
  53 A function value doesn't have any structure---at least none that's visible to the pattern-matching system. You can only match against simple patterns like `_` or the variable `f`.
  54
  55 When matching a &lambda;-generated function value against a variable in a `let`- or `letrec`-construction, there's an alternative syntax that you may find more convenient. This:
  56
  57 `let`
  58 &nbsp;&nbsp;`f match` &lambda;`x.` &phi;`;`
  59 &nbsp;&nbsp;`g match` &lambda;`(x, y).` &chi;
  60 `in ...`
  61
  62 can equivalently be written like this:
  63
  64 `let`
  65 &nbsp;&nbsp;`f x =` &phi;`;`
  66 &nbsp;&nbsp;`g (x, y) =` &chi;
  67 `in ...`
  68
  69 This is one of the few places where I'll indulge in the use of single `=`. Note that the left-hand sides of these bindings aren't function applications. They are a special syntax, that is interpreted exactly the same as the original expression displayed above.
  70
  71 This special syntax is only permitted in `let`- and `letrec`-constructions, not in `case`-constructions.
  72
  73
  74 <a id=guards></a>
  75 ### Pattern guards ###
  76
  77 In `case` contructions, it's sometimes useful to check not only whether a certain pattern matches, but also whether a certain boolean expression is true, typically where we want some variables in that expression to be bound by the relevant pattern. Thus, for example, if we wanted to count the number of odd numbers in a sequence, we could do this:
  78
  79     letrec
  80       count_odd xs = case xs of
  81                        [] then 0;
  82                        y & ys then case odd? y of
  83                                      'true then 1 + count_odd ys;
  84                                      'false then    count_odd ys
  85                                    end
  86                      end
  87     in count_odd
  88
  89 It's a bit cumbersome, though, to have the doubly-embedded `case`-constructions. We could simplify it a bit by replacing the inner `case` with and `if ... then ... else ...`, but that would still be a deeper structure than we might like. **Pattern guards** are an extra bit of syntax to make this easier to write. Using pattern guards, the previous expression could be written as:
  90
  91     letrec
  92       count_odd xs = case xs of
  93                        [] then 0;
  94                        y & ys when odd? y then 1 + count_odd ys;
  95                        _ & ys             then count_odd ys
  96                      end
  97     in count_odd
  98
  99 If we get to the `y & ys` line in the pattern list, and the pattern-match succeeds, then we check the guard expression `odd? y`, with `y` bound to whatever part of `xs` matched the corresponding part of the pattern `y & ys`. If that boolean expression is `'true`, then we continue to the right-hand side, after the `then`, just as usual. But if the boolean expression is `'false`, then we treat the whole line as a failed match, and proceed on to the next line in the binding list, if any.
 100
 101
 102 <a id=as-patterns></a>
 103 ### As-patterns ###
 104
 105 Sometimes it's useful to bind variables against overlapping parts of a structure. For instance, suppose I'm writing a pattern that is to be matched against multivalues like `([10, 20], 'true)`. And suppose I want to end up with `ys` bound to `[10, 20]`, `x` bound to `10`, and `xs` bound to `[20]`. Using the techniques introduced so far, I have two options. First, I could bind `ys` against `[10, 20]`, and then initiate a second pattern-match to break that up into `10` and `[20]`. Like this:
 106
 107     case ([10, 20], 'true) of
 108       (ys, _) then case ys of
 109                      x & xs then ...;
 110                      ...
 111                    end;
 112       ...
 113     end
 114
 115 Alternatively, I could directly bind `x` against `10` and `xs` against `[20]`. But then I would have to re-cons them together again to get `ys`. Like this:
 116
 117     case ([10, 20], 'true) of
 118       (x & xs, _) then let
 119                          ys match x & xs
 120                        in ...;
 121       ...
 122     end
 123
 124 Both of these strategies work. But they are a bit inefficient. I said you didn't really need to worry about efficiency in this seminar. But these are also a bit cumbersome to write. There's a special syntax that enables us to bind all three of `ys`, `x`, and `xs` in the desired way, despite the fact that they will be matching against overlapping, rather than discrete, parts of the value `[10, 20]`. The special syntax looks like this:
 125
 126     case ([10, 20], 'true) of
 127       ((x & xs) as ys, _) then ...
 128       ...
 129     end
 130
 131
 132 ### $ Syntax ###
 133
 134 Haskell has a useful bit of syntax that we will adopt. They use `$` as an infix operator that has the same kind of effect as Russell &amp; Whitehead's period. It is semantically inert, and only affects grouping. It enables you to avoid some parentheses in lots of situations. For example, if you want to check that a sequence `xs` is not empty, you'd express that like this:
 135
 136     not (empty? xs)
 137
 138 but using the `$` syntax, you could instead write:
 139
 140     not $ empty? xs
 141
 142 with the same meaning. This may look funny at first, but you'll get used to it quickly. In an expression like this:
 143
 144     (not $ empty? xs) or p
 145
 146 of course it's only `empty? xs` that gets negated. The `$`'s effect stops when we get to the closing `)`.
 147
 148 Another way to think of `$` is as not being semantically inert, but rather as being an infix operator that expresses *function application*. Normally function application is expressed just by concatenation, so that `f x` expresses the result of applying (the value of) `f` to (the value of) `x`. But `f $ x` expresses function application too. It's useful because it's parsed a bit differently than simple concatenation. This expression:
 149
 150     not empty? xs
 151
 152 is parsed as (the nonsensical):
 153
 154     (not empty?) xs
 155
 156 but:
 157
 158     not $ empty? xs
 159
 160 is parsed as:
 161
 162     not (empty? xs)
 163
 164
 165 ### Some common functions ###
 166
 167 Function composition, which mathematicians write as `f` &cir; `g`, is defined as &lambda; `x. f (g x)`. This notion is one you'll commonly be encountering in functional programming, so it's handy to have a short and clear notation for it. Haskell expresses this relation using a period, like this: `f . g`. But we are using the period for other purposes, as in our &lambda;-constructions; and even Haskell gets into some awkwardness because they use it in other ways too. Perhaps we could use a simple `o` as an infix composition operator? I'm not sure if that would be clear enough. For the time being, I'm electing to write this notion as `comp`, but as an infix expression, so we write: `f comp g` to mean &lambda; `x. f (g x)`. We may revisit this notational proposal later.
 168
 169 We've already come across the `id` function, namely &lambda; `x. x`.
 170
 171 Other common functions are `fst`, which takes two arguments and returns the first of them; `snd`, which takes two arguments and returns the second of them; and `swap`, which takes two arguments and returns them both but with their positions swapped. A fourth function is `dup`, which takes one argument and returns it twice.
 172 These functions can be defined like this:
 173
 174     let
 175       fst (x, y) = x;
 176       snd (x, y) = y;
 177       swap (x, y) = (y, x);
 178       dup x = (x, x)
 179     in (fst, snd, swap, dup)
 180
 181
 182 <a id=sections></a>
 183 ### Sections ###
 184
 185 OCaml and Haskell have a convenient bit of syntax for the common case where you want a function like this:
 186
 187     lambda x. 10 - x
 188
 189 or like this:
 190
 191     lambda x. x & ys
 192
 193 or like this:
 194
 195     lambda (x, y). x + y
 196
 197 They permit you to appreviate the first &lambda;-expression as simply `(10 - )`. We know there's an argument missing, because the infix operator `-` demands two arguments, but we've only supplied one. So `(10 - )` expresses a function that takes an argument `x` and evaluates to `10 - x`. In other words, it expresses &lambda;`x. 10 - x`. Similarly, `( & ys)` expresses a function that takes an argument `x` and evaluates to `x & ys`.
 198
 199 All of this only works with infix operators like `-`, `&` and `+`. You can't write `(1 swap)` or `(swap 1)` to mean &lambda;`x. swap (1, x)`.
 200
 201 Can you guess what our shortcut for the last function will be? It's `( + )`. That
 202 expresses a function that takes two arguments `(x, y)` and evaluates to `x + y`.
 203
 204 Wait a second, you say. Isn't that just what `+` does *already*? Why am I making a distinction between `+` and `( + )`? The difference is that bare `+` without any parentheses is an *infix* operator that comes between its arguments. Whereas when we wrap it with parentheses, it loses its special infix syntax and then just behaves like a plain variable denoting a function, like `swap`. Thus whereas we write:
 205
 206     x + y
 207
 208 if we want to use `( + )`, we have to instead write:
 209
 210     ( + ) (x, y)
 211
 212 It may not be obvious now why this would ever be useful, but sometimes it will be.
 213
 214 All of these shorthands `(10 - )`, `( & ys)` and `( + )` are called "sections". I don't know exactly why.
 215
 216 Confession: actually, what I described here diverges *a bit* from how OCaml and Haskell treat `( + )`. They wouldn't really write `( + ) (x, y)` like I did. Instead they'd write `( + ) x y`. We will look at the difference between these next week.
 217
 218