----- Forwarded message from Jim Pryor ----- Some notes on notation Probably we won't want to explicitly give this to them until week 2 (or later), but I jotted it down now so that we might agree about it and stick to it. These languages have VALUES of various TYPES. For instance, the number 2 is a value of the type integer. The string "horse" is a value of the type string. Functions are also values. Their type depends on the values of their domain and range. For instance, the successor function is a value of the type integer -> integer. The plus function is a value of the type -> integer. I'm fudging a bit here with the ; we'll come back to this and talk about the fudge soon. We can talk equally of expressions having types: they have the types of the values they express. (Sometimes the "untyped" lambda calculus is said to be "singly-typed". It may nor may not be helpful to think of it that way. If its expressions do have a type, though, it's a type T such that functions from T -> T are also of type T. Myself I find it easier to think "untyped.") In some of the languages, one refers to (and sometimes quantifies over) the types in the object language itself; so the object language itself has syntax for talking about types. We'll be discussing this later in the course. For now I'm just saying what notation will eventually be so used. You need to keep straight about the difference between Constants of a given sort, Object-language Variables of the same sort, and Metalinguistic Schemas of that sort. I say "sort" here because you have this same three-way divide with BOTH expressions that express values and expressions that express types. Sticking just to values, here are some examples of constants: 2, 3, +, *, "horse", and so on. Here are some examples of object-language variables: x, y10, first, second_place, and so on (you needn't stick to a single letter). x', x'', and so on are also typically used as variables. All the languages we'll be looking at will permit this syntax. Conventions for metalinguistic schemas vary. In logic textbooks, one often sees lower-case Greek letters used in this way. But in the fields we'll be looking at, those are often used specifically as schematic letters for TYPES. For schematic letters for VALUES, we'll go with the convention that uses Capital Letters. For example: E is some expression of the lambda calculus. E may be "x" or "lambda x: x" or "(lambda x: x y) (lambda z: w)" or whatever. (M N) is another expression of the lambda calculus, where some possibly complex expression M is applied to another possibly complex expression N. and so on. More systematically: value variables in OCaml: start with lower case letter, can be followed by any letter or number or _ or ' value variables in lambda calculus: no fixed rules, but we'll use the OCaml conventions, which is what you commonly see in practice. value variables in Scheme: all the OCaml variables are permitted. Some Schemes are case-sensitive, others aren't, others let you flip a switch to decide. To avoid worrying about it, never use "firstPlace" as a variable if you also use "firstplace" as a variable. Scheme also permits very funky variable names, like "*add*" and "are-there-three-yet?" and so on. Remember, value variables can be bound to values of various types, including functions. value constants in lambda calculus: some lambda calculi have primitive values in them, such as the number 0. But the mainstream ones don't have any primitive values of any sort. All you've got are variables, lambda abstractions of other stuff you've got, and applications of other stuff you've got. If you want numbers or boolean constants, you've got to "encode" them or build them yourself out of the materials just described. We'll be looking at how that's done. in Scheme: simple values: 2, 3; #t, #f; #\h is a character (the eighth letter in the Latin alphabet) function values: +, * in Scheme, like in the lambda calculus, functions always go to the left: so you write (+ 2 3) not 2 + 3. compound non-function values: "h" "horse" and "" are three strings of different length (the string "h" is built from the character #\h but isn't identical to it; these values have different types) '(2 . 3) or (cons 2 3) is an ordered pair '(2 3) or (list 2 3) is a list. Other lists (with different lengths) are (list), (list 2), (list 2 3 4). OCaml: pretty much like Scheme, except that: the ordered pair is written (2, 3) the list is written [2; 3] the eighth character in the Latin alphabet is written 'h' the string consisting just of that character is written "h", as in Scheme true and false are written like that most functions are written leftmost, as in Scheme and the lambda calculus. But a few familiar binary operators can be written infix: so you can write 2 + 3. You can also write it function-leftmost, like so: ( + ) 2 3 Here the parentheses are needed so that the OCaml parser can figure out what you're doing. Aside on parentheses: Parentheses are often optional in lambda caluli and also in OCaml. If M, N, and P are syntactically simple expressions, you can write ((M N) P), but you don't have to: you can also just write M N P. These mean exactly the same thing. To write M (N P) you do need the parentheses. In Scheme, EVERY parenthesis is meaningful and they are never optional. Typically parentheses mean to "apply this function to these arguments". So to add 2 and 3, you write (+ 2 3). In Scheme, this is different from ((+ 2) 3) --- we'll talk about this later. For the purposes of our course, it would be more convenient if Scheme behaved more like lambda calculi and OCaml here. However, parentheses sometimes also do other duty in Scheme: if you print out the list (list 2 3) (also writable '(2 3)), the interpreter will display: (2 3). And the pair (cons 2 3) will display: (2 . 3). The parentheses here don't mean that 2 is being used as an argument. However, if you asked Scheme to "evaluate" one of these, it would think you were using 2 as a function. There are good reasons why parentheses are used in "both" of these ways in Scheme. We'll leave them for you to discover and read about. Another example of parentheses in Scheme: most of the parentheses in (let ((x 2)) (+ x 1)) don't play the role of applying functions to arguments. Back to notation... We've described value variables and value constants in all of these languages. As we said, we'll follow the convention of using capital Latin letters as metalinguistic schemas for value expressions in each of the languages. Typically these schemas can stand for expressions of any syntactic complexity. When we want them to stand for syntactically simple expressions, we'll say so. When we get to types, we'll use small Greek letters as metalinguistic schemas for type expressions (of any syntactic complexity, unless we say otherwise). There are no type expressions or variables in standard Scheme. Type expressions in OCaml---whether type variables or constants representing predefined types---are usually written with lower-case expressions (following the same rules as object-language variables). So "stack" in OCaml could designate a type or a value; it depends on the linguistic context. OCaml (and all languages of its ML family) does have a special rule for type variables bound by quantifiers. Here you use 'x, 'stack, and so on. It looks like the prefix ' used in Scheme, but it's used very differently. Also, don't confuse the type variable 'x with the value variable x'. Aside: What's the difference between the type variable stack and the type variable 'stack? You use stack when you're binding it to a specific type value. So you might say: type stack = HERE SOME COMPLEX EXPRESSION THAT EXPRESSES A TYPE On the other hand, you use 'stack when you're only binding the variable with a quantifier: when you're saying such-and-such holds for EVERY type 'stack... It would be good for us to fix on conventions for object-language type constants and variables in the lambda calculi (and to use something other than small Greek letters, because they are metalanguage schemas). I don't know whether there are uniform conventions about this. ----- End forwarded message ----- ----- Forwarded message from Jim Pryor ----- 1. Introducing and pounding on difference between declarative and imperatival computation. 2. Some geography Haskell Scheme (functional parts) OCaml (functional parts) | | | | | | Untyped lambda calculus | | | Combinatorial logic | | | ============ Turing complete (briefly explain)======================= | more advanced type systems | e.g. polymorphic types (System F) dependent types simply typed lambda calculus what linguists standardly use Imagine in third column to the right of the others: Typical languages like C, Python Imperatival parts of Scheme, OCaml 3. How the same thing looks in each of these: in untyped lambda: add three two (if those are predefined, e.g. three defined as \sz.s(s(sz)) in combinatorial logic: too long, pieces look like: S(BKI)K in Scheme: (let* [(three 3) (two 2)] (+ three two)) in OCaml: let three = 3 in let two = 2 in three + 2;; in C: int three = 3; int two = 2; three + two; Mutation in C: int x = 0; x = 1; in OCaml: let x = ref 0 in x := 1;; in Scheme: (let* [(x (box 0))] (set-box! x 1)) Pound more on how this is different, and we won't be looking at until a lot of groundwork has been laid. 4. Question: how can mutation be something fundamentally new if some of these other languages without it are Turing complete? How are they able to encode/represent the same computations, without using mutation? (And in fact, Turing completeness is more than is needed for this.) Say a few words to sketch / reserve a place for an answer that they should be able to appreciate at the end of the course. 5. Commonly people think of programming/computation as a sequence of changes. But in the languages we'll first be looking at, there are no changes. And moreover, there isn't at the beginning a very useful notion of sequencing, either. You can write sequences: in Scheme: (begin M N P Q R) in OCaml: begin M; N; P; Q; R end or: (M; N; P; Q; R) however, these just have the value of R. So if none of the earlier expressions have any side-effects / make any changes, a sequence ending in R means just the same as R. Aside: I'm fudging a bit here: there may also be a difference if some of the earlier expressions in the sequence are incomputable, and so have what we'll later be calling the "bottom" value. More on this when we talk about evaluation order in a few classes. Another difference is that sequencing is explicit about what order the expressions get evaluated in, and this can affect computational efficiency. But that's not about their semantics.) There's a bit of syntactic shorthand, which I'll introduce in a bit, that LOOKS LIKE a kind of sequencing. However, once you understand what it's shorthand for, you'll see that fundamentally, there's no sequencing going on. 6. Instead of sequencing, we just have the evaluation/simplification/"reducing" of single, often syntactically complex, expressions. Later, when we introduce changes and sequencing, we'll do so in a way that makes that a new piece of syntactic structure. More "how to say the same thing in different languages" 1. Anonymous functions Lambda: \x.B for example, \x. times x five here times and five are free variables, we suppose they're bound to appropriate values by the surrounding linguistic context Scheme: (lambda (x) B) for example, (lambda (x) (* x 5)) -- perhaps aside about role of parens in Scheme should go here instead OCaml: fun x -> x * 5 or: fun x -> ( * ) x 5 2. Binding variables to values with "let" in Scheme: (let* [(x A) (y A')] B) explain that Scheme also has a simple "let", which sometimes has a different meaning from let*, and that they usually will want let*, so we'll stick to using that in our pedagogy explain that there's also a letrec, which does automatically something that we want them to learn how to do by hand so for the time being we'll avoid it in OCaml: let x = A in let y = A' in B for example: let x = 3 + 1 in ( * ) x 5 explain OCaml also has "let rec", which we'll also avoid for now 3. binding with "let" same as supplying as an argument to a function Scheme's (let* [(x A)] B) and OCaml's let x = A in B are the same as: in Lambda: (\x.B) A in Scheme: ((lambda (x) B) A) in OCaml: (fun x -> B) A 4. Functions can also be bound to variables in Scheme: (let* [(f (lambda (x) B))] ...) in OCaml: let f = fun x -> B in ... OCaml shorthand: let f x = B in ... 5. This in Scheme: (let* [(f (lambda (x) B))] (f A)) as we've said, means the same as: ((lambda (x) B) A) and also means the same as: (let* [(x A)] B) Similarly, this in OCaml: let f = fun x -> B in f A or in shorthand: let f x = B in f A as we've said, means the same as: (fun x -> B) A and also means the same as: let x = A in B 6. What if you want to do something like this: (let* [(x A)] ...rest of the file or interactive session...) let x = A in ...rest of the file or interactive session... You can override these bindings: (let* [(x A)] (let* [(x A')] B)) is the same as: (let* [(x A')] B) The closest/innermost binding of x "shadows" or overrides more outer bindings. Scheme and OCaml have shorthands for wrapping a binding around the rest of the file or interactive session. In Scheme it's written like this: (define x A) ...rest of file or session... In OCaml it's written like this: let x = A;; ...rest of file or session... This is the construction that I said misleadingly looked like sequencing. Fundamentally, though, it's just a syntactic shorthand for complex let-expressions that don't involve any sequencing. 7. One more bit of shorthand, which is very often used. This in Scheme: (define f (lambda (x) B)) can be written like this: (define (f x) B) Similarly, this in OCaml: let f = fun x -> B;; can be written like this: let f x = B;; (Compare how "let f = fun x -> B in ..." can be written "let f x = B in ...") Subject: Concatanation (!= sequencing) When we concatenate expressions in these languages, we'll usually be treating the leftmost expression as a function, and the rest of the expressions as arguments to it. M N P in lambda and OCaml treat M as as the function, and N as its first argument. M N should evaluate to another function, which takes P as its first argument. So this is equivalent to: (M N) P If you want M (N P) instead, you need to explicitly use parentheses. Explain currying: add three two = (add three) two In Scheme, on the other hand, in the first place you always need to include parentheses to force a function to be applied to arguments: (M N P) Moreover, there's a difference between ((M N) P) and (M N P). And each of those are different from ((M) N P). Scheme differentiates between functions that take one argument, those that take zero arguments (these are called "thunks"), and those that take more than one argument. This is awkward for our pedagogy, especially when you play with lambda calculus inside Scheme. We'll try to do some special tricks that will let ((M N) P) and (M N P) come out in Scheme when you'd expect them to. We'll tell you about this later. Subject: Constants / predefined variables In Scheme we have: #t and #f (Scheme uses # to start lots of special values, e.g. #\h) 2, 3 +, * Syntax for building compound values: '(2 . 3) or (cons 2 3) is a pair; '(2 3) or (list 2 3) is a list. Other lists: '() or (list) In OCaml we have true, false 2, 3 +, * in OCaml most functions are written to the left of their arguments, as in lambda and Scheme. But common arithmetic operations are infix unless they're enclosed in parens. So you'd write: 2 + 3 or: ( + ) 2 3 (Note: if you do this, best to always write ( + ) not (+). Sometimes it doesn't matter but sometimes it does because of a language design issue.) pairs: (2, 3) lists: [2; 3] others: [] In lambda, we have: Nothing! Some lambda calculi do have constants, such as numbers. But mainstream ones don't. If you want boolean values, or numbers, or ordered pairs, or anything, you'll have to make them by hand out of the raw stuff of the lambda calculus. How do we build our own? We'll start with boolean values Next we'll do pairs, triples, and so on = possibly type-heterog collections, of a specific length >= 2 Next class we'll do numbers We'll also do lists = ordered type-homogeneous collections permitting repetition, of arbitrary length >= 0 Building multisets = unordered type-homog collections permitting rep sets = unordered type-homog collections without rep is harder. In these languages, functions are quite basic, ordered pairs are harder, sets harder yet. Inverts the usual conceptual order. Booleans if B is a boolean value (simple or syntactically complex), and IFTRUE and IFFALSE are any two other, arbitrary expressions, we want: B IFTRUE IFFALSE to evaluate either to IFTRUE or to IFFALSE (depending on which boolean value B is). For instance: equal? two three X Y should evaluate to Y. The boolean constant true needs to be an expression such that: true X Y always evaluate to X. How to do that? Easy: define true to be: \x \y. x Or more mnemonically: \t \f. t Ordered pairs If p is an ordered pair, we want it to accept a function of the form (\x y. B) which does something with its component values. For example: p (\x y. x) should return the first member of p. p (\x y. add x y) should return the sum of the members of p (assuming an appropriate binding for add). Note: `p (\x y. add x y)` could also be written as `p add` How to do this? Easy: define p to be (\f. f first_value second_value) then p (\x y. B) is (\f. f first_value second_value) (\x y. B) is (\x y. B) first_value second_value Some comments here about CPS. ----- End forwarded message ----- ----- Forwarded message from Jim Pryor ----- Date: Sun, 12 Sep 2010 20:13:46 -0400 From: Jim Pryor To: Chris Barker Subject: Other... I've got some other notes pounding on the declarative / imperatival difference, that I'd want to work into discussion preceding everything I sent you. Other ideas I had for homework were just doing some beta-reductions and so on. I can look for some Scheme exercises in Little Schemer or another tutorial we've linked to. I'm sorry, I haven't really come up with much more specific yet here. Miscellany: I've got a pet peeve about the standard presentation of the grammar for lambda calculus as this simple, three rule system. Oh and then by the way here's also this unofficial shorthand. It's fine to give them that, but I also want to mention you can just give a mildly more complex grammar for what people really write. And we can point them to that. I worked it out myself, quickly, and could have missed some cases, but I think I have it. If you know of another statement of this, we could cross-check it sometime. I want to de-emphasize the role of alpha-conversion. It's perfectly reasonable to regard \x.x and \y.y as syntactically identical. I think Hankin has discussion of three ways to think about this, only one of which makes alpha-conversion any substantive transformation. I don't mind mentioning eta-reduction, but I want to signal that it makes a difference to the calculus whether that's permitted, say that we'll be discussing it some later, and for now they shouldn't think much about eta-reduction. Pedagogically I'd like to suppress eta-reduction by default, except when we explicitly allow it. ----- Forwarded message from Chris Barker -----