From a784c75fd837416be62b94ea7a91b87eb8ab3ef3 Mon Sep 17 00:00:00 2001
From: jim
Date: Thu, 5 Feb 2015 12:54:17 0500
Subject: [PATCH] next chunk of content

rosetta1.mdwn  447 +++++++++++++++++++++++++++++++++++++++++++++++++++++++
1 file changed, 425 insertions(+), 22 deletions()
diff git a/rosetta1.mdwn b/rosetta1.mdwn
index 5e9c0528..4576a9b6 100644
 a/rosetta1.mdwn
+++ b/rosetta1.mdwn
@@ 39,9 +39,13 @@ These block comments also nest. Another form of block comments is `#;( ... )`. T
### Variable names
+### Variables
% Haskell's division into letterbased vs operators. Each can be "flagged" to temporarily behave as though it belonged to the other syntactic category (see below).
+Our [[syntax for variablestopics/week1#variables]] in Kapulet is close to that in the other languages. Haskell and OCaml differ only in that they do not permit trailing `?` or `!`; however, they do permit trailing `'`s (and even permit `'`s *in the middle* of a variable too, which Kapulet does not). Scheme permits all of these characters, plus many more punctuation symbols as well, to occur anywhere in a variable. Scheme also permits variables to begin with capital letters, or to consist solely of the single character `_`; but the other languages reserve these terms for special purposes.
+
+In addition to the variables made of letters (more properly, of alphanumerics), Haskell and OCaml and Kapulet also permit some variables made exclusively of punctuation symbols, like `<` or Haskell's `>=>` and `<$>`. In Haskell, these always have infix syntax, and the variables made of letters never do. (But the former can have their infix syntax suppressed with parentheses, and the latter can be "flagged" to temporarily take on infix syntax, as we'll discuss below.)
+
+In OCaml and Kapulet, some variables made of letters also have infix syntax, such as `comp` in Kapulet or `mod` in OCaml. I haven't presented to you the complex mechanisms needed to declare this.
@@ 56,9 +60,9 @@ not:
+ 1 2
Although all three of these languages permits you to enclose an infix operator in parentheses to make a *section*, which no longer has infix syntax. In Kapulet, `( + )` is the same as λ `(x, y). x + y`, whereas in OCaml and Haskell it's a *curried* function, which we can write (in Kapulet syntax) as λ `x y. x + y`.
+Although all three of these languages permits you to enclose an infix operator in parentheses to make a *section*, which no longer has infix syntax. In Kapulet, `( + )` is the same as λ `(x, y). x + y`, whereas in OCaml and Haskell it's a *curried* function, which we can write (in Kapulet syntax) as λ `x y. x + y`. We'll discuss sections and curried functions below.
Kapulet and OCaml have some operators spelled with letters also taking infix syntax, such as `comp` in Kapulet or `mod` in OCaml. In Haskell, this is never the case: variables that are made of letters are only treated as function terms being applied to arguments *when they're at the start* of a list of expressions; and variables that are made of punctuation symbols, and not enclosed in parentheses, will only be treated as infix operators. However, Haskell permits you to temporarily "flag" a lettermade function term to behave like an infix operator, by enclosing it in `` ` `` marks. Thus in Haskell you can write:
+Kapulet and OCaml have some operators spelled with letters also taking infix syntax, such as `comp` in Kapulet or `mod` in OCaml. In Haskell, this is never the case: variables that are made of letters are only treated as function terms being applied to arguments *when they're at the start* of a list of expressions; and variables that are made of punctuation symbols, and not enclosed in parentheses, will only be treated as infix operators. However, Haskell permits you to temporarily "flag" a function term made of letters to behave like an infix operator, by enclosing it in `` ` `` marks. Thus in Haskell you can write:
3 `mod` 2
@@ 72,13 +76,16 @@ and the like. Moreover, in Scheme parentheses are never optional and never redun
((+) 3 2)
what that would mean is that `+` is first being applied to *zero* arguments, which is different from not applying it all. (In Kapulet, OCaml, and Haskell, one would write that `f` is being applied to "zero arguments" like this: `f ()`.) Scheme helpfully defines the result of applying `+` to zero arguments to be `0`. So `((+) 3 2)` would evaluate to whatever `(0 3 2)` does, and that's an error, because `0` is not a function.
+what that would mean is that `+` is first being applied to *zero* arguments, which is different from not applying it all. (In Kapulet, OCaml, and Haskell, one would write that `f` is being applied to "zero arguments" like this: `f ()`. FIXME) Scheme helpfully defines the result of applying `+` to zero arguments to be `0`. So `((+) 3 2)` would evaluate to whatever `(0 3 2)` does, and that's an error, because `0` is not a function.
Note that `(0 3 2)`, although it *is*, qua expression, a list of numbers, does not evaluate to a list. To get an expression that *evaluates to* that list, you'd have to use `(list 0 3 2)` or `'(0 3 2)`. (Notice the initial `'`.)
+Note that `(0 3 2)`, although it *is*, qua expression, a list of numbers, does not evaluate to a list. To get an expression that *evaluates to* that list, you'd have to use `(list 0 3 2)` or `'(0 3 2)`. (Notice the initial `'`.) FIXME
In Scheme, you can also write `(+ 3 2 10)`, and so on. You only have to write `(+ (+ 3 2) 10)` if you really want to.
% Parentheses have other roles in Scheme, too.
+Parentheses have many other roles in Scheme, as well; they're a ubiquitous part of the syntax, and don't always express function application. You might sometimes feel they are overused.
+
+You may sometimes see `[ ... ]` being used instead of `( ... )`. This is just a stylistic variant, they work exactly the same. The official Scheme standard doesn't permit that, but most Scheme implementations do. It can help keep track of which closing `]` or `)` goes with which opening `[` or `)`. The opening and closing symbols always have to correspond.
+
In Scheme, the default style for defining functions is as taking several arguments simultaneously, that is the *uncurried* style. In OCaml and Haskell, the default style is to define them *curried*. Curried functions can easily be partially applied:
@@ 112,12 +119,13 @@ Here the last displayed line will fail, because `add` expects as its argument a
Kapulet essentially works like OCaml and Haskell; though for pedagogical reasons I started out by introducing uncurried definitions, rather than the *curried* definitions those other languages predominantly use.
As we mentioned, in Kapulet, OCaml, and Haskell, there is a shorthand that enables you to write things like:
+[[As we mentionedtopics/week1_advanced_notes#sections]], in Kapulet, OCaml, and Haskell, there is a shorthand that enables you to write things like:
# Kapulet
let
ten_minus match lambda x. 10  x;
 and_ys match lambda x. x & ys
+ and_ys match lambda x. x & ys;
+ plus match lambda (x, y). x + y
in (ten_minus, and_ys)
like this:
@@ 125,7 +133,8 @@ like this:
# Kapulet
let
ten_minus match (10  );
 and_ys match ( & ys)
+ and_ys match ( & ys);
+ plus match ( + )
in (ten_minus, and_ys)
There are just minor differences between these languages. First, OCaml doesn't have the `( + 10)` or `(10 + )` forms, but only the `( + )`. Second, as a special case, OCaml doesn't permit you to do this with its listconsing operator `::`. You have to write `fun x xs > x :: xs`, not `( :: )`. Whereas in Kapulet `( & )`, `(x & )`, and `( & xs)` are all sections using its sequenceconsing operator `&`; and in Haskell, `( : )`, `(x : )`, and `( : xs)` are the same.
@@ 156,13 +165,23 @@ I know all these languages fairly well, and I still find this last issue difficu
The relation that's written `==` in Kapulet is also written that way in Haskell. That symbol means something else in OCaml, having to do with mutable reference cells; to get the same notion in OCaml one writes just a single `=`. The negation of this notion is written `!=` in Kapulet, `/=` in Haskell, and `<>` in OCaml. (Again, `!=` means something else in OCaml.)
The relations that are written `and`, `or`, and `not` are written in Haskell and OCaml as `&&`, ``, and `not`. (Haskell uses `and` and `or` to express functions that form the conjunction or disjunction of every `Bool` value in a List of such. OCaml permits `or` as an old synonym for ``, but discourages using that spelling. OCaml also permits `&` as an old, discouraged synonym for `&&`.)
+The relations that are written `and`, `or`, and `not` in Kapulet are written the same way in Scheme. Note that in Scheme the first two can take zero or more arguments:
+
+ ; Scheme
+ (and)
+ (and bool1)
+ (and bool1 bool2)
+ (and bool1 bool2 bool3)
+
+As you'd expect `(and bool1)` evaluates the same as plain `bool1`; similarly with `(or bool1)`. What do you think `(and)` with no arguments should evaluate to? How about `(or)`?
Scheme %%
+These relations are written in Haskell and OCaml as `&&`, ``, and `not`. (Haskell uses `and` and `or` to express functions that form the conjunction or disjunction of every `Bool` value in a List of such. OCaml permits `or` as an old synonym for ``, but discourages using that spelling. OCaml also permits `&` as an old, discouraged synonym for `&&`.)
The values that are written `'true` and `'false` in Kapulet are written in Haskell as `True` and `False`, and in OCaml as just `true` and `false`. They're written `#t` and `#f` in Scheme, but in Scheme in many contexts any value that isn't `#f` will behave as though it were `#t`, even values you might think are more "falseish", like `0` and the empty list. Thus `(if 0 'yes 'no)` will evaluate to `'yes`.
Scheme recognizes the values `'true` and `'false`, but it treats `'false` as distinct from `#f`, and thus as a "trueish" value, like all of its other values that aren't `#f`.
+Some Scheme implementations, such as Racket, permit `#true` and `#false` as synonyms for `#t` and `#f`.
+
+Scheme also recognizes the values `'true` and `'false`, but it treats `'false` as distinct from `#f`, and thus as a "trueish" value, like all of its other values that aren't `#f`. Kapulet essentially took Scheme's `boolean` values and collapsed them into being a subtype of its `symbol` values. FIXME also with chars.
@@ 216,8 +235,8 @@ Here are some list functions in Kapulet:
dropwhile
reverse
# new functions
 concat # converts [[10, 20], [30], [], [40, 50]]
 # to [10, 20, 30, 40, 50] (only merging a single layer of []s)
+ join # converts [[10, 20], [30], [], [40, 50]]
+ # to [10, 20, 30, 40, 50] (but only "joining" a single layer of []s)
(mem) # infix syntax, 2 mem [1, 2, 3] == 'true
nth # nth [10, 20, 30] 1 == 20, because the first element
# is at position 0; fails if index is out of bounds
@@ 241,11 +260,12 @@ Here are the corresponding functions in Haskell:
map
zipWith { zip handles the special case where f is the function that forms ordered pairs
both zipWith and zip stop with the shortest list }
 unzip  doesn't take an f argument, assumes (\(x, y) > (x, y))
+ unzip { unlike unmap2, doesn't take an explicit f argument
+ just assumes it's (\(x, y) > (x, y)) }
takeWhile
dropWhile
reverse
 concat
+ concat  corresponding to join
elem  not infix syntax, but often written as: 2 `elem` [1, 2, 3]
(!!)  infix syntax: [10, 20, 30] !! 1 == 20; fails if index is out of bounds
all p xs
@@ 269,14 +289,14 @@ Here they are in OCaml:
List.split (* like Haskell's unzip, doesn't take an f argument *)
(* no function corresponding to takewhile or dropwhile *)
List.rev
 List.concat (* also List.flatten, which still only merges a single layer of []s *)
+ List.concat (* also List.flatten, which still only "joins" a single layer of []s *)
List.mem (* not infix syntax *)
List.nth (* List.nth [10; 20; 30] 1 = 20; fails if index is out of bounds *)
List.for_all p xs
List.exists p xs
How does all this look in Scheme? Well, Scheme has a notion they call a (proper) `list`, and also a notion they call a `vector`. There are also what Scheme calls "improper" `list`s, with `(cons 1 'nonlist)` or `'(1 . nonlist)`, where `'nonlist` is any nonlist (here it's a `symbol`) being a special case. Let's ignore the improper `list`s. Scheme's (proper) `list`s and `vector`s each has a claim to correspond to Kapulet's sequences, Haskell's Lists, and OCaml's `list`s. Each is also somewhat different. The dominant differences are:
+How does all this look in Scheme? Well, Scheme has a notion they call a (proper) `list`, and also a notion they call a `vector`. There are also what Scheme calls "improper" `list`s, with `(cons 1 'nonlist)` or `'(1 . nonlist)`, where `'nonlist` is any nonlist (here it's a `symbol`) being a limiting case. Let's ignore the improper `list`s. Scheme's (proper) `list`s and `vector`s each has a claim to correspond to Kapulet's sequences, Haskell's Lists, and OCaml's `list`s. Each is also somewhat different. The dominant differences are:
1. these structures in Scheme can contain heterogenouslytyped elements, including further `list`s and `vector`s in some positions but not in others
2. in the official Scheme standard, `list`s and `vector`s are both *mutable* containers, that is, one and the same persisting `list` structure can have different
@@ 320,7 +340,7 @@ Here are the `list` functions in Scheme corresponding to the functions listed in
; can take one or more list arguments
; no official function corresponding to unmap2 or takewhile or dropwhile
reverse
 ; no official function corresponding to concat
+ ; no official function corresponding to join/concat
member ; corresponds to Kapulet's (mem) and Haskell's elem
(listref xs k) ; corresponds to Kapulet's `nth xs k`
; no official function corresponding to all or any
@@ 329,9 +349,382 @@ All of the functions listed as missing from the official Scheme standard can be
+### Patternmatching
+
+The complex expression that's written like this in Kapulet:
+
+ # Kapulet
+ case some_expression of
+ 0 then result0;
+ 1 then result1;
+ x then resultx
+ end
+
+is written very similarly in Haskell:
+
+  Haskell
+ case some_expression {
+ 0 > result0;
+ 1 > result1;
+ x > resultx
+ }
+
+Unlike the other languages we're discussing, Haskell pays special attention to the whitespace/indentation of what you write. This permits you to omit the `{`, `;`, and `}`s in the above, if you've got the indentation right. And that's how you will often see Haskell code displayed. On this website, though, I propose to always include the `{`s and so on when displaying Haskell code, because the indentation rules aren't 100% intuitive. It's easy to read properlyindented Haskell code, but until you've learned and practiced the specific rules, it's not always easy to write it.
+
+This is written only a little bit differently in OCaml:
+
+ (* OCaml *)
+ match some_expression with
+ 0 > result0 
+ 1 > result1 
+ x > resultx
+
+Note there is no closing `end` or `}`. You can enclose the whole expression in parentheses if you want to, and when embedding it in some larger expressions (like another `match` expression), you may need to. Sometimes the `` dividers are written at the start of a line, and you are allowed to include an extra one before the first line, so you could also see this written as:
+
+ (* OCaml *)
+ match some_expression with
+  0 > result0
+  1 > result1
+  x > resultx
+
+The syntax for [[guardstopics/week1_advanced_notes#guards]] and [[aspatternstopics/week1_advanced_notes#aspatterns]] also only varies slightly between these languages:
+
+ # Kapulet
+ case some_expression of
+ pat1 when guard then result1;
+ pat1 when different_guard then result2;
+ ((complex_pat) as var, pat4) then result3
+ end
+
+  Haskell
+ case some_expression {
+ pat1  guard > result1;
+  different_guard > result2;
+ (var@(complex_pat), pat4) > result3
+ }
+
+ (* OCaml *)
+ match some_expression with
+ pat1 when guard > result0 
+ pat1 when different_guard > result1 
+ ((complex_pat) as var, pat4 > result3
+
+
+The official Scheme standard only provides for a limited version of this. There is a `case` construction, available since at least "version 5" of the Scheme standard (r5rs), but it only accepts literal values as patterns, not any complex patterns containing them or any patterns containing variables. Here is how it looks:
+
+ ; Scheme
+ (case some_expression
+ ((0) 'result0)
+ ((1) 'result1)
+ ((2 3 5) 'smallprime)
+ (else 'toobig))
+
+The results can be complex expressions; I just used bare symbols here for illustration. Note that the literal patterns in the first two clauses are surrounded by an extra pair of parentheses than you might expect. The reason is shown in the third clause, which begins `(2 3 5)`. This does not mean to match a list containing the values `2` `3` and `5`. Instead it means to match the simple value `2` *or* the simple value `3` *or* the simple value `5`. The final `else` clause is optional.
+
+The patterns here can be any literal value (what the Scheme standards call a "datum"). Numbers are permitted, as are boolean literals (`#t` and `#f`) and symbolic atoms (`'alpha` and the like, though inside a pattern position in a `case`expression, you omit the initial `'`). You can also use the list literal `'()` (again, omit the initial `'` when writing it as a pattern). Some implementations of Scheme allow more complex list patterns, matching literal lists like `'(alpha 0 () #t)`; others don't.
+
+There are various addon libraries to Scheme that will permit you to patternmatch in more ambitious ways, approximating what you can do in Kapulet, OCaml, and Haskell. We will explain some of these later, after we've introduced you to the notion of *datatypes*.
+
+What programmers using standard Scheme tend to do instead is to use *predicates* that query the type and/or structure of an unknown value, and then take separate evaluation paths depending on the result. This can be done with an `if...then...else...` construction, or with Scheme's more general `cond` construction. In Scheme, these two are equivalent:
+
+ ; Scheme
+ (if test1 'result1
+ (if test2 'result2
+ (if test3 'result3 'somethingelse)))
+
+ (cond
+ (test1 'result1)
+ (test2 'result2)
+ (test3 'result3)
+ (else 'somethingelse))
+
+The tests tend to use predicates like `null?` (are you the empty list?), `pair?` (are you a nonempty list, whether proper or improper?), `list?` (are you a proper list, whether empty or not?), `symbol?`, `boolean?`, `number?`, `zero?` (you get the idea). But you can also use more complex tests you write on the spot, or use antecedentlydefined functions:
+
+ ; Scheme...in case the parens left any doubt
+ (define smallprime? (lambda (x) (if (= x 2) #t (if (= x 3) #t (if (= x 5) #t #f)))))
+
+ (cond
+ ((= x 0) 'infant)
+ ((smallprime? x) 'myfavorite)
+ ((and (> x 10) (< x 20)) 'teenaged)
+ (else 'unknown))
+
+Remember that in Scheme, an expression doesn't have to evaluate to `#t` to be treated as "truthlike". *Every* value other than `#f` is treated as truthlike. So `(if 0 'zero 'nope)` evaluates to `'zero`.
+
+You may sometimes see Scheme `cond` constructions written with this kind of clause:
+
+ (cond
+ ...
+ (testexpression => functionvalue)
+ ...)
+
+That's the same as the following:
+
+ (cond
+ ...
+ (testexpression (functionvalue testexpression))
+ ...)
+
+Except that it only evaluates the testexpression once.
+
+The clauses in Scheme's `cond` expressions can contain *multiple* expressions after the test. This only becomes useful when you're working with mutable values and sideeffects, which we've not gotten to yet. The `if` expressions only take a single expression for the "then" branch and a single expression for the "else" branch. You can turn a complex series of expressions, which may involve sideeffects, into a single expression by wrapping it in a `(begin ...)` construction. The `(begin ...)` construction as a whole evaluates to whatever the last expression it contains does.
+
+Scheme standards after r5rs also provide two further conditional constructions, which are for the situations where you want to perform a meaningful action only on the "then" branch, or only on the "else" branch:
+
+ (when testexpression
+ resultexpression1...)
+
+ (unless testexpression
+ resultexpression2...)
+
+If the testexpression evaluates to `#f`, then the `when` expression evaluates to a special "void" value; mutatis mutandis for the `unless` expression. This is analogous to `()` in OCaml, Haskell, and Kapulet.
+
+In the last three languages, the expressions in the thenbranch and the elsebranch of a conditional have to have the same type. You can't say `if testexpression then 0 else []`. Also, they expect the testexpression to evaluate specifically to a boolean value, not merely to `'false` or *anything else*. They are stricter about types here than Scheme is.
+
+In the special case where both a thenbranch and an elsebranch evaluate to `()`, and the elsebranch involves no complex expression but merely the literal `()`, then OCaml permits you to omit the elsebranch. So in OCaml you can write this:
+
+ if testexpression then thenresult
+
+instead of
+
+ if testexpression then thenresult else ()
+
+This is similar to Scheme's `when`expression. Kapulet and Haskell have no analogue.
+
+
+
+### Lambda expressions
+
+In Kapulet you write λexpressions (sometimes called "anonymous functions") with a prefix of either λ or the spelledout `lambda`. That's followed by one or more patterns, separated by spaces, then a period, then a single expression which makes up the body of the function. When there are multiple patterns, the function expressed is *curried*, thus:
+
+ lambda (x, y) z. result
+
+means the same as:
+
+ lambda (x, y). (lambda z. result)
+
+The parentheses could have been omitted around `lambda z. result`; they're just there to focus your attention.
+
+Haskell and OCaml are very similar to this, they just use some slightly different notation. In Haskell you'd write:
+
+  Haskell
+ \(x, y) z > result
+
+and in OCaml you'd write:
+
+ (* OCaml *)
+ fun (x, y) z > result
+
+You may sometimes see λexpressions in OCaml written using `function` instead of `fun`. These overlap somewhat in their usage. The difference is that `function` only allocates a position for *one* argument pattern, so can't define curried functions. On the other hand, `function` can take multiple *variant* patterns for that single position. Thus with `function` you can say:
+
+ (* OCaml *)
+ function [] > result1 
+ x::xs > result2
+
+whereas with `fun` you'd have to write:
+
+ (* OCaml *)
+ fun ys > match ys with
+ [] > result1 
+ x::xs > result2
+
+In Scheme, lambda expressions are written like this:
+
+ ; Scheme
+ (lambda (vars...) bodyexpressions...)
+
+Scheme only permits simple variables as its argument patterns here, and the lambdaexpression can be defined with zero or more arguments:
+
+ ; Scheme
+ (lambda () ...)
+ (lambda (x) ...)
+ (lambda (x y) ...)
+ (lambda (x y z) ...)
+
+There is special syntax for defining functions that may take varying numbers of arguments (recall `and` and `+`), to have them bind a single variable to a list containing all of their arguments (or all of the arguments after the third...). I won't explain that syntax here.
+
+
+
+### Let, Letrec, and Define
+Kapulet has the syntax:

@@ 363,6 +755,17 @@ Kapulet's `dup` isn't predefined in Haskell but can be easily expressed as `\x 
(This page is being worked on...)
+
+
+### Further Installments ...
+
+We will expand these comparisons (on separate web pages) as we introduce additional ideas in the course, such as types and monads and continuations.
+
+
+
+
+
+
FIXME

2.11.0