topics/week9_using_the_monad_library.mdwn

   1 [[!toc levels=2]]
   2
   3 ## How to Inspect Juli8's Monad modules for OCaml? ##
   4
   5 If you want to see the "signature" of some OCaml library or "module" --- that is, a list of the types it exports, and also of the types of function values (and other values) that it exports, you can do this.
   6
   7 If you're using a version of OCaml >= 4.02 (you can see the version from `Sys.ocaml_version`), you can just type <code>#show <i>module_name</i>;;</code> or <code>#show <i>name_of_module_type</i>;;</code>. If you're using an older version of OCaml, read on.
   8
   9 If you know the name of the module but not the name of its module type, or if its module type doesn't have a name, then in older versions of OCaml you can do this:
  10
  11     module type SOMETHING = sig include module type of ModuleYouAreInterestedIn end
  12
  13 More commonly, though, you'll know that the module uses a specific *named* module type. In that case, here is what to do. An example is that modules you create using `Monad.Reader(struct type env = ... end)` will all (more-or-less) have the module type named `Monad.READER`. We can see an expansion of `Monad.READER` by writing:
  14
  15     module type SOMETHING = sig include Monad.READER end
  16
  17 Then OCaml will respond:
  18
  19     module type SOMETHING =
  20       sig
  21         type env
  22
  23         type 'a t
  24         type 'a result = env -> 'a
  25         val run : 'a t -> 'a result
  26
  27         val map : ('a -> 'b) -> 'a t -> 'b t
  28         val mid : 'a -> 'a t
  29         val map2 : ('a -> 'b -> 'c) -> 'a t -> 'b t -> 'c t
  30         val mapply : ('a -> 'b) t -> 'a t -> 'b t
  31         val ( >> ) : 'a t -> 'b t -> 'b t
  32         val ( << ) : 'a t -> 'b t -> 'a t
  33         val ( >>= ) : 'a t -> ('a -> 'b t) -> 'b t
  34         val ( >=> ) : ('a -> 'b t) -> ('b -> 'c t) -> 'a -> 'c t
  35         val ( <=< ) : ('b -> 'c t) -> ('a -> 'b t) -> 'a -> 'c t
  36         val join : 'a t t -> 'a t
  37         val ignore : 'a t -> unit t
  38         val seq : 'a t list -> 'a list t
  39         val seq_ignore : unit t list -> unit t
  40         val do_when : bool -> unit t -> unit t
  41         val do_unless : bool -> unit t -> unit t
  42
  43         val ask : env t
  44         val asks : (env -> 'a) -> 'a t
  45         val shift : (env -> env) -> 'a t -> 'a t
  46       end
  47
  48 Here I've reordered the elements a bit from OCaml's actual presentation to make it easier to explain them. First, there is the type of the `env` that you will supply. In this signature it's listed abstractly (no concrete definition of the type is supplied, it's just declared that it exists), but in practice this type will be unified with the envionment type that you supply. Next there is the abstract or opaque type `'a t`. This is the main type of the Reader Monad. Behind the scenes it will be the same as the type `'a result`, that is a function from `env`s to `'a`s. But OCaml keeps the type `'a t` opaque, and only exposes the actual implementation of the type when you apply the `run` function to `'a t` values. The `run` function is for most Monads just implemented as an identity function. Its effect is primarily to shift between the opaque type and the exposed concrete implementation of that type, named `'a result`. Note that even though the `'a t`s and `'a result`s will usually be represented by the same structures in OCaml's memory, our library insists that you only use the opaque `'a t` version for the monadic functions like `map` and `>>=` and so on. On the other hand, when you want to supply an actual starting `env` to your monadic value, you have to use the `'a result` version you got from `run`. This will not work:
  49
  50     # let xx = R.(mid 1) in xx env0;;
  51                             --
  52     Error: This expression has type int R.t
  53            This is not a function; it cannot be applied.
  54
  55 You need to instead say:
  56
  57     # let xx = R.(mid 1) in R.run xx env0;;
  58     - : int = 1
  59
  60 Also there is no translation function that lets you take an `'a result` that you've built by hand and convert it to an `'a t`. You have to build your `'a t`s using the primitive operations that the Monad library provides for doing so. This is all discussed further [[elsewhere|/topics/week8_monads_and_modules/#abstraction]].
  61
  62 The next block of stuff in the `Monad.READER` module type are functions common to all monads. Some monads (such as List and Option) additionally have `mzero : 'a t` and some related operations. Finally, at the end are the operations specific to the Reader monad. These have mostly been given the same names as they have in Haskell's monad libraries. Here is a quick explanation:
  63
  64 *   `ask` is a Reader monadic value that, when given an `env`, returns that very `env` as its payload. So its type is an `env t`. And it is implemented as `fun e -> e`.
  65
  66 *   `asks` is a variant of `ask`. Haskell several times uses the convention that where `blah` is some operation that gives you some raw value, `blahs` is a variant that gives you the result of applying some *s*elector function or handler to the raw value. In this case, `asks` takes a parameter, which is a function that operates on `env`s and gives some result. `asks handler` then constitutes an `'a t`, where `'a` is the type of the handler's result. For example, if my environments are associations of some sort between `char`s and `int`s, and `lookup 'x'` is a function that takes an `env` as a further argument and returns what `int` that environment associates the char `'x'` with, then:
  67
  68         let getx = asks (lookup 'x')
  69
  70     will declare an `int t` that, when supplied with an environment, returns the `int` that that environment associates the `char` `'x'` with.
  71
  72 *   `shift` is a function that takes two arguments, an "`env`-shifting" function and some Reader monadic value. It returns a new Reader monadic value that, when supplied with an `env`, will process the original monadic value only with the `env` shifted in the specified way. (Haskell calls the `shift` operation `local`, but I thought it would be a bit better to call it `shift` --- enough of an improvement to justify changing the name.)
  73
  74     For example, if `insert 'x' 1` is a function that operates on `env`s and gives a new `env` which now associates the char `'x'` with `1`, then:
  75
  76         let letx xint body = shift (insert 'x' xint) body
  77
  78     will declare an operation that takes an `int` `xint` and a monadic value `body`, and evaluates `body` in the context <code>let x = <i>xint</i> in ...</code>. If we wanted instead to have a version which accepted not an `int`, but rather an `int` Reader, we could write instead:
  79
  80         let letx xx body = xx >>= fun xint -> shift (insert 'x' xint) body
  81
  82
  83 ## Examples of Using Reader ##
  84
  85 Here are some examples of using the Reader Monad modules to evaluate some simple expressions using bound variables. First, you could look at [[this Haskell code|/code/reader1.hs]]. It `import`s the `Control.Monad.Reader` library, which is where Haskell's Reader monad can be found. It declares an `Env` type that we'll implement as a simple *function* from `Char`s to `Int`s. Then it defines an "empty" environment `env0`, and a function `insert` for adding new bindings to an `env`. Next, we make a general function `getint` that can create monadic values like the `getx` illustrated above. We show how to use `getx` and `gety` to write monadic versions of `y + x` and `3 + x`. Next, we define a `letx` function as illustrated above (the second version, that takes a monadic value `xx` as its argument). We show how to use this to write a monadic version of `let x = 2 in y + x`. The final line of the file applies `runReader` to the monadic value we've built --- this is Haskell's way of doing what we do in OCaml with `run`, namely to remove the abstraction curtain and see what concrete type is really constituting our `Reader Env a`s --- and we supply it with the empty environment, which will be sufficient since the expression we're interpreting has no free variables. Haskell binds the variable `res` to the result. You can run this code inside `ghci` by typing `:load /path/to/reader1.hs`. (You may also be able to say `:add ...` instead of `:load ...`.) Then type `res`, and Haskell will report back `5`.
  86
  87 [[This OCaml code|/code/reader1.ml]] does exactly the same thing only using our OCaml monad libraries instead. The biggest difference from the Haskell version is in the first few lines, where we have to generate a Reader monad module parameterized on the `env` type that we intend to work with.
  88
  89 Here's a more complicated example. This time we want to be able to bind variables to lambda abstracts as well as to `int`s. So our `env` type will need to be more complex; it will have to associate `char`s with a disjoint sum of `int`s and lambda abstracts. Now what will the type of the lambda abstracts be? Let's just restrict our attention to abstracts whose bodies return `int`s. But they might get those `int`s by performing operations on bound variables, so the body expressions need to be interpreted monadically, as `int Reader`s. We'll construe the whole lambda abstract as a  function from `int Reader`s (that is, the monadic values which are provided as arguments to the lambda abstract) to their results, so the lambda abstract will have the type `int Reader -> int Reader`. In OCaml that will be `int R.t -> int R.t`, and in Haskell `Reader Env Int -> Reader Env Int`. Since variables can be bound to either `int`s or to lambda abstracts, we declare our environments like this in OCaml:
  90
  91     type bound = Int of int | Fun of (int R.t -> int R.t)
  92     type env = char -> bound
  93
  94 and like this in Haskell:
  95
  96     data Bound = Int Int | Fun (Reader Env Int -> Reader Env Int)
  97     type Env = Char -> Bound
  98
  99 There is a tricky issue in the OCaml case, though, in that when working with OCaml, we have to *generate* our `R` Reader monad module, parameterized on the type of the `env`, but here we see that we need access to the *type* `'a R.t` from the generated `R` module in order to declare the `env`. Fortunately, it is possible to do this, by having the module that declares the `env` and the module that has our Reader monad in it be mutually recursively defined. The first few lines of [[this OCaml code|/code/reader2.ml]] do the tricky work.
 100
 101 After that, our [[Haskell code|/code/reader2.hs]] and [[OCaml code|/code/reader2.ml]] proceed basically the same, allowing for the difference in syntax and vocabulary between Haskell and OCaml. The `getint` function works like before, except now we have to pull the `int` out from behind the `Int` constructor of our disjoint sum type `bound`. We have a parallel `getfun` function. Then we interpret the variable `x` using the monadic value `getint 'x'`, and we interpret the variable `f` using the monadic value `getfun 'f'`. The `letx` operation is similarly adjusted, and we also have a parallel `letf`.
 102
 103 The really new thing in this code, compared to the previous example, is our definition of a monadic value to interpret the lambda abstract `\y -> y + x`, that `f` gets bound to. And also our interpretation of the expression `f 3`, which looks up a function that the variable `f` is bound to, and then applies it to (a monadically-lifted version of) `3`. (We have the argument be monadically lifted so that we could also say, for example, `f y`.) You can examine the code to see how we do these things.
 104
 105
 106 ## OK, what else is in these Monad modules? ##
 107
 108 I won't give an exhaustive list here. But here is the output of `module type SOMETHING = sig include Monad.BLAH end` for some of the `BLAH`s:
 109
 110
 111     module type STATE =
 112       sig
 113         type store
 114         type 'a t
 115         type 'a result = store -> 'a * store
 116         (* plus the other usual monadic stuff, and: *)
 117         val get : store t
 118         val gets : (store -> 'a) -> 'a t
 119         val modify : (store -> store) -> unit t
 120         val put : store -> unit t
 121       end
 122
 123 The `store` type has to be provided by you, when you generate the module, similarly to as in the Reader monad. The `'a result` type shows the real definition of an `'a State` type, otherwise kept opaque as `'a t`. Instead of the special operations `ask` and so on that the Reader monad has, State has the operations `get`, `gets`, `modify`, and `put`. The first two work just like `ask` and `asks` did for the Reader monad. The third one works *similarly* to `shift` for the Reader monad, with the crucial difference that the rebinding that `shift` introduces is in effect only for the `body` argument of the `shift` operation. Outside of that `body`, we revert to the originally supplied `env`. But notice that `modify` doesn't take any `body` argument. `modify` introduces changes to the supplied `store` that once introduced *stay in place*, until we manually change them again. Thus with the Reader monad you'd do things like this:
 124
 125     R.(xx >>= fun x -> ... shift (insert ...) body >>= fun y -> (* now we're using the unshifted env *) ...)
 126
 127 With the State monad you'd instead do things like this:
 128
 129     S.(xx >>= fun x -> ... modify (fun old_store -> new_store) >>= fun () -> (* we continue using the modified store, until it's modified once again *) ...)
 130
 131 Since the pattern `... >>= fun () -> ...` or `... >>= fun variable_you_never_use -> ...` occurs often when working with monads, there's a shorthand: you can instead say `... >> ...`, with `>>` in place of `>>= fun pattern ->`.
 132
 133 Here's another monad module signature:
 134
 135     module type WRITER =
 136       sig
 137         type log
 138         type 'a t
 139         type 'a result = 'a * log
 140         (* plus the other usual monadic stuff, and: *)
 141         val listen : 'a t -> ('a * log) t
 142         val listens : (log -> 'b) -> 'a t -> ('a * 'b) t
 143         val tell : log -> unit t
 144         val censor : (log -> log) -> 'a t -> 'a t
 145       end
 146
 147 Writer is very similar to Reader: first, it is parameterized on something like an `env`, here called a `log`. (A typical implementation for `log` would be the `string` type.) Second, the Writer operations `listen`, `listens`, and `censor` parallel the Reader operations `ask`, `asks`, and `shift`. But one difference is that with Writer, you cannot choose what initial `env` (`log`) to supply. You always begin with the `empty` `log` (such as `""` for `string`s). A second difference is that the types differ. Compare:
 148
 149     module type READER =
 150       sig
 151         ...
 152         val ask : env t
 153         val asks : (env -> 'a) -> 'a t
 154         val shift : (env -> env) -> 'a t -> 'a t
 155       end
 156
 157 Whereas Writer's `censor` and Reader's `shift` have isomorphic types, there is some extra complextity to Writer's `listen` and `listens`, compared to `ask` and `asks`. What this extra complexity means is that for `Writer`, listening happens only in a local context. You can't `listen` to what got written to the log before you installed your `listen`ing tap. But you can return payloads that are dependent on what you've heard in the local context.
 158
 159 Unlike Reader, Writer also has a `tell` operation, which is akin to the `put` operation in the State monad. The difference is that the `tell` function takes a `log` as argument and *appends* that to the existing `log`. You can't erase or overwrite elements already in the `log`; you can only append to it. However, if you like, you can `censor` the log generated by any local context. (Inside the local context, the log isn't yet censored; the censoring only affects what's seen downstream as the contributions made by that context to the log.)
 160
 161 Here's a complex example that illustrates this. First we will use the Juli8 helper function `String.upper` and a second helper function that we define like this:
 162
 163     let bracket log = "{" ^ log ^ "}"
 164
 165 Next, we construct some monadic values and reveal them at the end using `run`:
 166
 167     module W = Monad.Writer(struct
 168       type log = string
 169       let empty = ""
 170       let append s1 s2 = if s1 = "" then s2 else if s2 = "" then s1 else s1 ^ " " ^ s2
 171     end)
 172     W.(let xx = tell "one" >> listens bracket (tell "two" >> mid 10) in
 173        let yy = censor String.upper (tell "zero" >> listens bracket xx) in
 174        let zz = tell "before" >> yy >>= fun y -> tell "after" >> mid y in
 175        run ...);;
 176
 177 The monadic value `xx` writes `"one"` to the log, then discards the resulting `()` payload (it continues `>> ...` rather than `>>= fun var -> ...`). Then we have a use of `listens`. This will evaluate its body `tell "two" >> mid 10` and return as payload a pair of the body's original payload and a `bracket`ed copy of the local log. Thus the payload of `listens bracket (tell "two" >> mid 10)` will be `(10, "{two}")`. Its log will be `"two"`. The `"one"` that got written to the log earlier isn't accessible to `listens`; however it does stay in the overall log, to which the `listens ...` construction contributes. Hence the result of `run xx`, showing first the payload and then the log, would be:
 178
 179     - : (int * string) W.result = ((10, "{two}"), "one two")
 180
 181 Now `yy` uses that `xx` monadic value to illustrate the use of `censor`. Here we have `censor` apply `String.upper` to the log generated in the local context it's applied to, hence the result of  `run yy` would be:
 182
 183     - : ((int * string) * string) W.result = (((10, "{two}"), "{one two}"), "ZERO ONE TWO")
 184
 185 The final value `zz` shows what happens to entries written to the log before and after the `censor`ing that occurs in `yy`, namely nothing. That is, `run zz` is:
 186
 187     - : ((int * string) * string) W.result = (((10, "{two}"), "{one two}"), "before ZERO ONE TWO after")
 188
 189 Let's look at some more familiar monad signatures. Here is one:
 190
 191     module type OPTION =
 192       sig
 193         type 'a t
 194         type 'a result = 'a option
 195         (* plus the other usual monadic stuff, and: *)
 196         val mzero : 'a t
 197         val guard : bool -> unit t
 198         val test : ('a option -> bool) -> 'a t -> 'a t
 199       end
 200
 201 That is what's exposed in the `Monad.Option` module. In the non-monadic `Option` module, there are many more operations. Similarly, `List` exposes many more operations than `Monad.List` does. The `Monad.List` module restricts us to just the monadic interface. Unlike Reader and State and Writer, the Option and List monads don't need to be parameterized on environments or anything else of that sort. The Option monad has three additional monadic operations, analogues of which are also all present in List. First, there is the `mzero` monadic value, implemented as `None` and satisfying the Monad Laws for `mzero` we explained elsewhere. The key one to remember is that `mzero` aborts a chain of composed Kleisli functions. That is, `mzero >>= anything` is always `mzero`. `guard` takes a boolean argument and if it is false, gives `mzero`. If the argument is true, it just gives the uninteresting `mid ()`, hence the typical way to use `guard` is as:
 202
 203     module O = Option;;
 204     O.(guard some_bool_expr >> more_monadic_stuff)
 205
 206 If `some_bool_expr` is true, then this expression will ignore the `()` payload of `guard some_bool_expr` and go on to compute `more_monadic_stuff`; if it's false, then the whole chain gets ignored because of the distinctive behavior of `mzero`.
 207
 208 The third special operation in the Option monad is `test`. This lets you supply a function that takes an ordinary `'a option` type (that is, one where the abstraction curtain imposed by the `'a O.t` type is not in place) and returns a `bool`. Then you take an Option monadic value (one where the abstraction curtain *is* in place). OCaml will temporarily remove the abstraction curtain on the second argument and see how the function you supplied assesses it. If the result is `true`, then the result is identical to that Option monadic value, unaltered. If the result is `false`, then the result is `mzero`. (For those of you who know Frank Veltman's work on dynamic semantics for epistemic modals, this `test` (or the version of it for sets of worlds) is a key component.)
 209
 210 Here is the List monadic interface:
 211
 212     module type LIST =
 213       sig
 214         type 'a t
 215         type 'a result = 'a list
 216         (* plus the other usual monadic stuff, and: *)
 217         val mzero : 'a t
 218         val guard : bool -> unit t
 219         val test : ('a list -> bool) -> 'a t -> 'a t
 220         val ( ++ ) : 'a t -> 'a t -> 'a t
 221         val pick : 'a t -> ('a * 'a t) t
 222       end
 223
 224 The `mzero` and `guard` and `test` operations work analogously to the ones in the Option monad. The `++` (infix) operation is like `List.append` (OCaml also uses `@` for that), with the difference that `++` is defined on the opaque List monadic values of type `'a Monad.List.t`, not the OCaml native lists (which aren't behind an abstraction curtain). In Haskell, `++` works on either native lists or elements of the List monad, because Haskell doesn't distinguish them. Haskell doesn't impose an abstraction curtain in the case of its List and Maybe monads. `pick` is an operation that transforms (the opaque version of) `[1; 2; 3]` to (the opaque version of) `[(1, [2; 3]); (2, [1; 3]); (3, [1; 2])]`.
 225
 226 Here is another monadic interface:
 227
 228     module type TREE =
 229       sig
 230         type 'a tree
 231         type 'a t
 232         type 'a result = 'a tree
 233         (* plus the other usual monadic stuff, and: *)
 234         val ( ++ ) : 'a t -> 'a t -> 'a t
 235       end
 236
 237 This is the signature/module type for the Monad.LTree module. ("LTree" for *leaf-labeled* trees.)
 238
 239 You can create leaf-only trees using the monadic function `mid`. You can join two trees together using the function `++`, paralleling the one in List. Note that in the Tree case, unlike the List case, `++` is not associative: `xx ++ (yy ++ zz)` is not the same as `(xx + yy) ++ zz`. Nor is there any `mzero` for trees as implemented by this module.
 240
 241
 242 ## Acknowledgements ##
 243
 244 Our library is largely based on the mtl library distributed with the Haskell Platform. That in turn was inspired by Mark Jones' 1995 paper
 245 [Functional Programming with Overloading and Higher-Order Polymorphism](http://web.cecs.pdx.edu/~mpj/pubs/springschool.html).
 246
 247 I've also been helped in various ways by posts and direct feedback from Oleg Kiselyov and Chung-chieh Shan. The following were also useful:
 248
 249  *      <http://pauillac.inria.fr/~xleroy/mpri/progfunc/>
 250  *      Ken Shan "Monads for natural language semantics" <http://arxiv.org/abs/cs/0205026v1>
 251  *      <http://www.grabmueller.de/martin/www/pub/Transformers.pdf>
 252  *      <http://en.wikibooks.org/wiki/Haskell/Monad_transformers>
 253
 254
 255 ## Comparisons with Haskell ##
 256
 257 Some comparisons with the Haskell monadic libraries, which Juli8 mostly follows.
 258
 259 In Haskell, the `'a Reader.t` monadic type would be defined something like this:
 260
 261     newtype Reader a = Reader { runReader :: env -> a }
 262
 263 (For simplicity, I'm suppressing the fact that `Reader` is also parameterized
 264 on what type `env` is.) This creates a type wrapper around `env -> a`, so that Haskell will
 265 distinguish between values that have been specifically designated as
 266 being of type `Reader a`, and common-garden values of type `env -> a`.
 267
 268 To lift an aribtrary expression `E` of type `env -> a` into an `Reader a`,
 269 you do this:
 270
 271     Reader { runReader = E }
 272
 273 or use any of the following equivalent shorthands:
 274
 275     Reader (E)
 276     Reader $ E
 277
 278 To drop an expression `R` of type `Reader a` back into an `env -> a`, you do
 279 one of these:
 280
 281     runReader (R)
 282     runReader $ R
 283
 284 The `newtype` in the type declaration ensures that Haskell does this all
 285 efficiently: though it regards `E` and `R` as type-distinct, their underlying
 286 machine implementation is identical and doesn't need to be transformed when
 287 lifting/dropping from one type to the other.
 288
 289 Now, you _could_ also declare monads as record types in OCaml, too, _but_
 290 doing so would introduce an extra level of machine representation, and
 291 lifting/dropping from the one type to the other wouldn't be free like it is
 292 in Haskell. Also it would be syntactically more complex.
 293
 294 So Juli8 instead encapsulates the monadic values by making their implementation types "private" or abstract or opaque. We get the same result: OCaml won't let you freely interchange `'a Reader.t`s with `env -> 'a`s. The Juli8 internals can see that these are equivalent, but refuses to permit code outside of the library to rely on that assumption. Instead, you have to use the `run` operation (like Haskell's `runReader`) to convert the opaque monadic values to ones whose type is exposed to you.