#Implementing trees#

In [[Assignment3]] we proposed a rather ad hoc implementation of
trees. It had the virtue of constructing trees entirely out of lists,
which meant that there was no need to define any special
tree-construction functions.

Think about how you'd implement them in a more principled way. You could
use any of the version 1 through version 5 implementations of lists as a model.

To keep things simple, we'll stick to binary trees. A node will either be a
*leaf* of the tree, or it will have exactly two children.

There are two kinds of trees to think about. In one sort of tree, it's only
the tree's leaves that are labeled:

            .
           / \
          .   3
         / \
        1   2

The inner, non-leaf nodes of the tree may have associated values. But if so,
those values can be determined from the structure of the tree and the values
of the node's left and right children, so the inner nodes don't need their own
independent labels.

In another sort of tree, the tree's inner nodes are also labeled:

            4
           / \
          2   5
         / \
        1   3

When you want to efficiently arrange an ordered collection, so that it's
easy to do a binary search through it, this is the way you'd usually structure
your data.

Trees of this latter sort can helpfully be thought of as ones where
*only* the inner nodes are labeled. Leaves can be thought of as special,
dead-end branches with no label (drawn as `x` below):

               .4.
              /   \
             2     5
            / \   / \
           1   3  x x
          / \ / \
         x  x x  x

In our earlier discussion of lists, we said they could be thought of as
data structures of the form:

        Empty_list | Non_empty_list (its_head, its_tail)

And that could in turn be implemented in v2 form as:

        the_list (\head tail. non_empty_handler) empty_handler

Similarly, the leaf-labeled tree:

            .
           / \
          .   3
         / \
        1   2

can be thought of as a data structure of the form:

        Leaf (its_label) | Non_leaf (its_left_subtree, its_right_subtree)

and that could be implemented in v2 form as:

        the_tree (\left right. non_leaf_handler) (\label. leaf_handler)
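
If it helps to see that v2 pattern run, here is a rough Python sketch of it. It makes some assumptions: the names `leaf`, `non_leaf`, and `sum_labels` are made up for illustration, and the handlers are passed as two ordinary arguments rather than curried one at a time as they would be in the lambda calculus.

```python
# v2-style leaf-labeled trees: a tree is a function that takes two handlers
# and calls whichever one matches its own shape.
# (leaf, non_leaf and sum_labels are illustrative names, not from the notes.)

def leaf(label):
    # A leaf hands its label to the leaf handler.
    return lambda non_leaf_handler, leaf_handler: leaf_handler(label)

def non_leaf(left, right):
    # An inner node hands its two subtrees, untouched, to the non-leaf handler.
    return lambda non_leaf_handler, leaf_handler: non_leaf_handler(left, right)

def sum_labels(tree):
    # A v2 tree doesn't fold itself, so the recursion is written out here.
    return tree(
        lambda left, right: sum_labels(left) + sum_labels(right),
        lambda label: label,
    )

# The leaf-labeled tree pictured above.
the_tree = non_leaf(non_leaf(leaf(1), leaf(2)), leaf(3))
print(sum_labels(the_tree))  # prints 6
```

Notice that the handlers receive the subtrees themselves, not the results of processing them. That is what distinguishes this from a fold-based encoding.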

And the node-labeled tree:

               .4.
              /   \
             2     5
            / \   / \
           1   3  x x
          / \ / \
         x  x x  x

can be thought of as a data structure of the form:

        Leaf | Non_leaf (its_left_subtree, its_label, its_right_subtree)

and that could be implemented in v2 form as:

        the_tree (\left label right. non_leaf_handler) leaf_result
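
The same idea can be sketched in Python for node-labeled trees. Again this is only an illustration under assumptions: `empty`, `non_leaf`, `single`, `is_empty`, and `in_order` are hypothetical names, and the two arguments are uncurried.

```python
# v2-style node-labeled trees: a leaf carries no label, so it just returns
# the leaf_result it is given; an inner node calls the handler on its parts.
# (All names here are illustrative assumptions.)

empty = lambda non_leaf_handler, leaf_result: leaf_result

def non_leaf(left, label, right):
    return lambda non_leaf_handler, leaf_result: non_leaf_handler(left, label, right)

def single(label):
    # Convenience: a node whose two children are both unlabeled leaves.
    return non_leaf(empty, label, empty)

def is_empty(tree):
    return tree(lambda left, label, right: False, True)

def in_order(tree):
    # Left subtree's labels, then this node's label, then the right subtree's.
    return tree(
        lambda left, label, right: in_order(left) + [label] + in_order(right),
        [],
    )

# The node-labeled tree pictured above.
the_tree = non_leaf(non_leaf(single(1), 2, single(3)), 4, single(5))
print(in_order(the_tree))                   # prints [1, 2, 3, 4, 5]
print(is_empty(empty), is_empty(the_tree))  # prints True False
```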

What would correspond to "folding" a function `f` and base value `z` over a
tree? Well, if it's an empty tree:

        x

we should presumably get back `z`. And if it's a simple, non-empty tree:

          1
         / \
        x   x

we should expect something like `f z 1 z`, or, more generally, `f <result of
folding f and z over left subtree> label_of_this_node <result of folding f and
z over right subtree>`. (It's not important what order we say `f` has to take
its arguments in.)

A v3-style implementation of node-labeled trees, then, might be:

        let empty_tree = \f z. z  in
        let make_tree = \left label right. \f z. f (left f z) label (right f z)  in
        ...
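
To see what such a tree does when you hand it different choices of `f` and `z`, here is a Python transcription of those two definitions. It assumes uncurried arguments, and `single` and the example folds are additions made for illustration.

```python
# v3-style node-labeled trees: a tree *is* its own fold.
empty_tree = lambda f, z: z

def make_tree(left, label, right):
    # Fold both subtrees first, then combine the results with this label.
    return lambda f, z: f(left(f, z), label, right(f, z))

def single(label):
    # Illustrative helper: a node with two empty subtrees.
    return make_tree(empty_tree, label, empty_tree)

the_tree = make_tree(make_tree(single(1), 2, single(3)), 4, single(5))

# Each f takes (folded left subtree, label, folded right subtree).
total = the_tree(lambda l, x, r: l + x + r, 0)       # sum of the labels
size = the_tree(lambda l, x, r: l + 1 + r, 0)        # number of labeled nodes
depth = the_tree(lambda l, x, r: 1 + max(l, r), 0)   # longest path of labeled nodes
labels = the_tree(lambda l, x, r: l + [x] + r, [])   # labels, left to right

print(total, size, depth, labels)  # prints 15 5 3 [1, 2, 3, 4, 5]
```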

Think about how you might implement other tree operations, such as getting
the label of the root (topmost node) of a tree; extracting the left subtree of
a node; and so on.
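
If you want something to compare your own attempt against, here is one possible sketch in Python for fold-based (v3-style) node-labeled trees. It is only one approach among several. The names `root_label` and `left_subtree`, the `default` argument, and the choice to say that the left subtree of an empty tree is empty are all assumptions made here. The left-subtree function uses the same pairing trick that is often used to get the tail of a fold-based list.

```python
empty_tree = lambda f, z: z

def make_tree(left, label, right):
    return lambda f, z: f(left(f, z), label, right(f, z))

def root_label(tree, default=None):
    # The outermost call to f is the root's, so ignoring the folded
    # subtrees and returning the label yields the root's label.
    return tree(lambda left_result, label, right_result: label, default)

def left_subtree(tree):
    # Fold to a pair: (this subtree rebuilt, this subtree's left subtree).
    # Rebuilding is needed because f only ever sees folded results.
    def f(left_result, label, right_result):
        rebuilt = make_tree(left_result[0], label, right_result[0])
        return (rebuilt, left_result[0])
    return tree(f, (empty_tree, empty_tree))[1]

def labels(tree):
    return tree(lambda l, x, r: l + [x] + r, [])

single = lambda label: make_tree(empty_tree, label, empty_tree)
the_tree = make_tree(make_tree(single(1), 2, single(3)), 4, single(5))

print(root_label(the_tree))            # prints 4
print(labels(left_subtree(the_tree)))  # prints [1, 2, 3]
```

With a v2-style tree both operations are one-liners, since the handler is given the subtrees directly. The extra work here is the price of having the tree fold itself.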

Think about different ways you might implement leaf-labeled trees.
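
For instance, one possibility is a fold-based (v3-style) encoding in which the tree takes a combining function for inner nodes and a function to apply to each leaf label. The Python below is a sketch of that idea only. The names `leaf`, `non_leaf`, `f`, and `g` are assumptions, and other designs are just as reasonable.

```python
# v3-style leaf-labeled trees: f combines the two folded subtrees of an
# inner node; g says what to do with the label at a leaf.
# (Names and argument order are illustrative assumptions.)

def leaf(label):
    return lambda f, g: g(label)

def non_leaf(left, right):
    return lambda f, g: f(left(f, g), right(f, g))

the_tree = non_leaf(non_leaf(leaf(1), leaf(2)), leaf(3))

total = the_tree(lambda l, r: l + r, lambda x: x)           # sum of leaf labels
leaves = the_tree(lambda l, r: l + r, lambda x: [x])        # labels, left to right
depth = the_tree(lambda l, r: 1 + max(l, r), lambda x: 0)   # edges on longest path

print(total, leaves, depth)  # prints 6 [1, 2, 3] 2
```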

If you had one tree and wanted to make a larger tree out of it, adding in a
new element, how would you do that?

When using trees to represent linguistic structures, one doesn't have
latitude about *how* to build a larger tree. The linguistic structure you're
trying to represent will determine where the new element should be placed, and
where the previous tree should be placed.

However, when using trees as a computational tool, one usually does have
latitude about how to structure a larger tree, in the same way that we had the
freedom to implement our [sets](/week4/#index9h1) with lists whose members were
just appended in the order we built the set up, or instead with lists whose
members were ordered numerically.

When building a new tree, one strategy for where to put the new element and
where to put the existing tree would be to always lean towards a certain side.
For instance, to add the element `2` to the tree:

          1
         / \
        x   x

we might construct the following tree:

          1
         / \
        x   2
           / \
          x   x

or perhaps we'd do it like this instead:

          2
         / \
        x   1
           / \
          x   x
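
Both of those right-leaning strategies can be sketched in Python using a v2-style encoding of node-labeled trees. Everything named here (`empty`, `non_leaf`, `add_at_bottom_right`, `add_at_root`, `show`) is an assumption made for illustration, and `show` prints a tree as nested tuples with `'x'` standing for an unlabeled leaf.

```python
empty = lambda non_leaf_handler, leaf_result: leaf_result

def non_leaf(left, label, right):
    return lambda non_leaf_handler, leaf_result: non_leaf_handler(left, label, right)

def add_at_bottom_right(tree, new_label):
    # Walk down the right-hand side and replace the final leaf with a new node.
    return tree(
        lambda left, label, right:
            non_leaf(left, label, add_at_bottom_right(right, new_label)),
        non_leaf(empty, new_label, empty),
    )

def add_at_root(tree, new_label):
    # The new element becomes the root; the old tree becomes its right child.
    return non_leaf(empty, new_label, tree)

def show(tree):
    return tree(
        lambda left, label, right: (show(left), label, show(right)),
        'x',
    )

one = non_leaf(empty, 1, empty)
print(show(add_at_bottom_right(one, 2)))  # prints ('x', 1, ('x', 2, 'x'))
print(show(add_at_root(one, 2)))          # prints ('x', 2, ('x', 1, 'x'))
```

The first function has to travel the whole right-hand side of the tree, while the second does a fixed amount of work no matter how big the old tree is.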

However, if we always leaned to the right side in this way, then the tree
would get deeper and deeper on that side, but never on the left:

          1
         / \
        x   2
           / \
          x   3
             / \
            x   4
               / \
              x   5
                 / \
                x   x

and that wouldn't be so useful if you were using the tree as an arrangement
to enable *binary searches* over the elements it holds. For that, you'd prefer
the tree to be relatively "balanced", like this:

               .4.
              /   \
             2     5
            / \   / \
           1   3  x x
          / \ / \
         x  x x  x

Do you have any ideas about how you might efficiently keep the new trees
you're building pretty "balanced" in this way?
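
To make the difference concrete, here is a Python sketch that compares a right-leaning tree with one built by a deliberately naive balancing idea: start from the labels in sorted order and always put the middle one at the root. This is not one of the standard self-balancing schemes, and it rebuilds the whole tree rather than updating it efficiently. All the names (`build_balanced`, `lean_right`, `depth`, `contains`) are assumptions made for illustration.

```python
empty = lambda non_leaf_handler, leaf_result: leaf_result

def non_leaf(left, label, right):
    return lambda non_leaf_handler, leaf_result: non_leaf_handler(left, label, right)

def lean_right(labels):
    # Every node gets an empty left child, as in the lopsided picture above.
    tree = empty
    for label in reversed(labels):
        tree = non_leaf(empty, label, tree)
    return tree

def build_balanced(sorted_labels):
    # Middle label at the root; smaller labels go left, larger go right.
    if not sorted_labels:
        return empty
    mid = len(sorted_labels) // 2
    return non_leaf(
        build_balanced(sorted_labels[:mid]),
        sorted_labels[mid],
        build_balanced(sorted_labels[mid + 1:]),
    )

def depth(tree):
    return tree(lambda left, label, right: 1 + max(depth(left), depth(right)), 0)

def contains(tree, target):
    # Binary search: at each node, descend into only one of the subtrees.
    def at_node(left, label, right):
        if target == label:
            return True
        return contains(left, target) if target < label else contains(right, target)
    return tree(at_node, False)

labels = [1, 2, 3, 4, 5]
print(depth(lean_right(labels)), depth(build_balanced(labels)))  # prints 5 3
print(contains(build_balanced(labels), 4))                       # prints True
print(contains(build_balanced(labels), 6))                       # prints False
```

The number of steps `contains` takes is bounded by the depth of the tree, which is why the lopsided tree gives no advantage over searching a plain list.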

This is a large topic in computer science. There's no need for you to learn
the various strategies that computer scientists have developed for doing this.
But thinking in broad brush-strokes about what strategies might be promising
will help strengthen your understanding of trees, and of useful ways to
implement them in a purely functional setting like the lambda calculus.