implementing_trees.mdwn

   1 #Implementing trees#
   2
   3 In [[Assignment3]] we proposed a very ad-hoc-ish implementation of trees.
   4
   5 Think about how you'd implement them in a more principled way. You could
   6 use any of the version 1 -- version 5 implementation of lists as a model.
   7
   8 To keep things simple, we'll stick to binary trees. A node will either be a
   9 *leaf* of the tree, or it will have exactly two children.
  10
  11 There are two kinds of trees to think about. In one sort of tree, it's only
  12 the tree's leaves that are labeled:
  13
  14                 .
  15            / \
  16           .   3
  17          / \
  18         1   2
  19
  20 Linguists often use trees of this sort. The inner, non-leaf nodes of the
  21 tree do have associated values. But what values they are can be determined from
  22 the structure of the tree and the values of the node's left and right children.
  23 So the inner node doesn't need its own independent label.
  24
  25 In another sort of tree, the tree's inner nodes are also labeled:
  26
  27                 4
  28            / \
  29           2   5
  30          / \
  31         1   3
  32
  33 When you want to efficiently arrange an ordered collection, so that it's
  34 easy to do a binary search through it, this is the way you usually structure
  35 your data.
  36
  37 These latter sorts of trees can helpfully be thought of as ones where
  38 *only* the inner nodes are labeled. Leaves can be thought of as special,
  39 dead-end branches with no label:
  40
  41                    .4.
  42                   /   \
  43                  2     5
  44                 / \   / \
  45            1   3  x x
  46           / \ / \
  47          x  x x  x
  48
  49 In our earlier discussion of lists, we said they could be thought of as
  50 data structures of the form:
  51
  52         Empty_list | Non_empty_list (its_head, its_tail)
  53
  54 And that could in turn be implemented in v2 form as:
  55
  56         the_list (\head tail. non_empty_handler) empty_handler
  57
  58 Similarly, the leaf-labeled tree:
  59
  60                 .
  61            / \
  62           .   3
  63          / \
  64         1   2
  65
  66 can be thought of as a data structure of the form:
  67
  68         Leaf (its_label) | Non_leaf (its_left_subtree, its_right_subtree)
  69
  70 and that could be implemented in v2 form as:
  71
  72         the_tree (\left right. non_leaf_handler) (\label. leaf_handler)
  73
  74 And the node-labeled tree:
  75
  76                    .4.
  77                   /   \
  78                  2     5
  79                 / \   / \
  80            1   3  x x
  81           / \ / \
  82          x  x x  x
  83
  84 can be thought of as a data structure of the form:
  85
  86         Leaf | Non_leaf (its_left_subtree, its_label, its_right_subtree)
  87
  88 and that could be implemented in v2 form as:
  89
  90         the_tree (\left label right. non_leaf_handler) leaf_result
  91
  92
  93 What would correspond to "folding" a function `f` and base value `z` over a
  94 tree? Well, if it's an empty tree:
  95
  96         x
  97
  98 we should presumably get back `z`. And if it's a simple, non-empty tree:
  99
 100           1
 101          / \
 102         x   x
 103
 104 we should expect something like `f z 1 z`, or `f <result of folding f and z
 105 over left subtree> label_of_this_node <result of folding f and z over right
 106 subtree>`. (It's not important what order we say `f` has to take its arguments
 107 in.)
 108
 109 A v3-style implementation of node-labeled trees, then, might be:
 110
 111         let empty_tree = \f z. z  in
 112         let make_tree = \left label right. \f z. f (left f z) label (right f z)  in
 113         ...
 114
 115 Think about how you might implement other tree operations, such as getting
 116 the label of the root (topmost node) of a tree; extracting the left subtree of
 117 a node; and so on.
 118
 119 Think about different ways you might implement leaf-labeled trees.
 120
 121 If you had one tree and wanted to make a larger tree out of it, adding in a
 122 new element, how would you do that?
 123
 124 When using trees to represent linguistic structures, one doesn't have
 125 latitude about *how* to build a larger tree. The linguistic structure you're
 126 trying to represent will determine where the new element should be placed, and
 127 where the previous tree should be placed.
 128
 129 However, when using trees as a computational tool, one usually does have
 130 latitude about how to structure a larger tree---in the same way that we had the
 131 freedom to implement our sets with lists whose members were just appended in
 132 the order we built the set up, or instead with lists whose members were ordered
 133 numerically.
 134
 135 When building a new tree, one strategy for where to put the new element and
 136 where to put the existing tree would be to always lean towards a certain side.
 137 For instance, to add the element `2` to the tree:
 138
 139           1
 140          / \
 141         x   x
 142
 143 we might construct the following tree:
 144
 145           1
 146          / \
 147         x   2
 148            / \
 149           x   x
 150
 151 or perhaps we'd do it like this instead:
 152
 153           2
 154          / \
 155         x   1
 156            / \
 157           x   x
 158
 159 However, if we always leaned to the right side in this way, then the tree
 160 would get deeper and deeper on that side, but never on the left:
 161
 162           1
 163          / \
 164         x   2
 165            / \
 166           x   3
 167                  / \
 168                 x   4
 169                    / \
 170                   x   5
 171                          / \
 172                         x   x
 173
 174 and that wouldn't be so useful if you were using the tree as an arrangement
 175 to enable *binary searches* over the elements it holds. For that, you'd prefer
 176 the tree to be relatively "balanced", like this:
 177
 178                    .4.
 179                   /   \
 180                  2     5
 181                 / \   / \
 182            1   3  x x
 183           / \ / \
 184          x  x x  x
 185
 186 Do you have any ideas about how you might efficiently keep the new trees
 187 you're building pretty "balanced" in this way?
 188
 189 This is a large topic in computer science. There's no need for you to learn
 190 the various strategies that they've developed for doing this. But
 191 thinking in broad brush-strokes about what strategies might be promising will
 192 help strengthen your understanding of trees, and useful ways to implement them
 193 in a purely functional setting like the lambda calculus.
 194