Let’s implement a binary tree in Haskell.
|
|
The above defines a Tree
that takes a value of type a
. Because we have not explicitly specified any datatype (such as an Int
or a [Char]
, for instance) and instead defined it in terms of a
which can stand for any type, this tree is capable of handling values of any datatype.
It has 2 constructors:
- Constructor for a tree with just a
Leaf
. - Constructor for a tree with two subtrees, each of which maybe
- another tree (with further subtrees), or
- a leaf.
We’re deriving Show
(line 3) so that we can pretty-print our binary-tree.
It’s evident that defining a tree in Haskell is way more easier than doing the same in Java, C++, or even Python.
Let’s go ahead and create elements using this tree. Save the code above into a file (say my_tree.hs
), and load it using the ghci interpreter.
$ ghci my_tree.hs
GHCi, version 8.0.1: http://www.haskell.org/ghc/ :? for help
[1 of 1] Compiling Main ( my_tree.hs, interpreted )
Ok, modules loaded: Main.
So far so good. Let’s create two trees (using the Leaf
value constructor) and see their type.
*Main> a = Leaf "cat"
*Main> b = Leaf "rabbit"
*Main> :type a
a :: Tree [Char]
*Main> :type b
b :: Tree [Char]
Both a
and b
are infact Tree
that stores [Char]
(a.k.a String
).
Let’s create a new Node
that contains a
and b
defined above as subtrees.
*Main> n = Node a b
*Main> :t n
n :: Tree [Char]
*Main> n
Node (Leaf "cat") (Leaf "rabbit")
Creating a Node
was simple – we passed in leaves a
and b
we created earlier into the Node
constructor.
The last line in the snippet displays the contents of the tree n
– it has two subtrees, each of which is a Leaf
. We were able to display the contents in this way because we had derived the Show
typeclass.
Note that n
is a tree that stores data of type [Char]
because all its subtrees store data of type [Char]
. If we had instead tried to create a tree with different datatypes, ghci would complain, as the following example demonstrates:
*Main> p_num = Leaf 20
*Main> q_str = Leaf "Linda"
*Main> :t p_num
p_num :: Num a => Tree a
*Main> :t q_str
q_str :: Tree [Char]
*Main> r_bad = Node p_num q_str
<interactive>:15:14: error:
• No instance for (Num [Char]) arising from a use of ‘p_num’
• In the first argument of ‘Node’, namely ‘p_num’
In the expression: Node p_num q_str
In an equation for ‘r_bad’: r_bad = Node p_num q_str
The problem was that p_num
and q_str
stores values of different types, and we can’t create a Tree
that combines subtrees which store values of two different types.
Tree Map
Let’s create a treeMap
function that maps over a tree, converting values in the leaves to another value.
treeMap
should take as arguments:
- A function that converts a value of type
a
(ie., a value stored in aLeaf
or aNode
) to a value of typeb
;a
andb
need not necessarily be the same. - A
Tree
to work on.
Add the following code to my_tree.hs
we were working on earlier.
{% highlight haskell linenos %} treeMap :: (a -> b) -> (Tree a) -> (Tree b) treeMap f (Leaf a) = Leaf (f a) treeMap f (Node t1 t2) = Node (treeMap f t1) (treeMap f t2) {% endhighlight %}
Line #2 handles the case when the Tree
is a Leaf
, and line #3 handles the case when the Tree
is a Node
.
First up, reload my_tree.hs
into the interpreter.
*Main> :r
[1 of 1] Compiling Main ( my_tree.hs, interpreted )
Ok, modules loaded: Main.
Let’s add some more nodes to the tree we created earlier.
*Main> m = Leaf "horse"
*Main> p = Node m n
*Main> p
Node (Leaf "horse") (Node (Leaf "cat") (Leaf "rabbit"))
Now, let’s run treeMap
on the p
to get a new tree with the length of each of the leaves.
*Main> treeMap length p
Node (Leaf 5) (Node (Leaf 3) (Leaf 6))
Leaves of the new tree contains the string length of values in the original tree p
. The structure of the tree remains unchanged.
Since the leaves of the resultant tree stores Int
(because length
returns values of type Int
), we will see this change reflected when we examine its type.
*Main> :t (treeMap length p)
(treeMap length p) :: Tree Int
Tree Fold
Let’s now implement a function that folds a tree into a single value.
treeFold
should take as arguments:
- Function that accepts a value of type
a
and returns a value of typeb
, to operate onLeaf
.(a -> b)
- Function that accepts two values of type
b
, and returns a third value of typeb
. Operates on the values the first function returns (after processingLeaf
).(b -> b -> b)
- A
Tree
to work on.
Finally, treeFold
would return a value of type b
.
Append the following code to my_tree.hs
:
{% highlight haskell linenos %}
treeFold :: (b -> b -> b) -> (a -> b) -> (Tree a) -> b
treeFold fnode fleaf (Leaf n) = fleaf n
treeFold fnode fleaf (Node t1 t2) = fnode (treeFold fnode fleaf t1) (treeFold fnode fleaf t2)
{% endhighlight %}
Reload my_tree.hs
into the interpreter.
*Main> :r
[1 of 1] Compiling Main ( my_tree.hs, interpreted )
Ok, modules loaded: Main.
Now, we’ll use treeFold
to add the lengths of all elements stored in the leaves of tree p
, which we defined earlier.
*Main> p
Node (Leaf "horse") (Node (Leaf "cat") (Leaf "rabbit"))
*Main> treeFold (+) length p
14
Simple as that!
See the full my_tree.hs
below: