In the previous articles in this
series, I covered the conceptual basis of monads, but the discussion was pretty
abstract. Now that (I hope) you have a pretty good idea of what monads are and
what they're for, it's time to get into the nittygritty details of how to
actually derive the definitions of specific monads. That means that we are
going to have to define the correct instances of the type class
Monad
for the various notions of computation we talked about
previously. In our derivations, we will use our understanding of how monadic
composition should work in each particular case to give us the definition of
monadic apply (the >>=
operator), and we'll use the monad
laws to give us the definition of return
.
The Maybe
monad
The Maybe
monad is often the first monad shown to new Haskell programmers
in tutorials, because it's very simple to use, to implement, and to
understand. To start with, let's look at the definition of the Maybe
data
type:
data Maybe a = Nothing  Just a
This states that Maybe
is a type constructor that acts on a particular type
a
to give a (concrete) data type. We usually describe this by saying that
Maybe
is a "polymorphic" data type, but the meaning is the same. So if a
was Int
, we'd have
data Maybe Int = Nothing  Just Int
except that we don't have to define this directly, as the one definition given above suffices for all types.
A value of type Maybe a
is either there or not there; if it's Nothing
, it's
"not there" and if it's Just x
for some value x
, it's "just" the value x
.
One way to think of this is as a container that either has zero or one elements.
(Remember that I said previously that monadic values were sometimes described,
misleadingly, as containers. This is a case in point.)
The Maybe
polymorphic type is useful because we can use it to model a kind
of "extended function" which can either produce a value as its output or fail
to produce any output (i.e. a function that can fail). This can be written
as follows:
f :: a > Maybe b
A function f
of this type takes in an input value of type a
, and either
returns Nothing
(which indicates failure) or a value Just x
where x
has
type b
. Functions like f
will be the ones that will be used in the Maybe
monad, and composing two functions of this kind will look like this:
f :: a > Maybe b  assume it's defined elsewhere
g :: b > Maybe c  assume it's defined elsewhere
h :: a > Maybe c  monadic composition of f and g
h = f >=> g  recall: >=> is the monadic composition operator
We said that all monads must be type constructors. Maybe
is a type
constructor, so it qualifies in that sense. But in order to make Maybe
into a monad, we have to make it an instance of the type class Monad
, which
means that we have to fill out this definition:
instance Monad Maybe where
(>>=) = { definition of >>= for Maybe }
return = { definition of return for Maybe }
How do we define (>>=)
and return
for Maybe
?
First, let's write a skeleton definition for the >>=
operator, covering the
two possible cases of a left operand of type Maybe a
:
Nothing >>= f = { to be defined }
Just x >>= f = { to be defined }
where x
has type a
. It would also be legal to write the lefthand side
of this definition as:
(>>=) Nothing f = { to be defined }
(>>=) (Just x) f = { to be defined }
but it looks better if we define it the way we use it, as an operator, and Haskell allows us to do this.
To complete this definition, consider how we want monadic composition to work
in the Maybe
monad. Let's take our example above with f
and g
composing monadically to make h
:
f :: a > Maybe b
g :: b > Maybe c
h :: a > Maybe c
h = f >=> g
If we pass an argument to f
and it returns Nothing
(i.e. it fails),
then what should h
return on the same output?
f x = Nothing
h x = (f >=> g) x = ???
It seems apparent that if f x
is Nothing
, h x
must also be Nothing
,
because if the first function (f
) in the definition of h
fails to produce
an output, then h
as a whole must also fail to produce an output. The only
way h
could produce an output is if f
produced an output (call it y
),
passed it to g
, and g y
also produced an output. If either f
or g
fails, then h
fails too i.e. h x
will evaluate to Nothing
.
Putting this into our definition of h
, we have:
h = f >=> g
h = \x > (f x >>= g)  equivalent (from the definition of >=>)
h x = f x >>= g  equivalent
 assume f x == Nothing
h x = (Nothing >>= g)
= Nothing
 therefore:
Nothing >>= g = Nothing
So we know how the (>>=)
operator will work with an input of Nothing
— it will just return Nothing
:
Nothing >>= f = Nothing
Just x >>= f = { to be defined }
Note that I changed g
to f
here, which is fine because the name of the
function doesn't matter in the definition. In fact, where possible we leave off
the name of the function entirely and use the _
wildcard operator as follows:
Nothing >>= _ = Nothing
We can't do this for the second equation because we'll need the function f
in
the definition.
Now let's look at the other case. If f x
does not fail, it will produce
an output of the form Just y
for some result value y
. We need to
"unpack" a value from Just y
to get the value y
, which we will give as
the argument to g
, and g y
will be what h x
evaluates to:
h = f >=> g
h = \x > (f x >>= g)  equivalent (from the definition of >=>)
h x = f x >>= g  equivalent
 assume f x == Just y
h x = (Just y >>= g)
= g y
This gives us the second part of our definition:
Nothing >>= f = Nothing
Just x >>= f = f x
Note that I changed y
to x
and g
to f
in the definition. Again, the
variable and function names don't matter as long as you're consistent.
That completes the definition of the >>=
operator for the Maybe
monad.
Now we need to define the return
function for this monad:
return x = ???
for an arbitrary value x
. What possibilities have we got? We could just
say that
return x = Nothing
for any x
. If this were the case, however, we would violate our monad laws:
return x >>= f == f x
Nothing >>= f == f x
Nothing == f x  WRONG!
Since we can assume that at least some f x
values are not Nothing
(for
instance, consider the monadic function f x = Just x
), this can't be
correct. The obvious alternative is:
return x = Just x
which obeys the monad laws:
return x >>= f
= (Just x) >>= f  definition of return for Maybe monad
= f x  definition of >>= for Maybe monad
 correct according to monad law 1
Just x >>= return
= return x  definition of >>= for Maybe monad
= Just x  definition of return for Maybe monad
 correct according to monad law 2
Nothing >>= return
= Nothing  definition of >>= for Maybe monad
 correct according to monad law 2
This works, so let's use it. The complete definition of the Maybe
monad is
thus:
instance Monad Maybe where
return x = Just x
Nothing >>= f = Nothing
Just x >>= f = f x
Woo hoo! We've just derived our first monad!
Just to be on the safe side, let's check that our definition obeys monad law 3, which is:
(mv >>= f) >>= g == mv >>= (\x > (f x >>= g))
First let's check that it's correct when mv
is Nothing
:
(Nothing >>= f) >>= g  left hand side
= Nothing >>= g  definition of >>=
= Nothing  definition of >>=
Nothing >>= (\x > (f x >>= g))  right hand side
= Nothing  definition of >>=
OK, that checks out. Now let's see if it works when mv
is Just v
for
some value v
:
((Just v) >>= f) >>= g  left hand side
= f v >>= g  definition of >>=
(Just v) >>= (\x > (f x >>= g))  right hand side
= (\x > (f x >>= g)) v  definition of >>=
= f v >>= g  normal function application (beta reduction)
so this checks out too. So it works! This is the correct definition of
the Maybe
monad! And the audience goes wild!
What's the significance of this? It means that we can compose a bunch of
monadic functions in the Maybe
monad easily. And why, you will certainly be
asking yourself, is that important? It's not hard to imagine having a bunch of
Maybe
monadic functions i.e. functions that can fail. Let's say that they
all have type Int > Maybe Int
. Here are three such functions:
f :: Int > Maybe Int
f x = if x `mod` 2 == 0 then Nothing else Just (2 * x)
g :: Int > Maybe Int
g x = if x `mod` 3 == 0 then Nothing else Just (3 * x)
h :: Int > Maybe Int
h x = if x `mod` 5 == 0 then Nothing else Just (5 * x)
We'd like to compose them to get a final function:
k :: Int > Maybe Int
which is the result of applying f
, then g
, then h
one after another,
with Nothing
returned if any of them failed. All this does is multiply the
input number by 30 unless the number is a multiple of 2, 3, or 5 (in which
case it fails i.e. returns Nothing
).
If you've followed the previous discussion it should be clear that you can
define k
using monadic composition as follows:
k = f >=> g >=> h
Or, if you want to use the >>=
operator instead:
k x = f x >>= g >>= h
Or, if you like the do
notation:
k x = do y < f x
z < g y
h z
Any way you slice it, though, it's pretty simple.
Now, it's perfectly possible to define h
without using monads at all. It
looks like this:
k x = case f x of
Nothing > Nothing
Just y > case g y of
Nothing > Nothing
Just z > h z
This shows why the Maybe
monad is important: it drastically cuts down on the
boilerplate code we have to write to chain Maybe
producing functions together.
Imagine how grungy the nonmonadic code would be if there were ten Maybe
monadic functions in a row being chained together — it would be indented
so far to the left that you wouldn't be able to read it, and the overall
structure of the computation would be lost in a maze of case
statements. But
with monads, you could write it like this:
f11 = f1 >=> f2 >=> f3 >=> f4 >=> f5 >=> f6 >=> f7 >=> f8 >=> f9 >=> f10
or (using >>=
):
f11 x = f1 x >>= f2 >>= f3 >>= f4 >>= f5 >>= f6 >>= f7 >>= f8 >>= f9 >>= f10
So monads have made composition of monadic functions just as easy as composition of regular (nonmonadic) functions is.
The Maybe
monad is very useful for illustrating basic concepts, but it may
be confusing in one sense: many people mistakenly believe that the only
reason for having monads in Haskell is to handle nonfunctional computations
i.e. ones that do (file or terminal) I/O, alter global variables, etc. And
yet, here I showed you a monadic computation which can be done perfectly well
without monads. In this case, monads are not essential; they're just very
convenient. And that's why I said that even though the original reason for
adding monads to Haskell had to do with dealing with inherently
nonfunctional kinds of computations (like computations involving I/O), they
turned out to have a far greater applicability. That's why they're neat.
Now, on to our next monad.
The list monad
If you liked the Maybe
monad, you're going to love the list monad ;) We're
going to be filling in this instance definition:
instance Monad [] where
(>>=) = { definition of >>= for lists }
return = { definition of return for lists }
Note that we use the empty list []
to represent the list type constructor.
This is a bit of a hack (lists have special syntactic support in Haskell).
Whatever.
As with all monads, the first task is to be clear about what the monad's
monadic function represents. For the list monad, a monadic function f
looks like this:
f :: a > [b]
(where, of course, the notation [b]
means "list of b
"). Recall that the
generic definition of a monadic function is this:
f :: a > m b
for some monad m
which has to be a type constructor. Lists are a plausible
candidate for a monad because "list of" is a type constructor (even though
the syntax is hardwired into Haskell); if we wanted to, we could even define
a list datatype ourselves:
data List a = Nil  Cons a (List a)
and then the monadic function would look like this:
f :: a > List b
We'll stick to the standard list syntax from now on.
What do functions of this sort represent? The normal way of thinking about them
is that they take in an input value of type a
, and produce a bunch of output
values of type b
, collected in one convenient container (the list). (Here
again, we have a monad that seems to be a container.) A different way of
thinking about functions of this type is that they represent functions with
multiple return values i.e. functions which can somehow return a bunch of
different values "all at once". (I don't want to say "in parallel" because that
implies simultaneous processing which isn't going on here.) The multiple return
values are just the elements of the list. This turns out to be a useful
perspective when you have multiple functions of this type, for instance:
f :: Int > [Int]
g :: Int > [Int]
Here, f
and g
both take in a single Int
as an argument and return a bunch
of Int
s as results. What if we wanted to take every result from f
and apply
g
individually to each result, collecting all the results together? It would
be great if we could do this directly, without having to unpack every element
from the list of results that f
returns and apply g
to each one. This is
what the list monad will allow us to do.
Let's flesh this out a bit with some example functions:
f :: Int > [Int]
f x = [x1, x, x+1]
g :: Int > [Int]
g x = [x, x]
How would we "compose" these two functions? f x
returns a list, so to apply
g
to each element of the list we will need the map
function:
f 10 > [9, 10, 11]
map g (f 10) > [[9, 9], [10, 10], [11, 11]]
This new result may be interesting, but it can't be the composition of f
and
g
because it doesn't have the same type (it's a list of lists of Int
s, not a
list of Int
s). To turn it into a simple list of Int
s, we can flatten the
list of lists using the concat
function (which just concatenates a list of
lists into a single list):
 N.B. concat has the type [[a]] > [a]
concat (map g (f 10)) > [9, 9, 10, 10, 11, 11]
What this result represents is the collection of all the results obtained by
applying f
to an integer input and then applying g
to one of the results.
If you think of f
and g
as functions which produce multiple results "all at
once", then this result is just the collection of all possible results of
applying f
and then g
. Diagrammatically, we can represent this as:
g  9
 9 > 
  9

f  g  10
10 >  10 > 
  10

 g  11
 11 > 
 11
This shows that the composition of f
and g
is the collection of all paths
through the f
and g
functions.
Interestingly enough, we've just determined what the >>=
operator has to be
for the list monad! It's defined as follows:
 mv :: [a]
 g :: a > [b]
mv >>= g = concat (map g mv)
where mv
is a monadic value in the list monad (which is just a list of values
of type a
). In the previous example, mv
is just the result of evaluating f
10
and g
is as it was before. This definition even works if mv
is the
empty list []
, since mapping a function over the empty list just returns the
empty list, and concat
applied to the empty list returns the empty list as
well. So the definition of the >>=
operator for this monad is very simple.
[Note for GHC fans: The GHC Haskell compiler actually implements >>=
for the
list monad in a slightly different way which is (I believe) more efficient but
which gives the same results.]
How would we define return
in this monad? Let's think of the list monadic
value as an "action" returning various values. Let's also recall that return
should be the equivalent of the identity function in any given monad. What
would be the equivalent of the identity function in the list monad? It would
take an input value and return an "action" which simply returns that value when
"executed". So we know that return
can't simply return the empty list, for
instance. A reasonable possibility is to define return
as follows:
return :: a > [b]
return x = [x]
In other words, return
just creates a list with a single value. Let's see if
this is plausible by seeing if it obeys the monad laws:
 f :: a > [b]
 x :: a
return x >>= f = concat (map f (return x))  definition of >>=
= concat (map f [x])  definition of return
= concat [f x]  definition of map
= f x  definition of concat
 correct according to monad law 1
 mv :: [a]
mv >>= return = concat (map return mv)  definition of >>=
= concat (map (\x > [x]) mv)  definition of return
 Two cases:
 Case 1: mv == []
= concat (map (\x > [x]) [])  definition of mv
= concat []  definition of map
= []  definition of concat
= mv  definition of mv
 Case 2: mv == [v1, v2, ...]
= concat (map (\x > [x]) [v1, v2, ...])  definition of mv
= concat [[v1], [v2], ...]  definition of map
= [v1, v2, ...]  definition of concat
= mv  definition of mv
 correct according to monad law 2
OK, this works. You might want to try other possible definitions for return
in the list monad (e.g. where return
returns a list which contains 0, 2, 3,
or an infinite number of copies of its argument), and find out why they violate
the monad laws. This is a good way to get practice using the monad laws.
To prove that our list monad is really a monad, we still have to show that monad law 3 is obeyed. This will be a lot harder, but let's do it anyway. It'll turn out to be easier to do if we use the nice version of monad law 3 (the one involving monadic composition). To start, we need to define monadic composition in the list monad:
 Monad law 3 (nice version):
(f >=> g) >=> h = f >=> (g >=> h)
 By definition:
f >=> g = \x > f x >>= g
 In the list monad, using the definition of >>= for lists:
f >=> g = \x > concat (map g (f x))
 Using the composition operator (.), this can be written as:
f >=> g = concat . map g . f
In addition, I'll be using several identities involving map
and concat
applied to lists. You should just take these on faith for now, though I'll show
how to derive them below.
 equation 1:
map (f . g) = map f . map g
 equation 2:
map f . concat = concat . map (map f)
 equation 3:
concat . concat = concat . map concat
As I mentioned above, .
refers to the (pure) function composition operator.
It has lower precedence than function application (via concatenation), so an
expression like map f . map g
actually means (map f) . (map g)
. Haskell
programmers usually prefer to leave out unnecessary parentheses when possible.
It's also important to note that e.g. map f
is a function even though map
normally takes two arguments (a function and a list to map the function over).
If you recall what I said about currying previously, then you'll understand that
map f
is a function which takes a list as its only argument and returns a new
list, which what you get when you apply the function f
to all the elements of
the original list and collect the results into a new list. We'll be using
currying a lot below.
Given all that, here's the derivation:
(f >=> g) >=> h
= (concat . map g . f) >=> h  definition of >=>
= concat . map h . (concat . map g . f)  definition of >=>
= concat . map h . concat . map g . f  remove unnecessary parentheses
f >=> (g >=> h)
= f >=> (concat . map h . g)  definition of >=>
= concat . map (concat . map h . g) . f  definition of >=>
= concat . map ((concat . map h) . g) . f  equivalent
= concat . (map (concat . map h)) . (map g) . f  equation 1
= concat . map (concat . map h) . map g . f  remove unnecessary parentheses
= concat . map concat . map (map h) . map g . f  equation 1
So we need to show that:
concat . map h . concat . map g . f = concat . map concat . map (map h) . map g . f
which will be true if we can show that:
concat . map h . concat = concat . map concat . map (map h)
Let's do that:
 add parentheses for clarity:
concat . (map h . concat) = concat . map concat . map (map h)
 use equation 2:
concat . concat . map (map h) = concat . map concat . map (map h)
 add parentheses for clarity:
(concat . concat) . map (map h) = concat . map concat . map (map h)
 use equation 3:
concat . map concat . map (map h) = concat . map concat . map (map h)
and we're done. Whew! In fact, Haskell programmers very rarely do derivations of this kind, but they are necessary to show that a purported monad really is a monad.
ASIDE: Deriving the
map
/concat
identities (equations 1, 2, and 3)Preliminaries:
To prove these identities, we will need to first prove a few simpler ones (math is hard!). They are:
 Equation 4: concat (x:xs) = x ++ concat xs  Equation 5: concat (x ++ y) = concat x ++ concat y  Equation 6: map f (x ++ y) = map f x ++ map f y
Equation 4 follows from the definition of
concat
. Equation 5 is easily proved by induction on the length ofx
and using equation 4: base case: x is an empty list concat ([] ++ y) = concat [] ++ concat y concat y = [] ++ concat y  definition of concat [] concat y = concat y  definition of ++  OK  inductive case: x is nonempty; call its parts x1 (head) and xs (tail) concat ((x1:xs) ++ y) = concat (x1 : (xs ++ y))  definition of ++ = x1 ++ concat (xs ++ y)  equation 4 = x1 ++ concat xs ++ concat y  inductive hypothesis concat (x1:xs) ++ concat y = x1 ++ concat xs ++ concat y  equation 4  OK, so Q.E.D.
Equation 6 is proved similarly:
 base case: x is an empty list map f ([] ++ y) = map f [] ++ map f y map f y = [] ++ map f y map f y = map f y  OK  inductive case: x is nonempty; call its parts x1 (head) and xs (tail) map f (x ++ y) = map f ((x1:xs) ++ y) = map f (x1 : (xs ++ y))  definition of ++ = f x1 : map f (xs ++ y)  definition of map = f x1 : (map f xs ++ map f y)  inductive hypothesis = (f x1 : map f xs) ++ map f y  definition of ++ = map f (x1:xs) ++ map f y  definition of map = map f x ++ map f y  definition of x  OK, so Q.E.D.
With that out of the way, let's get to the proofs of equations 1, 2, and 3.
Equation 1:
map (f . g) = map f . map g
Use induction on the length of the implicit list argument on both sides, as well as the definition of
map
: base case: empty list map (f . g) [] = [] (map f . map g) [] = map f (map g []) = map f [] = []  OK  inductive case: nonempty list map (f . g) (x:xs) = ((f . g) x) : (map (f . g) xs)  definition of map = (f (g x)) : (map (f . g) xs)  definition of . (map f . map g) (x:xs) = map f (map g (x:xs))  definition of . = map f ((g x) : (map g xs))  definition of map = (f (g x)) : (map f (map g xs))  definition of map = (f (g x)) : ((map f . map g) xs)  definition of . = (f (g x)) : (map (f . g) xs)  inductive hypothesis  OK, so Q.E.D.
Equation 2:
map f . concat = concat . map (map f)
Use induction on the length of the list argument:
 base case: empty list (map f . concat) [] = map f (concat []) = map f [] = [] (concat . map (map f)) [] = concat (map (map f) []) = concat [] = []  OK  inductive case: nonempty list (map f . concat) (x:xs) = map f (concat (x:xs))  definition of . = map f (x ++ concat xs)  equation 4 = map f x ++ (map f (concat xs))  equation 6 = map f x ++ ((map f . concat) xs)  definition of . = map f x ++ ((concat . map (map f)) xs)  inductive hypothesis = map f x ++ concat (map (map f) xs)  definition of . (concat . map (map f)) (x:xs) = concat (map (map f) (x:xs))  definition of . = concat (map f x : map (map f) xs)  definition of map = map f x ++ concat (map (map f) xs)  equation 4  OK, so Q.E.D.
Equation 3:
concat . concat = concat . map concat
As usual, use induction on the length of the list argument:
 base case: empty list (concat . concat) [] = concat (concat []) = concat [] = [] (concat . map concat) [] = concat (map concat []) = concat [] = []  OK  inductive case: nonempty list (concat . concat) (x:xs) = concat (concat (x:xs))  definition of . = concat (x ++ concat xs)  equation 4 = concat x ++ concat (concat xs)  equation 5 (concat . map concat) (x:xs) = concat (map concat (x:xs))  definition of . = concat (concat x : map concat xs)  definition of map = concat x ++ concat (map concat xs)  equation 4 = concat x ++ (concat . map concat) xs  definition of . = concat x ++ (concat . concat) xs  inductive hypothesis = concat x ++ concat (concat xs)  definition of .  OK, so Q.E.D.
By now, I hope you're fully convinced that the list monad is in fact a monad ;)
A much more interesting question, of course, is this: what can we do with the list monad that was hard to do before? Here's a simple example: find all pairs of numbers between 1 and 6 that sum to the number 7 (the numbers might represent e.g. dice rolls). With the list monad, this is trivial:
 Using donotation:
do n1 < [1..6]
n2 < [1..6]
if n1 + n2 == 7 then return (n1, n2) else []
 Result: [(1,6),(2,5),(3,4),(4,3),(5,2),(6,1)]
This can also be written without the do
notation, though it's less clear:
[1..6] >>= \n1 >
[1..6] >>= \n2 >
if n1 + n2 == 7 then return (n1, n2) else []
How does this work? You should really sit down and work this out for yourself
given the definitions of >>=
and return
for lists, but here's a handwavy
explanation. Note that [1..6]
is a monadic value in the list monad, so it
delivers all of its values to n1
(one at a time, of course), and then again
to n2
, and then if the sum is correct all the pairs (n1, n2)
will be
returned. So we're doing a computation on all elements of a list as easily as
if it were computations on only single elements. That's what the list monad
buys you.
If you've done any significant amount of Haskell programming, alarm bells are going off in your head right about now. "But wait!" I hear you cry. "Can't you just use a list comprehension to do this?" Indeed you can, and it looks like this:
[(n1, n2)  n1 < [1..6], n2 < [1..6], n1 + n2 == 7]
The list monad can in fact do anything that list comprehensions can do. Which syntax one prefers is a matter of taste and may also depend on the particular problem. In fact, Phil Walder, in the paper Comprehending Monads (there's a pun in that title, as there are in most of Phil's paper titles) suggested that the list comprehension syntax be extended to arbitrary monads. That proposal was eventually rejected in favor of the current notation.
The list monad is more than just an alternate version of list comprehensions,
though. For one thing, there are many very generic functions that can work on
instances of any monad; having a list monad means that those functions can work
on lists as well. Furthermore, there is an extension to the Monad
type class
called MonadPlus
which adds extra functionality to monads (specifically, it
defines a "zero" element for a monad as well as a way of "adding" two monadic
values). Lists turn out to be an instance of MonadPlus
as well as of Monad
,
and this means that generic functions that work on all instances of MonadPlus
will also work on lists. (For instance, there is a generalization of the
concat
function called msum
which works for any instance of MonadPlus
,
including lists.) It's better to have generic functions that can work on large
numbers of datatypes than to have to define new functions for each particular
datatype, so this is a clear win.
Next time
In the next installment I'll look at monads to deal with error handling.