July 2008


[Edit: My apologies to mfp of eigenclass; my original analysis was flawed. I've restructured this to serve as an introduction to difference lists.]

Recently there was a post on eigenclass that was picked up by programming.reddit wherein the author performed an analysis of the classic Haskell quicksort example.

Bowing to Lennart's biases I'll admit the Haskell quicksort is not exactly the same thing and refer to it as "quicksort" in somewhat patronizing quotes hereafter.

 
import Data.List (partition)
 
qsort :: Ord a => [a] -> [a]
qsort [] = []
qsort (a:as) = qsort ls ++ [a] ++ qsort rs where
    (ls,rs) = partition (< = a) as
 

I've slightly modified the above to use partition rather than use filter twice, but its the same thing just with a single traversal -- unfortunately my blog inserts a space in the <=.

Profiling the following will see $\mathcal{O}(n^2)$ performance.

 
reverse (a:as) = reverse as ++ [a]
reverse [] = []
 
 
T(0) = 1
T(n) = T(n - 1) + n - 1 + 1 = T (n - 1) + n
T(n) = O(n^2)
 

It is also fairly well known that you can convert the naive reverse to a tail recursive form and get a linear time list reversal.

 
reverse as = reverse' as [] where
    reverse' (a:as) xs = reverse' as (a:xs)
    reverse' [] xs = xs
 

Here we can clearly see that we pattern match once and cons once per time through reverse'.

 
T(0) = 2
T(n) = T(n - 1) + 2
T(n) = O(n)
 

One way to think about this implementation is to think about it in terms of a difference list. Don Stewart has a nice library for difference lists, which uses a newtype but I'll stick to a type here to avoid syntactic noise.

 
type DList a = [a] -> [a]
 

To see that connection, lets rewrite this point-free and highlight the result type by applying the above type alias.

 
reverse as = reverse' as [] where
    reverse' :: [a] -> ([a] -> [a])
    reverse' (a:as) = reverse' as . (a:)
    reverse' [] = id
 

With the above we can see that:

 
reverse' :: [a] -> DList a
 

Motivated by the above, we can see that a difference list is just a function that accepts a list which it will use as its tail. So a difference list is a slightly constrained type of function from [a] -> [a].

Now, if we look at the type of

 
singleton :: a -> DList a
singleton = (:)
 

we can see that (a:) is already difference list. However, [] is not.

We can represent the empty list as a difference list with:

 
nil :: DList a
nil = id
 

Now in this representation we can append difference lists by just using (.):

 
append :: DList a -> DList a -> DList a
append = (.)
 

and you can convert from a difference list to a list by just applying it to [].

So if these are so good, why don't we use them everywhere?

1. Well, the amortization schedule for the costs you incur while tearing down a difference list is different than for a traditional list, so you pay prices at different times.
2. The type is 'too large' in that it doesn't adequately capture the constraint that the function must use the list its given and must append it to its output and can't use the list in any other ways.
3. Difference lists do not generate thunks that remember their contents after being forced the first time. However, if all you are going to do when you are done is convert the overall result back to a normal list, then you can get that list to get the memoization-like thunking effect from the result list.

So with these in hand, we can easily see reverse' as taking a list and returning a difference list, and then we are just converting it back to a normal list using the fairly standard worker/wrapper pattern.

 
reverse as = reverse' as [] where
    reverse' :: [a] -> DList a
    reverse' (a:as) = reverse' as `append` singleton a
    reverse' [] = nil
 

Ok. So recalling qsort,

 
qsort [] = []
qsort (a:as) = qsort ls ++ [a] ++ qsort rs where
    (ls,rs) = partition (< =a) as
 

Lets apply the same transformation that we applied to reverse to the well known quicksort example, transforming its output to a difference list.

 
qsort' :: Ord a => [a] -> DList a
qsort' [] = nil
qsort' (a:as) = qsort' ls `append` singleton a `append` qsort' rs where
    (ls,rs) = partition (< =a) as
 

And after inlining the very succinct definitions and transforming the output back to a list we get a visibly similar worker/wrapper combo:

 
qsort :: Ord a => [a] -> [a]
qsort as = qsort' as [] where
    qsort' [] = id
    qsort' (a:as) = qsort' ls . (a:) . qsort' rs where
        (ls,rs) = partition (< =a) as
 

However, we now get better performance.

Performance

Green is the time taken by the difference list version. Blue is the traditional Haskell qsort.


[Link]

To embed the Haskell Café on a web page:

<iframe src='http://embed.lively.com/iframe?rid=-4485567674160322075'
    width='460' height='400'
    marginwidth='0' marginheight='0'
    frameborder='0' scrolling='no'>
</iframe>

Description
Anamorphisms are generalizations of Haskell's unfoldr. A anamorphism builds a data structure with an F-coalgebra by recursively unrolling a seed.

History
The name anamorphism comes from the Greek 'ανα-' meaning "upwards". [1,2]

Notation
An anamorphism for some F-coalgebra $(X,\psi)$ is denoted with "lenses" $[\hspace{-.2em}( \psi )\hspace{-.2em}]_F$. When the functor F can be determined unambiguously, it is usually written $[\hspace{-.2em}( \psi )\hspace{-.2em}]$ or ana $\psi$.

Implementation

 
ana :: Functor f => Coalgebra f a -> a -> FixF f
ana g = InF . fmap (ana g) . g
 

Alternate implementations

 
ana g = hylo InF g
 

Duality

An anamorphism is the categorical dual of a catamorphism.

Derivation
If $(\nu F,\mathrm{out}_F)$ is the final $F$-coalgebra for some endofunctor $F$ and $(X,\psi)$ is an $F$-coalgebra, then there is a unique $F$-coalgebra homomorphism from $(X,\psi)$ to $(\nu F,\mathrm{out}_F)$ which we denote $[\!( \psi )\!]_F$.

That is to say, the following diagram commutes:

\bfig \square/>`>`>`>/[X`F X`\nu F`F \nu F;\psi`{[}\!{(}\psi{)}\!{]}`F {[}\!{(}\psi{)}\!{]}`\mathrm{out}_F] \efig

Laws

Rule Haskell
ana-charn fmap x . psi = outF . x $\equiv$ x = ana psi
ana-self fmap (ana psi) . psi = outF . ana psi
ana-id id = ana outF
ana-uniq fmap x . psi = outF . x $\wedge$ fmap y . psi = outF . y => x = y
ana-fusion fmap x . phi = psi . x => ana phi = ana psi . x
ana-compose e :: f :~> g => ana (e . outF) . ana phi = ana (e . phi)

Examples

 
data StreamF a x = a :> x
type Stream a = FixF (StreamF a)
 
instance Functor (StreamF a) where
    fmap f (a :> as) = a :> f as
 
repeat :: a -> Stream a
repeat = ana psi where
    psi n = n :> n
 
from :: Num a => a -> Stream a
from = ana psi where
    psi n = n :> (n + 1)
 

Links
[Haddock] [Source] [Field Guide]

References

[1] E. Meijer, M. Fokkinga, R. Paterson. Functional Programming with Bananas, Lenses, Envelopes and Barbed Wire. Proceedings, 5th ACM Conference on Functional Programming Languages and Computer Architecture.
[2] M. Fokkinga. Law and Order in Algorithmics. PhD Thesis. 1992

I'm off to MSFP'08 (which is colocated with ICALP) and should have a day or two spare on either side of the workshop to let me find stuff to do in Iceland and meet people.

I'd love to get a chance to meet folks in person.

If anyone else is planning on being in the area, please feel free to drop me a line.