Wed 16 Jul 2008
A Sort of Difference
Posted by Edward Kmett under Haskell , Data Structures , Algorithms[12] Comments
[Edit: My apologies to mfp of eigenclass; my original analysis was flawed. I've restructured this to serve as an introduction to difference lists.]
Recently there was a post on eigenclass that was picked up by programming.reddit wherein the author performed an analysis of the classic Haskell quicksort example.
Bowing to Lennart's biases I'll admit the Haskell quicksort is not exactly the same thing and refer to it as "quicksort" in somewhat patronizing quotes hereafter.
import Data.List (partition) qsort :: Ord a => [a] -> [a] qsort [] = [] qsort (a:as) = qsort ls ++ [a] ++ qsort rs where (ls,rs) = partition (< = a) as
I've slightly modified the above to use partition rather than use filter twice, but its the same thing just with a single traversal -- unfortunately my blog inserts a space in the <=.
Profiling the following will see
performance.
reverse (a:as) = reverse as ++ [a] reverse [] = []
T(0) = 1 T(n) = T(n - 1) + n - 1 + 1 = T (n - 1) + n T(n) = O(n^2)
It is also fairly well known that you can convert the naive reverse to a tail recursive form and get a linear time list reversal.
reverse as = reverse' as [] where reverse' (a:as) xs = reverse' as (a:xs) reverse' [] xs = xs
Here we can clearly see that we pattern match once and cons once per time through reverse'.
T(0) = 2 T(n) = T(n - 1) + 2 T(n) = O(n)
One way to think about this implementation is to think about it in terms of a difference list. Don Stewart has a nice library for difference lists, which uses a newtype but I'll stick to a type here to avoid syntactic noise.
type DList a = [a] -> [a]
To see that connection, lets rewrite this point-free and highlight the result type by applying the above type alias.
reverse as = reverse' as [] where reverse' :: [a] -> ([a] -> [a]) reverse' (a:as) = reverse' as . (a:) reverse' [] = id
With the above we can see that:
reverse' :: [a] -> DList a
Motivated by the above, we can see that a difference list is just a function that accepts a list which it will use as its tail. So a difference list is a slightly constrained type of function from [a] -> [a].
Now, if we look at the type of
singleton :: a -> DList a singleton = (:)
we can see that (a:) is already difference list. However, [] is not.
We can represent the empty list as a difference list with:
nil :: DList a nil = id
Now in this representation we can append difference lists by just using (.):
append :: DList a -> DList a -> DList a append = (.)
and you can convert from a difference list to a list by just applying it to [].
So if these are so good, why don't we use them everywhere?
1. Well, the amortization schedule for the costs you incur while tearing down a difference list is different than for a traditional list, so you pay prices at different times.
2. The type is 'too large' in that it doesn't adequately capture the constraint that the function must use the list its given and must append it to its output and can't use the list in any other ways.
3. Difference lists do not generate thunks that remember their contents after being forced the first time. However, if all you are going to do when you are done is convert the overall result back to a normal list, then you can get that list to get the memoization-like thunking effect from the result list.
So with these in hand, we can easily see reverse' as taking a list and returning a difference list, and then we are just converting it back to a normal list using the fairly standard worker/wrapper pattern.
reverse as = reverse' as [] where reverse' :: [a] -> DList a reverse' (a:as) = reverse' as `append` singleton a reverse' [] = nil
Ok. So recalling qsort,
qsort [] = [] qsort (a:as) = qsort ls ++ [a] ++ qsort rs where (ls,rs) = partition (< =a) as
Lets apply the same transformation that we applied to reverse to the well known quicksort example, transforming its output to a difference list.
qsort' :: Ord a => [a] -> DList a qsort' [] = nil qsort' (a:as) = qsort' ls `append` singleton a `append` qsort' rs where (ls,rs) = partition (< =a) as
And after inlining the very succinct definitions and transforming the output back to a list we get a visibly similar worker/wrapper combo:
qsort :: Ord a => [a] -> [a] qsort as = qsort' as [] where qsort' [] = id qsort' (a:as) = qsort' ls . (a:) . qsort' rs where (ls,rs) = partition (< =a) as
However, we now get better performance.

Green is the time taken by the difference list version. Blue is the traditional Haskell qsort.

is denoted with "lenses"
. When the functor F can be determined unambiguously, it is usually written
or ana
.
is the final
-coalgebra for some endofunctor
.
x = ana psi
fmap y . psi = outF . y => x = y
-Algebra
are commonly denoted using "barbed wire" as
or as para f. Uustalu and Vene [2] use the notation
so that apomorphisms, the categorical dual of paramorphisms, can have a symmetric notation.
and
we have
. [2, p9, Lemma 2]
, the following diagram commutes:
is denoted
. When the functor F can be determined unambiguously, it is usually written
or cata
.
is the initial
.




.
:
is said to be
if it is naturally isomorphic to
.
with:
with a left adjoint
is represented by (FX, ηX(•)) where X = {•} is a singleton set and η is the unit of the adjunction.
, we can translate the remainder quite easily: