Description
Anamorphisms are generalizations of Haskell's unfoldr. A anamorphism builds a data structure with an F-coalgebra by recursively unrolling a seed.

History
The name anamorphism comes from the Greek 'ανα-' meaning "upwards". [1,2]

Notation
An anamorphism for some F-coalgebra $(X,\psi)$ is denoted with "lenses" $[\hspace{-.2em}( \psi )\hspace{-.2em}]_F$. When the functor F can be determined unambiguously, it is usually written $[\hspace{-.2em}( \psi )\hspace{-.2em}]$ or ana $\psi$.

Implementation

 
ana :: Functor f => Coalgebra f a -> a -> FixF f
ana g = InF . fmap (ana g) . g
 

Alternate implementations

 
ana g = hylo InF g
 

Duality

An anamorphism is the categorical dual of a catamorphism.

Derivation
If $(\nu F,\mathrm{out}_F)$ is the final $F$-coalgebra for some endofunctor $F$ and $(X,\psi)$ is an $F$-coalgebra, then there is a unique $F$-coalgebra homomorphism from $(X,\psi)$ to $(\nu F,\mathrm{out}_F)$ which we denote $[\!( \psi )\!]_F$.

That is to say, the following diagram commutes:

\bfig \square/>`>`>`>/[X`F X`\nu F`F \nu F;\psi`{[}\!{(}\psi{)}\!{]}`F {[}\!{(}\psi{)}\!{]}`\mathrm{out}_F] \efig

Laws

Rule Haskell
ana-charn fmap x . psi = outF . x $\equiv$ x = ana psi
ana-self fmap (ana psi) . psi = outF . ana psi
ana-id id = ana outF
ana-uniq fmap x . psi = outF . x $\wedge$ fmap y . psi = outF . y => x = y
ana-fusion fmap x . phi = psi . x => ana phi = ana psi . x
ana-compose e :: f :~> g => ana (e . outF) . ana phi = ana (e . phi)

Examples

 
data StreamF a x = a :> x
type Stream a = FixF (StreamF a)
 
instance Functor (StreamF a) where
    fmap f (a :> as) = a :> f as
 
repeat :: a -> Stream a
repeat = ana psi where
    psi n = n :> n
 
from :: Num a => a -> Stream a
from = ana psi where
    psi n = n :> (n + 1)
 

Links
[Haddock] [Source] [Field Guide]

References

[1] E. Meijer, M. Fokkinga, R. Paterson. Functional Programming with Bananas, Lenses, Envelopes and Barbed Wire. Proceedings, 5th ACM Conference on Functional Programming Languages and Computer Architecture.
[2] M. Fokkinga. Law and Order in Algorithmics. PhD Thesis. 1992

I'm off to MSFP'08 (which is colocated with ICALP) and should have a day or two spare on either side of the workshop to let me find stuff to do in Iceland and meet people.

I'd love to get a chance to meet folks in person.

If anyone else is planning on being in the area, please feel free to drop me a line.

[Edit: re-introduced the domain function]

Recently Andy Gill posted a nice use of data families to memoize a narrow range of values to optimize Conal Elliott's functional linear maps.

I've tweaked Andy's definition slightly below to use a type family for D e instead of a data family, which avoids some of the syntactic noise of the domain function when D e = e, and lets it this be used transparently in some places. This change lacks substance however, and the source code links at the bottom include the code with and without this change.

Since I'm using data and type families, you'll need GHC 6.9.

 
class Applicative ((:~*) e) => NarrowMemo e where
        data (:~*) e :: * -> *
        type D e :: *
        apply :: (e :~* r) -> D e -> r
        memo :: (D e -> r) -> (e :~* r)
        domain :: D e -> e
 
instance NarrowMemo Bool where
        data Bool :~* a = MemoBool a a
        type D Bool = Bool
        apply (MemoBool o1 o2) True = o1
        apply (MemoBool o1 o2) False = o2
        memo f = MemoBool (f True) (f False)
        domain = id
 

We can quickly add in the missing bits for the Bool demo he supplied:

 
instance Functor ((:~*) Bool) where
        fmap f (MemoBool a b) = MemoBool (f a) (f b)
 
instance Applicative ((:~*) Bool) where
        pure x = MemoBool x x
        MemoBool f g  < *> MemoBool a b = MemoBool (f a) (g b)
 

And we can automatically generate an instance of Monad for (:~*)e if we really want to:

 
instance NarrowMemo e => Monad ((:~*) e) where
        return = pure
        m >>= k = memo $ apply (fmap k m) >>= apply -- in (->)e
 

And we can drop in a couple more instances to make it interesting. The tensor product:

 
instance (NarrowMemo a, NarrowMemo b) =>
    NarrowMemo (a,b) where
        data (a,b) :~* e = MemoBoth (a :~* (b :~* e))
        type D (a,b) = (D a, D b)
        apply (MemoBoth f) = uncurry (apply . apply f)
        memo f = MemoBoth $ memo $ \a -> memo (f . (,) a)
        domain = domain *** domain
 
instance (NarrowMemo a, NarrowMemo b) =>
    Functor ((:~*) (a,b)) where
        fmap f (MemoBoth g) = MemoBoth (fmap (fmap f) g)
 
instance (NarrowMemo a, NarrowMemo b) =>
    Applicative ((:~*) (a,b)) where
        pure = MemoBoth . pure . pure
        f < *> a = MemoBoth $
                memo $ \da ->
                memo $ \db ->
                let e = (da, db) in
                apply f e (apply a e)
 

and a disjoint sum, if only so we have some more memo tables to play with.

 
instance (NarrowMemo a, NarrowMemo b) =>
    NarrowMemo (Either a b) where
        data Either a b :~* e = MemoEither (a :~* e) (b :~* e)
        type D (Either a b) = Either (D a) (D b)
        apply (MemoEither l _) (Left a) = apply l a
        apply (MemoEither _ r) (Right b) = apply r b
        memo f = MemoEither (memo (f . Left)) (memo (f . Right))
        domain = domain +++ domain
 
instance (NarrowMemo a, NarrowMemo b) =>
    Functor ((:~*) (Either a b)) where
        fmap f (MemoEither a b) = MemoEither (fmap f a) (fmap f b)
 
instance (NarrowMemo a, NarrowMemo b) =>
    Applicative ((:~*) (Either a b)) where
        pure f = MemoEither (pure f) (pure f)
        MemoEither f g < *> MemoEither a b = MemoEither (f < *> a) (g < *> b)
 

Now, the reason I wanted to play with this was I hacked up a memoizing state-in-context comonad a couple of weeks back using unsafePerformIO and a memo table, but using Andy's trick we can build up a pure memoizing context comonad for smaller or more regular domains.

Recall the state-in-context comonad from category-extras.

 
class Comonad w => ComonadContext s w | w -> s where
        getC :: w a -> s
        modifyC :: (s -> s) -> w a -> a
 
data Context s a = Context (s -> a) s
 
instance ComonadContext s (Context s) where
        getC (Context _ s) = s
        modifyC m (Context f c) = f (m c)
 
instance Functor (Context s) where
        fmap f (Context f' s) = Context (f . f') s
 
instance Copointed (Context s) where
        extract   (Context f a) = f a
 
instance Comonad (Context s) where
        duplicate (Context f a) = Context (Context f) a
 

We can modify the definition to replace (->) with the version above of Andy's (:~*) to obtain a narrowly memoized version:

 
data NarrowContext e a = NarrowContext (e :~* a) (D e)
 
instance NarrowMemo e => Functor (NarrowContext e) where
        fmap f (NarrowContext t d) = NarrowContext (memo (f . apply t)) d
 
instance NarrowMemo e => Copointed (NarrowContext e) where
        extract (NarrowContext t d) = apply t d
 
instance NarrowMemo e => Comonad (NarrowContext e) where
        duplicate (NarrowContext f a) =
                NarrowContext (memo (NarrowContext f)) a
 
instance NarrowMemo e => ComonadContext (D e) (NarrowContext e a) where
        getC (NarrowContext _ e) = e
        modifyC f (NarrowContext g e) = apply g (f e)
 

Now any computation done in this modified state-in-context comonad is memoized and compositions of those computations are also memoized and built up from the memoized intermediate steps.

We should be able to play similar games with a lot of other (co)monads that use exponentials as well.

For instance (:~*) itself can play the role of a narrowly-memoized anonymous exponential comonad.

 
instance (NarrowMemo e, Monoid (D e)) => Copointed ((:~*) e) where
        extract t = apply t mempty
 
instance (NarrowMemo e, Monoid (D e)) => Comonad ((:~*) e) where
        duplicate f = memo $ m -> memo (apply f . mappend m)
 

[ Source Code (using data families only) ]
[ Source Code (using data and type families) ]

Description
A paramorphism is used when a catamorphism won't quite do because you need not only the result of recursing with the F-algebra over the tail of a structure, but you also need the tail itself. This is another way to say that you call in a paramorphism when you need the power of primitive recursion.

History
Lambert Meertens coined the name paramorphism [1]. The name comes from the greek παρα meaning "alongside", the root of parallel.

Notation
A paramorphisms defined in terms of an $F$-$(\mu F \times)$-Algebra $(X,\varphi)$ are commonly denoted using "barbed wire" as ${[}\!\!\langle f \rangle\!\!{]}$ or as para f. Uustalu and Vene [2] use the notation $\langle\!| f |\!\rangle$ so that apomorphisms, the categorical dual of paramorphisms, can have a symmetric notation.

Implementation

 
type Para f = (,) (FixF f)
-- para :: Functor f => GAlgebra f (Para f) a -> FixF f -> a
para :: Functor f => (f (FixF f,a) -> a) -> FixF f -> a
para f = f . fmap (id &&& para f) . outF -- from para-cancel
 

Alternate Implementations

 
para f = fst . cata (f &&& InF . fmap snd) -- para-def
para f = hylo f (fmap dup . outF) where dup = id &&& id
para = zygo InF
 

Laws

Rule Haskell
para-def para phi = fst . cata (phi &&& InF . fmap snd)
para-charn f . InF = phi . fmap (f &&& id) <=> f = para phi
para-cancel para phi . InF = phi . fmap (para phi &&& id)
para-refl para (InF . fmap fst) = id
para-fusion f . phi = psi . fmap (f *** id) => f . para phi = para psi
para-cata cata phi = para (phi . fmap fst)
para-any f = para (f . InF . fmap snd)
para-out para (fmap snd) = outF

Derivation

For any two morphisms $f : \mu F -> X$ and $\varphi : F (X \times \mu F) -> X$ we have $f \circ \mathrm{in}_F = \varphi \circ F \langle f, \mathrm{id}_{\mu F}\rangle \Leftrightarrow f = \mathrm{fst} \circ (\!| \langle \varphi, \mathrm{in}_F \circ F \mathrm{snd} \rangle |\!)$. [2, p9, Lemma 2]

In other words, if we define ${[}\!\!\langle \varphi \rangle\!\!{]} = f$, the following diagram commutes:

\bfig \square/>`>`>`>/[F \mu F`\mu F`F (X \times \mu F)`X;\mathrm{in}_F`F \langle {[}\!\!\langle \varphi \rangle\!\!{]}, id_{\mu F}\rangle`{[}\!\!\langle \varphi \rangle\!\!{]}`\varphi] \efig

Example

 
data NatF a = S a | Z deriving (Eq,Show)
type Nat = FixF NatF
 
instance Functor NatF where
     fmap f Z = Z
     fmap f (S z) = S (f z)
 
plus :: Nat -> Nat -> Nat
plus n = cata phi where
     phi Z = n
     phi (S m) = s m
 
times :: Nat -> Nat -> Nat
times n = cata phi where
     phi Z = z
     phi (S m) = plus n m
 
z :: Nat
z = InF Z
 
s :: Nat -> Nat
s = InF . S
 
fac :: Nat -> Nat
fac = para phi where
     phi Z = s z
     phi (S (n,f)) = times f (s n)
 

Links
[Haddock] [Source] [Field Guide]

References

[1] L. Meertens. Paramorphisms. Formal Aspects of Computing, 4(5):413-424, 1992.
[2] T. Uustalu and V. Vene. Primitive (Co)Recursion and Course-of-Value (Co)Iteration, Categorically. Informatica, 1999 Vol. 10, No. 1, 1-0

Description
Catamorphisms are generalizations of Haskell's foldr. A catamorphism deconstructs a data structure with an F-algebra.

History
The name catamorphism appears to have been chosen by Lambert Meertens [1]. The category theoretic machinery behind these was resolved by Grant Malcolm [2,3], and they were popularized by Meijer, Fokkinga and Paterson [4,5]. The name comes from the Greek 'κατα-' meaning "downward or according to". A useful mnemonic is to think of a catastrophe destroying something.

Notation
A catamorphism for some F-algebra $(X,f)$ is denoted $(\!| f |\!)_F$. When the functor F can be determined unambiguously, it is usually written $(\!| \varphi |\!)$ or cata $\varphi$.

Implementation

 
cata :: Functor f => Algebra f a -> FixF f -> a
cata f = f . fmap (cata f) . outF
 

Alternate implementations

 
cata f = hylo f outF
cata f = para (f . fmap fst)
 

Duality

A catamorphism is the categorical dual of an anamorphism.

Derivation
If $(\mu F,\mathrm{in}_F)$ is the initial $F$-algebra for some endofunctor $F$ and $(X,\varphi)$ is an $F$-algebra, then there is a unique $F$-algebra homomorphism from $(\mu F,\mathrm{in}_F)$ to $(X,\varphi)$ which we denote $(\!| \varphi |\!)_F$.

That is to say, the following diagram commutes:

\bfig \square/>`>`>`>/[F \mu F`\mu F`F X`X;\mathrm{in}_F`F {(}\!{|}\varphi{|}\!{)}`{(}\!{|}\varphi{|}\!{)}`\varphi] \efig

Laws

Rule Categorically Haskell
cata-cancel ${(}\!{|}\varphi{|}\!{)}_F \circ \mathrm{in}_F = \varphi \circ F {(}\!{|}\varphi{|}\!{)}_F$ cata phi . InF = phi . fmap (cata phi)
cata-refl ${(}\!{|}\mathrm{in}_F{|}\!{)}_F = \mathrm{id}_{\mu F}$ cata InF = id
cata-fusion $\inference{f \circ \varphi = \varphi \circ F f}{f \circ {(}\!{|}\varphi{|}\!{)}_F = {(}\!{|}\varphi{|}\!{)}_F}$ f . phi = phi . fmap f =>
f . cata phi = cata phi
cata-compose $\inference{\varepsilon : F \stackrel{\centerdot}{->} G}{(\!| \varphi |\!)_G \circ (\!|\mathrm{in}_G \circ \varepsilon|\!)_F = (\!|\varphi \circ \varepsilon|\!)_F}$ eps :: f :~> g =>
cata phi . cata (In . eps) =
cata (phi . eps)


Examples

 
data StrF x = Cons Char x | Nil
type Str = FixF StrF
 
instance Functor StrF where
    fmap f (Cons a as) = Cons a (f as)
    fmap f Nil = Nil
 
length :: Str -> Int
length = cata phi where
    phi (Cons a b) = 1 + b
    phi Nil = 0
 
data NatF a = S a | Z deriving (Eq,Show)
type Nat = FixF NatF
 
instance Functor NatF where
     fmap f Z = Z
     fmap f (S z) = S (f z)
 
plus :: Nat -> Nat -> Nat
plus n = cata phi where
     phi Z = n
     phi (S m) = s m
 
times :: Nat -> Nat -> Nat
times n = cata phi where
     phi Z = z
     phi (S m) = plus n m
 
z :: Nat
z = InF Z
 
s :: Nat -> Nat
s = InF . S
 

Links
[Haddock] [Source] [Field Guide]

References

[1] L. Meertens. First Steps towards the theory of Rose Trees. Draft Report, CWI, Amsterdam, 1987.
[2] G. Malcolm. PhD Thesis. University of Gronigen, 1990.
[3] G. Malcolm. Data structures and program transformation. Science of Computer Programming, 14:255--279, 1990.
[4] E. Meijer. Calculating Compilers, Ph.D thesis, Utrecht State University 1992.
[5] E. Meijer, M. Fokkinga, R. Paterson, Functional Programming with Bananas, Lenses, Envelopes and Barbed Wire. Proceedings, 5th ACM Conference on Functional Programming Languages and Computer Architecture.

I talk an awful lot about different recursion schemes in this blog.

What I want to do for the next few posts is taxonomize a list of the general recursion schemes I have seen, cite the original references for them where available, and provide a concise implementation of each and try to motivate the connections between them. I'll likely go back through and add more information or post motivating examples for each as I work.

The following recursion schemes can be found in category-extras, along with variations on the underlying themes, so this should work as a punch-list. I'll update this post with links as I go.

Folds
Scheme Code Description
catamorphism Cata tears down a structure level by level
paramorphism*† Para tears down a structure with primitive recursion
zygomorphism*† Zygo tears down a structure with the aid of a helper function
histomorphism† Histo tears down a structure with the aid of the previous answers it has given.
prepromorphism*† Prepro tears down a structure after repeatedly transforming it
Unfolds
Scheme Code Description
anamorphism Ana builds up a structure level by level
apomorphism*† Apo builds up a structure opting to return a single level or an entire branch at each point
futumorphism† Futu builds up a structure multiple levels at a time
postpromorphism*† Postpro builds up a structure and repeatedly transforms it
Refolds
Scheme Code Description
hylomorphism† Hylo builds up and tears down a virtual structure
chronomorphism† Chrono builds up a virtual structure with a futumorphism and tears it down with a histomorphism
synchromorphism Synchro a high level transformation between data structures using a third data structure to queue intermediate results
exomorphism Exo a high level transformation between data structures from a trialgebra to a bialgebra
metamorphism Erwig a hylomorphism expressed in terms of bialgebras
metamorphism Gibbons A fold followed by an unfold; change of representation
dynamorphism† Dyna builds up a virtual structure with an anamorphism and tears it down with a histomorphism
Elgot algebra Elgot builds up a structure and tears it down but may shortcircuit the process during construction
Elgot coalgebra Elgot builds up a structure and tears it down but may shortcircuit the process during deconstruction

* This gives rise to a family of related recursion schemes, modeled in category-extras with distributive law combinators.
† The scheme can be generalized to accept one or more F-distributive (co)monads.

(more...)

As you may recall, every functor in Haskell is strong, in the sense that if you provided an instance of Monad for that functor the following definition would satisfy the requirements mentioned here:

 
strength :: Functor f => a -> f b -> f (a,b)
strength = fmap . (,)
 

In an earlier post about the cofree comonad and the expression problem, I used a typeclass defining a form of duality that enables you to let two functors annihilate each other, letting one select the path whenever the other offered up multiple options. To have a shared set of conventions with the material in Zipping and Unzipping Functors, I have since remodeled that class slightly:

 
class Zap f g | f -> g, g -> f where
	zapWith :: (a -> b -> c) -> f a -> g b -> c
	zap :: f (a -> b) -> g a -> b
	zap = zapWith id
 

Interestingly, we can use the fact that every functor in Haskell is strong to derive not only one instance of Zap, but two, given any Adjunction $f \dashv g$.

I'll give one instance here:

 
zapWithAdjunctionGF :: Adjunction g f =>
    (a -> b -> c) -> f a -> g b -> c
zapWithAdjunctionGF f a b =
    uncurry (flip f) . counit . fmap (uncurry (flip strength)) $
    strength a b
 

And I'll leave the substantially similar derivation of the following to the reader:

 
zapWithAdjunctionFG :: Adjunction f g => (a -> b -> c) -> f a -> g b -> c
 

Now, we would like to do the same thing we did with Representable last time, and just require the user of the Adjunction class to provide us with more instances, something like:

 
class (Zap f g, Zap g f, Representable g (f Void), Functor f) =>
    Adjunction f g | f -> g, g -> f where
        ...
 

But there is a problem: Adjunction composition.

If you will recall from before, we were able to define an instance for an Adjunction for a composition of two adjunctions:

 
newtype O f g a = Compose { decompose :: f (g a) }
 
instance (Functor f, Functor g) => Functor (f `O` g) where
        fmap f = Compose . fmap (fmap f) . decompose
 
instance (Adjunction f1 g1, Adjunction f2 g2) =>
    Adjunction (f2 `O` f1) (g1 `O` g2) where
        counit =
               counit .
               fmap (counit . fmap decompose) .
               decompose
        unit =
               Compose .
               fmap (fmap Compose . unit) .
               unit
 

The problem is that we have three different ways to build an instance of Zap for this composition and we would need to define two of them that conflict!

We would need both of these, which would lead to ambiguous instance heads:

 
instance (Adjunction f1 g1, Adjunction f2 g2) =>
    Zap (f2 `O` f1) (g1 `O` g2) where
        zapWith = zapWithAdjunctionFG
instance (Adjunction f1 g1, Adjunction f2 g2) =>
    Zap (g1 `O` g2) (f2 `O` f1) where
        zapWith = zapWithAdjunctionGF
 

Furthermore, we can also define a third instance of Zap over composition, which doesn't even care about Adjunction, which also conflicts with the above:

 
instance (Zap f g, Zap h k) => Zap (f `O` h) (g `O` k) where
    zapWith f a b =
        zapWith (zapWith f) (decompose a) (decompose b)
 

We could use the standard Haskell trick of making different compositions based on which instance of Zap you want to support, but the combinatorial explosion of constructors here when combined with the other reasons you may want to compose a pair of functors leads to a bit of absurdity, especially since I'm using it to capture a relationship no one cares about.

Consequently, category-extras does not capture this constraint.

Cozapping

As a final aside, we noted previously that Traversable functors were costrong. If a strong Adjunction gives rise to a couple of instances of Zap, we'd expect a similar relationship between a notion of Cozap and a costrong Adjunction.

But what would cozapping be?

First lets take a step back and break down zipWith into a couple of steps. If we note that zapWith (,) looks like:

 
prezapAdjunctionGF :: Adjunction g f => f a -> g b -> (a,b)
prezapAdjunctionGF a b =
    swap . counit . fmap (uncurry strength . swap) $ strength a b
    where swap ~(a,b) = (b,a)
 

Whereupon we can run the output through the canonical eval morphism for exponentials in $\mathbf{Hask}$:

 
eval :: (a -> b, a) -> b
eval (f,a) = f a
 

We can run everything backwards (modulo the noise caused by currying) in the first definition above and get:

 
precozapAdjunctionFG ::
    (Adjunction f g, Traversable f, Traversable g) =>
    Either a b -> Either (f a) (g b)
precozapAdjunctionFG =
    costrength . fmap (swap . costrength) . unit . swap
 

However, we lack a coeval morphism for coexponentials, since $\mathbf{Hask}$ lacks coexponentials -- with good reason! If $\mathbf{Hask}$ was a co-CCC then it would degenerate to a rather boring poset.

But that said, even getting this far, how many adjunctions are there between Traversable functors in Haskell, really?

I've had a few people ask me questions about Adjunctions since my recent post and a request for some more introductory material, so I figured I would take a couple of short posts to tie Adjunctions to some other concepts.

Representable Functors

A covariant functor $F : \mathcal{C} -> \mathbf{Set}$ is said to be representable by an object $x \in \mathcal{C}$ if it is naturally isomorphic to $\mathbf{Hom}_C(x,-)$.

We can translate that into Haskell, letting $\mathbf{Hask}$ play the role of $\mathbf{Set}$ with:

 
class Functor f => Representable f x where
    rep :: (x -> a) -> f a
    unrep :: f a -> (x -> a)
 
{-# RULES
"rep/unrep" rep . unrep = id
"unrep/rep" unrep . rep = id
 #-}
 

It is trivial to show that any two representations of a given functor must be isomorphic, and that there is a natural isomorphism between any two functors with the same representation, so we could strengthen the signature of the type class above by adding a pair of functional dependencies: f -> x, x -> f, but lets work without this straightjacket for now.

Example

We can represent the anonymous reader monad with its environment.

 
instance Representable ((->)x) x where
    rep = id
    unrep = id
 

Example

We could adopt the pleasant fiction that () has a single inhabitant to avoid bringing in an empty type, but lets do this correctly. Clearly the Identity functor needs no extra information from its representation.

 
data Void {- you'll need EmptyDataDecls -}
 
instance Representable Identity Void where
    rep f = Identity (f undefined)
    unrep = const . runIdentity
 

Adjunctions

If you recall the definition of Adjunctions over $\mathbf{Hask}$ from before:

 
class (Functor f, Functor g) =>
    Adjunction f g | f -> g, g -> f where
        unit   :: a -> g (f a)
        counit :: f (g a) -> a
        leftAdjunct  :: (f a -> b) -> a -> g b
        rightAdjunct :: (a -> g b) -> f a -> b
 
        unit = leftAdjunct id
        counit = rightAdjunct id
        leftAdjunct f = fmap f . unit
        rightAdjunct f = counit . fmap f
 

We can generate a lot of representable functors, by turning to the theorem mentioned in the Wikipedia article about the representability of a right adjoint in terms of its left adjoint wrapped around a singleton element:

Any functor $K : \mathcal{C} -> \mathbf{Set}$ with a left adjoint $F : \mathbf{Set} -> \mathcal{C}$ is represented by (FX, ηX(•)) where X = {•} is a singleton set and η is the unit of the adjunction.

Well, earlier we defined a singleton set, Void, and $\mathbf{Hask}$ can play the role of $\mathbf{Set}$ as we did above. For $\mathcal{C} = \mathbf{Hask}$, we can translate the remainder quite easily:

 
repAdjunction :: Adjunction f g => (f Void -> a) -> g a
repAdjunction f = leftAdjunct f undefined
 
unrepAdjunction :: Adjunction f g => g a -> (f Void -> a)
unrepAdjunction = rightAdjunct . const
 

Now, as usual the way type class inference works in Haskell requires us to reason somewhat backwards.

You'd like to say:

 
instance Adjunction f g => Representable g (f Void) where
        rep = repAdjunction;
        unrep = unrepAdjunction
 

But if you do so, you can't define any other instances for Representable, you'll have used up the instance head, so the previous definitions couldn't be made.

On the other hand, you can create the obligation for an appropriate instance of Representable by changing the signature of Adjunction:

 
class (Representable g (f Void), Functor f) =>
    Adjunction f g | f -> g, g -> f where
        ...
 

Then the definitions for repAdjunction and unrepAdjunction can be used by any would-be Adjunction to automatically generate the corresponding Representable instance, just like liftM can alwyas be used to make a Haskell Monad into a Functor.

We can also go the other way and define an Adjunction given a representation for the right adjoint, but I'll leave that as an exercise for the reader. (Hint: you'll probably want to weaken the signature for Adjunction to remove the fundeps, so you can test some simple cases). You may also want to take a look at the section on Adjunctions as Kan extensions portion of the earlier post.

I have since modified category-extras definition of Adjunction to require the instance for Representable motivated above.

Haddock:
[Control.Functor.Adjunction]
[Control.Functor.Representable]

This post is a bit of a departure from my recent norm. It contains no category theory whatsoever. None. I promise.

Now that I've bored away the math folks, I'll point out that this also isn't a guide to better horticulture. Great, there goes the rest of you.

Instead, I want to talk about Bloom filters, Bloom joins for distributed databases and some novel extensions to them that let you trade in resources that we have in abundance for ones that are scarce, which I've been using for the last few months and which I have never before seen before in print. Primarily because I guess they have little to do with the strengths of Bloom filters.

For practical purposes you will need to use a counted or spectral Bloom filter for the purposes of the structure mentioned below. However, as these introduce nothing novel, and simply muddle the exposition, I'll ignore counting and spectral Blooms for now.

Bloom Filters

Ok, so what is a Bloom filter? Bloom filters date back to 1970. A simple Bloom filter is a novel data structure for approximating membership in a set, yielding only false positives. A filter consists of an m-bit array and k distinct hash functions. To add an element to the filter you run it through each of the k hash functions and setting the appropriate bits. A value is considered to be a member of the set if you hash it through each of the k functions and each of the target bits is set. It is easy to see that this can only result in a false positive, but its also easy to see that you need to set the size of the array before you start adding elements to it, and that you need to balance the number of hash functions to the overall desired precision of your filter. In general you want to have about half of the bits set in the resulting array to maximize your information density -- a fact which can be derived with elementary calculus. From which you can figure out that you get optimal results when $k = \frac{m}{n} \ln 2$.

We can readily approximate k distinct hashing functions by using a single one-way hashing function and carving it up into a number of hashing functions that consist of the right number of bits each. A simpler approach due to Kirsch and Mitzenmacher is to sacrifice the independence of the hash functions without particularly adversely affecting the properties of the filter.

The nice thing about a Bloom filter is that the parameters m and k can be varied to tune space requirements and precision.

Improving Locality

One common way to improve the locality of reference for excessively large Bloom filters is to break up the structure into two tiers. You have an upper tier in which you use a single hash function to bin the data, then within the bin you placed the data you run the remaining k-1 hash functions. This can result in a 'lumpier' distribution of data, but generally improves performance because if you exceed working memory, this model can typically page in a single page from disk to handle the k-1 writes. When you figure that it is common to use between several hashing functions with a bloom filter this can result in a several-fold performance improvement as the data set grows and you become IO bound. As a result of being primarily to optimize IO you typically want to have a bin size that corresponds with your block or page size.

As an admittedly completely unintelligible aside, I am particularly fond of 8k bins for a simple Bloom filter, because they nicely consume 16 bits of hash evenly, and 4k bins, when used with 4 bit counting Blooms, page in and out efficiently and compress nicely with an arithmetic/exponentiated Huffman encoding into near even multiples of the ethernet packet MTU when you tune the ratio of set bits carefully, I've found this to be beneficial for tweaking real world performance.

Bloom Joins

Given a pair of bloom filters that share a given size m and which use the same k hash functions. You can take their intersection (or union) quite efficiently with bitwise and (or or). This is a well known technique for dealing with distributed database joins when you have data distributed across multiple servers joining against data distributed across other servers. In general, you are only interested in transmitting the data that exists on both sides of the join.

(You can technically free yourself from the requirement that both sides agree on the number of hash functions if you are willing to accept more false positives and you test for membership in the result set using just the hash functions contained in both Blooms. The easiest way to do this is to just agree on an order in which hash functions will be used, which comes for free from the Kirsch/Mitzenmacher approach mentioned above.)

The good
The nice thing about a standard Bloom join is that you can send the Bloom filter over the network quite cheaply in comparison to the data, and with the addition of counting Bloom filter tricks it can be used to calculate approximately the size of the result set. This allows you to use it to load level MapReduce style workloads effectively by estimating the size of intermediate results quite accurately before you send everything over the network to be aggregated.

The bad
One problem with this model is that you have to know the size of your data set up front in order to calculate an ideal m for a desired precision level. Moreover both sides of the join have to agree on this figure m before calculating the join.

Now, the main goal of a Bloom join is to conserve an scarce resource (network bandwith) by exchanging cheaper, more plentiful resources (local CPU utilization, and disk IO). In that respect it serves adequately, but we can do better if our goal is more or less purely to optimize network bandwidth. Lets carry that a bit further.

Linear Hash Tables

To address the limitation that you have to know the size of the bloom a priori, we'll turn to another data structure, the linear hash table. Linear hash tables were designed by Witold Litwin back in 1980 to provide an expandable hash table without a huge stairstep in the cost function whenever you hit a power of two in size. The basic idea of a linear hash table is that you grow the table gradually, by splitting one bucket at a time and using the least significant bits of your hash function.

For sake of variety, I've included a C# 3.5 implementation here:

[SortedLinearHashTable.cs]
[SinglyLinkedList.cs]
[PreparedEqualityComparer.cs]
[PreparedEqualityComparerTypeProxyAttribute.cs]

For my regular audience, an implementation in Haskell using STM — incidentally was the first piece of Haskell I ever wrote — designed for read-mostly use can be found here:

[haddock]
[darcs]

A Linear Bloom Filter

Now, we can look at the bi-level structure we introduced above for dealing with improved cache locality and note that we could go in a different direction and treat the upper level as a linear hash table, instead of a simple hash function! This requires that we keep not only the Bloom but also the member list (or at least their hashes). We can optimize this slightly by computing the Bloom of the member list for each page lazily. This costs us quite a bit of storage relative to a traditional Bloom filter, but we can transmit the Bloom of the resulting set over the network more cheaply than we can transmit a linear hash table and it isn't appreciably more expensive locally than a linear hash table due to only lazily constructing the Blooms.

This mechanism gives rise to an actual tree of pages based on the unfolding of the linear hash table in the resulting hierarchical bloom if you choose to represent the interior of the tree.

Again, this isn't a win for all scenarios, but if you are intending to transmit the resulting set over the network, and don't know its size a priori, the combination of properties from the linear hash table and the bloomed pages leads to some interesting options.

Linear Bloom Origami

Now that we have an expandable hash in our top level, we finally have the machinery to deal with how to perform a join between two linear bloom filters of different size. The model is actually quite simple. We can fold the larger bloom up by oring together the leaves that were split by the linear hash table in the larger bloom until we have the same number of pages and then perform a standard Bloom join. This frees us from the tyranny of having to have both sides of the join guess in advance a shared number of buckets to use to perform the join.

As an aside, an interesting thought experiment is to go one step further and use a full-fledged sorted linear hash table for the extra cost of sorting the chains, but this doesn't seem to be useful in practice.

Mipmapping Blooms

If we are willing to pay an $\mathcal{O}(n \log n)$ cost in terms of the data set size cost in terms of CPU utilization and memory bandwidth we can gain some further performance in terms of network utilization through encoding a set of "mipmaps" for our filters.

Basically the idea is to fold up the tree by oring together the pages into an admissible Bloom of the dataset. Then you encode the splitting of each bit that was set in the Bloom using conditional probabilities. This can be transmitted near optimally using arithmetic encoding or exponentiated Huffman.

If a bit is set in the parent Bloom, then at least one of the two bits will be set in the child Blooms; if no bit is set in the parent Bloom, then no bit can be set in the child Blooms. The probability of each bit being set in each child is for all practical intents and purposes independent and can reasonably be modeled as a function of the expected number of set bits. (This is ever-so-slightly suboptimal if the overall number of values is known). You can determine exact values for the weights of each of the three cases using conditional probabilities and then use an arithmetic compressor, or exponentiate the alphabet for a Huffman compressor — this is otherwise near worst-case for Huffman, since you have two possibilities both just shy of 50% and one much smaller probability. Nicely the regular structure of the exponentiated alphabet is very regular and can be represented efficiently. With careful choice of page size (or bit density within a page) you can transmit the initial page cheaply, and then pack multiple pages into subsequent packets.

Since we can determine the relevance of portions of the tree based on partial information this may allow you to avoid transmitting some branches of the tree. More interestingly we can use it to figure out approximately the size of the join set from the first few pages transmitted and to gain gradual refinements as both sides of the join supply more information.

If you wanted to optimize strictly for network bandwidth and were willing to accept additional latency you could prune branches of the tree after it was clear that the intersection was empty and so no further resolution was required, but in my experience this optimization doesn't seem to be worth the effort.

Incremental Update

Interestingly if you have already shared a Linear Bloom and need to update your copy it admits a cheap network representation using the same arithmetic/exponentiated Huffman encoding trick mentioned earlier. You lose the ability to ignore all unset bits in the dataset because extending the set of known values will in all likelihood set new bits, but as you add members you can transmit splits using the same mechanism used above, and you have the actual member set needed to populate the child pages accurately.

Interestingly it is the ability to mipmap the intermediate results that sometimes makes it worth dealing with a suboptimal choice of density for the overall Bloom filter, because it only affects the cost on either end of the network, it doesn't affect the network transmission costs all that adversely and more sparse population early in the tree can allow you to have a less oversaturated tree near the root, allowing earlier pruning of branches - I have yet to take this from an art to a science.

Conclusion

I had intended to explain things in more detail and delve into the asymptotic behavior of hierarchical and linear Blooms, but various people have been hammering me to just post this already, so here it is.

So to recap, we took a normal (or counted or spectral) Bloom filter, crossbred it with Litwin's linear hash table and found that the mutant offspring is an approximation of a set that is better suited to sharing over the network than either structure alone, with a memory usage profile similar to that of a linear hash table. Interestingly as a side effect you can go one step further and allow for transmission of a requested subset of the exact hashes present in which case we've really only used the Blooms to provide partial information about the underlying linear hash table, which can aid in the subsequent join process.

And yes, they are probably better named something like Bloomed linear hash tables, but that doesn't roll off the tongue.

If there is enough interest and I don't get dragged into other things, I might see about packaging up and genericizing some code that I had lying around intended for production use into a more general purpose library for Linear Bloom Filters.

Grant B. asked me to post the derivation for the right and left Kan extension formula used in previous Kan Extension posts (1,2). For that we can turn to the definition of Kan extensions in terms of ends, but first we need to take a couple of steps back to find a way to represent (co)ends in Haskell.

Dinatural Transformations

Rather than repeat the definition here, we'll just note we can define a dinatural transformation in Haskell letting polymorphism represent the family of morphisms.

 
type Dinatural f g = forall a. f a a -> g a a
 

Ends

So what is an end?

An end is a universal dinatural transformation from some object e to some functor s.

Diving into the formal definition:

Given a functor $F : \mathcal{C}^{op} \times \mathcal{C} -> \mathcal{D}$, and end of $F$ is a pair $(e,\omega)$ where $e$ is an object of $\mathcal{D}$ and omega is a dinatural transformation from e to S such that given any other dinatural transformation $\beta$ to S from another object x in $\mathcal{D}$, there exists a unique morphism $h : x -> e$, such that $\beta_a = \omega_a \cdot h$ for every $a$ in $\mathcal{C}$.

We usually choose to write ends as $e = \int_c S(c,c)$, and abuse terminology calling $e$ the end of