The Comonad.Reader » Algorithms

Fast Circular Substitution

Edward Kmett — Tue, 30 Dec 2014 19:47:45 +0000

Emil Axelsson and Koen Claessen wrote a functional pearl last year about Using Circular Programs for Higher-Order Syntax.

About 6 months ago I had an opportunity to play with this approach in earnest, and realized we can speed it up a great deal. This has kept coming up in conversation ever since, so I've decided to write up an article here.

In my bound library I exploit the fact that monads are about substitution to make a monad transformer that manages substitution for me.

Here I'm going to take a more coupled approach.

To have a type system with enough complexity to be worth examining, I'll adapt Dan Doel's UPTS, which is a pure type system with universe polymorphism. I won't finish the implementation here, but from where we get it should be obvious how to finish the job.

Unlike Axelsson and Claessen I'm not going to bother to abstract over my name representation.

To avoid losing the original name from the source, we'll just track names as strings with an integer counting the number of times it has been 'primed'. The name is purely for expository purposes, the real variable identifier is the number. We'll follow the Axelsson and Claessen convention of having the identifier assigned to each binder be larger than any one bound inside of it. If you don't need he original source names you can cull them from the representation, but they can be useful if you are representing a syntax tree for something you parsed and/or that you plan to pretty print later.

 
data Name = Name String Int
   deriving (Show,Read)
 
hint :: Name -> String
hint (Name n _) = n
 
nameId :: Name -> Int
nameId (Name _ i) = i
 
instance Eq Name where
  (==) = (==) `on` nameId
 
instance Ord Name where
  compare = compare `on` nameId
 
prime :: String -> Int -> Name
prime n i = Name n (i + 1)

So what is the language I want to work with?

 
type Level = Int
 
data Constant
  = Level
  | LevelLiteral {-# UNPACK #-} !Level
  | Omega
  deriving (Eq,Ord,Show,Read,Typeable)
 
data Term a
  = Free a
  | Bound {-# UNPACK #-} !Name
  | Constant !Constant
  | Term a :+ {-# UNPACK #-} !Level
  | Max  [Term a]
  | Type !(Term a)
  | Lam   {-# UNPACK #-} !Name !(Term a) !(Term a)
  | Pi    {-# UNPACK #-} !Name !(Term a) !(Term a)
  | Sigma {-# UNPACK #-} !Name !(Term a) !(Term a)
  | App !(Term a) !(Term a)
  | Fst !(Term a)
  | Snd !(Term a)
  | Pair !(Term a) !(Term a) !(Term a)
  deriving (Show,Read,Eq,Ord,Functor,Foldable,Traversable,Typeable)

That is perhaps a bit paranoid about remaining strict, but it seemed like a good idea at the time.

We can define capture avoiding substitution on terms:

 
subst :: Eq a => a -> Term a -> Term a -> Term a
subst a x y = y >>= \a' ->
  if a == a'
    then x
    else return a'

Now we finally need to implement Axelsson and Claessen's circular programming trick. Here we'll abstract over terms that allow us to find the highest bound value within them:

 
class Bindable t where
  bound :: t -> Int

and instantiate it for our Term type

 
instance Bindable (Term a) where
  bound Free{}        = 0
  bound Bound{}       = 0 -- intentional!
  bound Constant{}    = 0
  bound (a :+ _)      = bound a
  bound (Max xs)      = foldr (\a r -> bound a `max` r) 0 xs
  bound (Type t)      = bound t
  bound (Lam b t _)   = nameId b `max` bound t
  bound (Pi b t _)    = nameId b `max` bound t
  bound (Sigma b t _) = nameId b `max` bound t
  bound (App x y)     = bound x `max`  bound y
  bound (Fst t)       = bound t
  bound (Snd t)       = bound t
  bound (Pair t x y)  = bound t `max` bound x `max` bound y

As in the original pearl we avoid traversing into the body of the binders, hence the _'s in the code above.

Now we can abstract over the pattern used to create a binder in the functional pearl, since we have multiple binder types in this syntax tree, and the code would get repetitive.

 
binder :: Bindable t =>
  (Name -> t) ->
  (Name -> t -> r) ->
  String -> (t -> t) -> r
binder bd c n e = c b body where
  body = e (bd b)
  b = prime n (bound body)
 
lam, pi, sigma :: String -> Term a -> (Term a -> Term a) -> Term a
lam s t   = binder Bound (`Lam` t) s
pi s t    = binder Bound (`Pi` t) s
sigma s t = binder Bound (`Sigma` t) s

We may not always want to give names to the variables we capture, so let's define:

lam_, pi_, sigma_ :: Term a -> (Term a -> Term a) -> Term a
lam_   = lam "_"
pi_    = pi "_"
sigma_ = sigma "_"

Now, here's the interesting part. The problem with Axelsson and Claessen's original trick is that every substitution is being handled separately. This means that if you were to write a monad for doing substitution with it, it'd actually be quite slow. You have to walk the syntax tree over and over and over.

We can fuse these together by making a single pass:

 
instantiate :: Name -> t -> IntMap t -> IntMap t
instantiate = IntMap.insert . nameId
 
rebind :: IntMap (Term b) -> Term a -> (a -> Term b) -> Term b
rebind env xs0 f = go xs0 where
  go = \case
    Free a       -> f a
    Bound b      -> env IntMap.! nameId b
    Constant c   -> Constant c
    m :+ n       -> go m :+ n
    Type t       -> Type (go t)
    Max xs       -> Max (fmap go xs)
    Lam b t e    -> lam   (hint b) (go t) $ \v ->
      rebind (instantiate b v env) e f
    Pi b t e     -> pi    (hint b) (go t) $ \v ->
      rebind (instantiate b v env) e f
    Sigma b t e  -> sigma (hint b) (go t) $ \v ->
      rebind (instantiate b v env) e f
    App x y      -> App (go x) (go y)
    Fst x        -> Fst (go x)
    Snd x        -> Snd (go x)
    Pair t x y   -> Pair (go t) (go x) (go y)

Note that the Lam, Pi and Sigma cases just extend the current environment.

With that now we can upgrade the pearl's encoding to allow for an actual Monad in the same sense as bound.

 
instance Applicative Term where
  pure = Free
  (< *>) = ap
 
instance Monad Term where
  return = Free
  (>>=) = rebind IntMap.empty

To show that we can work with this syntax tree representation, let's write an evaluator from it to weak head normal form:

First we'll need some helpers:

 
apply :: Term a -> [Term a] -> Term a
apply = foldl App
 
rwhnf :: IntMap (Term a) ->
  [Term a] -> Term a -> Term a
rwhnf env stk     (App f x)
  = rwhnf env (rebind env x Free:stk) f
rwhnf env (x:stk) (Lam b _ e)
  = rwhnf (instantiate b x env) stk e
rwhnf env stk (Fst e)
  = case rwhnf env [] e of
  Pair _ e' _ -> rwhnf env stk e'
  e'          -> Fst e'
rwhnf env stk (Snd e)
  = case rwhnf env [] e of
  Pair _ _ e' -> rwhnf env stk e'
  e'          -> Snd e'
rwhnf env stk e
  = apply (rebind env e Free) stk

Then we can start off the whnf by calling our helper with an initial starting environment:

 
whnf :: Term a -> Term a
whnf = rwhnf IntMap.empty []

So what have we given up? Well, bound automatically lets you compare terms for alpha equivalence by quotienting out the placement of "F" terms in the syntax tree. Here we have a problem in that the identifiers we get assigned aren't necessarily canonical.

But we can get the same identifiers out by just using the monad above:

 
alphaEq :: Eq a => Term a -> Term a -> Bool
alphaEq = (==) `on` liftM id

It makes me a bit uncomfortable that our monad is only up to alpha equivalence and that liftM swaps out the identifiers used throughout the entire syntax tree, and we've also lost the ironclad protection against exotic terms.

But overall, this is a much faster version of Axelsson and Claessen's trick and it can be used as a drop-in replacement for something like bound in many cases, and unlike bound, it lets you use HOAS-style syntax for constructing lam, pi and sigma terms.

With pattern synonyms you can prevent the user from doing bad things as well. Once 7.10 ships you'd be able to use a bidirectional pattern synonym for Pi, Sigma and Lam to hide the real constructors behind. I'm not yet sure of the "best practices" in this area.

Here's the code all in one place:

[Download Circular.hs]

Happy Holidays,
-Edward

Mirrored Lenses

Edward Kmett — Mon, 25 Jun 2012 03:38:41 +0000

Lenses are a great way to deal with functional references, but there are two common issues that arise from their use.

There is a long-standing folklore position that lenses do not support polymorphic updates. This has actually caused a fair bit of embarrassment for the folks who'd like to incorporate lenses in any Haskell record system improvement.
Access control. It'd be nice to have read-only or write-only properties -- "one-way" or "mirrored" lenses, as it were. Moreover, lenses are commonly viewed as an all or nothing proposition, in that it is hard to mix them with arbitrary user functions.
Finally there is a bit of a cult around trying to generalize lenses by smashing a monad in the middle of them somewhere, it would be nice to be able to get into a list and work with each individual element in it without worrying about someone mucking up our lens laws, and perhaps avoid the whole generalized lens issue entirely.

We'll take a whack at each of these concerns in turn today.

   {-# LANGUAGE Rank2Types #-}  -- we'll relax this later
   import Data.Complex -- for complex examples

First, let us consider the type of van Laarhoven lenses:

type Lens a b =
  forall f. Functor f =>
  (b -> f b) -> a -> f a

with a couple of examples:

 
realLens :: RealFloat a => Lens (Complex a) a
realLens f (r :+ i) = fmap (:+ i) (f r)
 
imagLens :: RealFloat a => Lens (Complex a) a
imagLens f (r :+ i) = fmap (r :+) (f i)

These lenses have some very nice properties that we're going to exploit. By far their nicest property is that you can compose them using just (.) and id from the Prelude rather than having to go off and write a Category.

Lens Families

Russell O'Connor recently noted that these lenses permit polymorphic update if you simply generalize their type signature to

 
type LensFamily a b c d =
  forall f. Functor f =>
  (c -> f d) -> a -> f b

I'd like to note that you can't just let these 4 arguments vary with complete impunity, so I'll be referring to these as "lens families" rather than polymorphic lenses, a point that I'll address further below. In short, we want the original lens laws to still hold in spite of the generalized type signature, and this forces some of these types to be related.

As an aside, each of the other lens types admit this same generalization! For instance the Lens type in data-lens can be generalized using an indexed store comonad:

 
data Store c d b = Store (d -> b) c
 
instance Functor (Store c d) where
  fmap f (Store g c) = Store (f . g) c
 
newtype DataLensFamily a b c d = DataLensFamily (a -> Store c d b)

and we can freely convert back and forth to van Laarhoven lens families:

 
dlens :: LensFamily a b c d -> DataLensFamily a b c d
dlens l = DataLensFamily (l (Store id))
 
plens :: DataLensFamily a b c d -> LensFamily a b c d
plens (DataLensFamily l) f a = case l a of
  Store g c -> fmap g (f c)

I leave it as an exercise to the reader to generalize the other lens types, but we'll stick to van Laarhoven lens families almost exclusively below.

As Russell noted, we can define functions to get, modify and set the target of a lens very easily. I'll create local names for Identity and Const, mostly to help give nicer error messages later.

We can read from a lens family:

 
infixl 8 ^.
newtype Getting b a = Getting { got :: b }
instance Functor (Getting b) where
    fmap _ (Getting b) = Getting b
(^.) :: a -> ((c -> Getting c d) -> a -> Getting c b) -> c
x ^. l = got (l Getting x)

We can modify the target of the lens:

 
newtype Setting a = Setting { unsetting :: a }
instance Functor Setting where
    fmap f (Setting a) = Setting (f a)
infixr 4 %=
(%=) :: ((c -> Setting d) -> a -> Setting b) -> (c -> d) -> a -> b
l %= f = unsetting . l (Setting . f)

We can set the target of the lens with impunity:

 
infixr 4 ^=
(^=) :: ((c -> Setting d) -> a -> Setting b) -> d -> a -> b
l ^= v = l %= const v

We can build a lens family from a getter/setter pair

 
lens :: (a -> c) -> (a -> d -> b) -> LensFamily a b c d
lens f g h a = fmap (g a) (h (f a))

or from a family of isomorphisms:

 
iso :: (a -> c) -> (d -> b) -> LensFamily a b c d
iso f g h a = fmap g (h (f a))

With these combinators in hand, we need some actual lens families to play with. Fortunately they are just as easy to construct as simple lenses. The only thing that changes is the type signature.

 
fstLens :: LensFamily (a,c) (b,c) a b
fstLens f (a,b) = fmap (\x -> (x,b)) (f a)
 
sndLens :: LensFamily (a,b) (a,c) b c
sndLens f (a,b) = fmap ((,) a) (f b)
 
swap :: (a,b) -> (b,a)
swap (a,b) = (b,a)
 
swapped :: LensFamily (a,b) (c,d) (b,a) (d,c)
swapped = iso swap swap

These can also build 'traditional' lenses:

 
negated :: Num a => Lens a a
negated = iso negate negate

And since Lens and LensFamily are both type aliases, we can freely mix and match lenses with lens families:

 
ghci> (1:+2,3) ^.fstLens.realLens
1.0
ghci> fstLens . realLens ^= 4 $ (1:+2,3)
(4.0 :+ 2.0,3)

But, we can now change types with our lens updates!

 
ghci> (fstLens . sndLens ^= "hello") ((1,()),3)
((1,"hello"),3)

We can even do things like use the combinator

 
traverseLens :: ((c -> c) -> a -> b) -> a -> b
traverseLens f = f id

to project a Functor out through an appropriate lens family:

 
ghci> :t traverseLens (fstLens . sndLens)
traverseLens (fstLens . sndLens)
  :: Functor f => ((a, f b), c) -> f ((a, b), c)

That takes care of polymorphic updates.

Why is it a Lens Family?

So, why do I use the term "lens family" rather than "polymorphic lens"?

In order for the lens laws to hold, the 4 types parameterizing our lens family must be interrelated.

In particular you need to be able to put back (with ^=) what you get out of the lens (with ^.) and put multiple times.

This effectively constrains the space of possible legal lens families to those where there exists an index kind i, and two type families outer :: i -> *, and inner :: i -> *. If this were a viable type signature, then each lens family would actually have 2 parameters, yielding something like:

 
-- pseudo-Haskell
-- type LensFamily outer inner =
--    forall a b. LensFamily (outer a) (outer b) (inner a) (inner b)

but you can't pass in type families as arguments like that, and even if you could, their lack of injectivity doesn't give the type checker enough to work with to compose your lenses. By specifying all 4 type arguments independently, we give the compiler enough to work with. But since the arguments aren't just freely polymorphic and are instead related by these index types, I'm choosing to call them "lens families" rather than "polymorphic lenses".

Getters

Note, we didn't use the full polymorphism of the van Laarhoven lenses in the signatures of (^.), (%=) and (^=) above.

What happens when we restrict the type of Functor we're allowed to pass to our lens?

If we generalize the type of our getter ever so slightly from the type we pass to (^.) to permit composition, we get:

 
type Getter a c = forall r d b. (c -> Getting r d) -> a -> Getting r b

and we can make getters out of arbitrary Haskell functions that we have lying around with

 
-- | build a getting out of a function
getting :: (a -> b) -> Getter a b
getting g f = Getting . got . f . g

For example:

 
getFst :: Getter (a,b) a
getFst = getting fst
 
getSnd :: Getter (a,b) b
getSnd = getting snd

But this is particularly nice for things that can't be made into real lenses or lens families, because of loss of information:

 
getPhase :: RealFloat a => Getter (Complex a) a
getPhase = getting phase
 
getAbs, getSignum  :: Num a => Getter a a
getAbs = getting abs
getSignum = getting signum

Notably, getMagnitude and getPhase can't be legal lenses because when the magnitude is 0, you lose phase information.

These can be mixed and matched with other lenses when dereferencing with (^.)

 
ghci> (0,(1:+2,3)) ^. getting snd . fstLens . getting magnitude
2.23606797749979

But we get a type error when we attempt to write to a Getter.

 
ghci> getting magnitude ^= 12
:2:1:
    Couldn't match expected type `Setting d0'
                with actual type `Getting r0 d1'
    Expected type: (c0 -> Setting d0) -> a1 -> Setting b1
      Actual type: (c0 -> Getting r0 d1) -> a0 -> Getting r0 b0
    In the return type of a call of `getting'
    In the first argument of `(^=)', namely `getting magnitude'

Setters

So what about write-only properties?

These have a less satisfying solution. We have to break our lens family structure slightly to make something that can strictly only be written to, by disabling the ability to read our current value entirely.

 
type Setter a d b = (() -> Setting d) -> a -> Setting b
 
setting :: (a -> d -> b) -> Setter a d b
setting f g a = Setting (f a (unsetting (g ())))

Now we can make setters out of functions that take two arguments:

 
plus, times :: Num a => Setter a a a
plus = setting (+)
times = setting (*)

 
ghci> setting (+) ^= 12 $ 32
44
ghci> fstLens . setting (*) ^= 12 $ (2,3)
(24,3)

However, these lenses have the unsatisfying property that they can only be placed last in the chain of lenses we're setting.

 
ghci> (setting (+) . realLens ^= 12) 1
:15:16:
    Couldn't match expected type `()' with actual type `Complex d0'
    Expected type: (d0 -> Setting d0) -> () -> Setting b0
      Actual type: (d0 -> Setting d0)
                   -> Complex d0 -> Setting (Complex d0)
    In the second argument of `(.)', namely `realLens'
    In the first argument of `(^=)', namely `setting (+) . realLens'

This isn't surprising, if you consider that to compose data-lens lenses you need to use %= to chain setters.

Modifiers

So what do we need to do to make a lens we can only modify but not read?

Lets restore the lens family structure!

 
type Modifier a b c d = (c -> Setting d) -> a -> Setting b
 
modifying :: ((c -> d) -> a -> b) -> Modifier a b c d
modifying f g a = Setting (f (unsetting . g) a)

modifying makes a modify-only lens family you can modify using local information, but can't tell anyone about the contents of.

This lets us work with a lens over a variable number of elements in a structure, without worrying about a user accidentally "putting back" too many or too few entries.

 
ghci> modifying map %= (+1) $ [1,2,3]
[2,3,4]

They can be composed with other lenses:

 
ghci> modifying map . sndLens %= (+1) $ [("hello",1),("goodbye",2)]
[("hello",2),("goodbye",3)]

and unlike with a Setter, you can compose a Modifier with a Modifier:

 
modifying fmap . modifying fmap
  :: (Functor g, Functor f) =>
     (c -> Setting d) -> f (g c) -> Setting (f (g d))

but they cannot be read from directly:

 
ghci> [1,2,3] ^. modifying fmap
:18:12:
    Couldn't match expected type `Getting c0 d0'
                with actual type `Setting d1'
    Expected type: (c0 -> Getting c0 d0) -> [t0] -> Getting c0 b1
      Actual type: Modifier a0 b0 c0 d1
    In the return type of a call of `modifying'
    In the second argument of `(^.)', namely `modifying map'

We can map over restricted domains:

 
reals :: (RealFloat a, RealFloat b) => Modifier (Complex a) (Complex b) a b
reals = modifying (\f (r :+ i) -> f r :+ f i)

and everything still composes:

 
ghci> reals %= (+1) $  1 :+ 2
2 :+ 3
ghci> fstLens . reals %= (+1) $ (1 :+ 2, 4)
(2.0 :+ 3.0,4)

These aren't limited to actions that map over the entire structure, however!

 
ghci> :m + Data.Lens
ghci> modifying (`adjust` "goodbye") %= (+1) $
      fromList [("hello",1),("goodbye",2)]
fromList [("goodbye",3),("hello",1)]

This lets us update potentially nested structures where something may or may not be present , which was fairly tedious to do with earlier lens representations.

Both the former map-like example and the latter update-like behavior were commonly used examples in calls for partial lenses or 'multi-lenses', but here they are able to implemented using a restricted form of a more traditional lens type, and moreover they compose cleanly with other lenses and lens families.

Rank-1 Lens Families

At the very start I mentioned that you can dispense with the need for Rank-2 Types. Doing so requires much more tedious type signatures as the LensFamily, Getter, Setter and Lens aliases are no longer legal. Also, if you want to take a lens as an argument and use it in multiple contexts (e.g. as both a getter and a setter), you'll need to clone it to obtain a lens family. For example, this fails:

 
ghci> :t \l y -> l ^= y ^. l + 1 $ y
:1:19:
    Couldn't match expected type `Getting d0 d1'
                with actual type `Setting d0'
    Expected type: (d0 -> Getting d0 d1) -> a1 -> Getting d0 b1
      Actual type: (d0 -> Setting d0) -> a0 -> Setting b0
    In the second argument of `(^.)', namely `l'
    In the first argument of `(+)', namely `y ^. l'

But we can clone the supplied monomorphic lens using the composition of dlens and plens above, since the DataLensFamily completely characterizes the LensFamily with:

 
clone ::
  ((c -> Store c d d) -> (a -> Store c d b)) ->
  LensFamily a b c d
clone l f a = case l (Store id) a of
  Store g c -> fmap g (f c)

and then the following code type checks:

 
ghci> :t \l y -> clone l ^= y ^. clone l + 1 $ y
\l y -> clone l ^= y ^. clone l + 1 $ y
  :: Num d => ((c -> Store c d1 d1) -> a -> Store d d b) -> a -> b

This means you could implement an entire library to deal with lens families with restricted getters and setters and remain within the confines of Haskell 98. However, the type signatures are considerably less elegant than what becomes available when you simply add Rank2Types.

Conclusion

So, we've demonstrated that van Laarhoven lens families let you have lenses that permit polymorphic update, let you offer lenses that are restricted to only allowing the use of getters, setters or modifiers, while granting you easy composition with the existing (.) and id from the Prelude.

I think the practical existence and power of these combinators make a strong case for their use in any serious record reform proposal.

My thanks go to Russell O'Connor. He first noticed that you can generalize van Laarhoven lenses and proposed the clone combinator as a path to Haskell 98/2010 compatibility, while retaining the nicer composition model.

Searching Infinity Parametrically

Edward Kmett — Sun, 25 Dec 2011 06:19:43 +0000

Andrej Bauer recently gave a really nice talk on how you can exploit side-effects to make a faster version of Martin Escardo's pseudo-paradoxical combinators.

A video of his talk is available over on his blog, and his presentation is remarkably clear, and would serve as a good preamble to the code I'm going to present below.

Andrej gave a related invited talk back at MSFP 2008 in Iceland, and afterwards over lunch I cornered him (with Dan Piponi) and explained how you could use parametricity to close over the side-effects of monads (or arrows, etc) but I think that trick was lost in the chaos of the weekend, so I've chosen to resurrect it here, and improve it to handle some of his more recent performance enhancements, and show that you don't need side-effects to speed up the search after all!

First, we'll need to import a few things:

 
{-# LANGUAGE RankNTypes #-}
 
import Data.Maybe (fromMaybe)
import Control.Applicative
import Data.IntMap (IntMap)
import qualified Data.IntMap as IntMap
import Control.Monad
import Control.Monad.Trans.Class
import Data.Functor.Identity

What are looking for is an implementation of Hilbert's epsilon.

This is a formal mechanism for eliminating existentials over some non-empty set X by defining a function

 
ε: (X -> Prop) -> X

such that if there exists an x in X such that p(X) holds then p(ε(p)) holds.

As noted by Andrej, we could reify this constructively as a function "epsilon :: (X -> Bool) -> X" for some X.

Now, for some sets, Hilbert's epsilon is really easy to define. If X is a finite set, you can just exhaustively enumerate all of the options returning a member of X such that the property holds if any of them do, otherwise since X is non-empty, just return one of the elements that you tested.

This would be a pretty boring article and I'd be back to eating Christmas dinner with my family if that was all there was to it. However, certain infinite spaces can also be searched.

Last year, Luke Palmer wrote a post on "Searchable Data Types" that might also serve as a good introduction. In that article he led off with the easiest infinite space to search, the lazy naturals, or the 'one point compactification of the naturals'. That is to say the natural numbers extended with infinity.

 
data LazyNat = Zero | Succ LazyNat
infinity :: LazyNat
infinity = Succ infinity

Now we can implement Palmer's epsilon (called lyingSearch in his article).

 
palmer :: (LazyNat -> Bool) -> LazyNat
palmer p
  | p Zero = Zero
  | otherwise = Succ $ palmer $ p . Succ

The trick to making this work is that we place a requirement that the predicate that you pass has to terminate in a bounded amount of time no matter what input you give it, and since we're working with the naturals extended with infinity, if no natural satisfies the predicate, we'll just keep returning a longer and longer chain of Succ's, effectively yielding infinity.

To check to see if the returned number satisfies the predicate you can always use p (palmer p). The predicate is required to terminate in finite time, even when given infinity, so this will yield a Bool and not bottom out unless the user supplied predicate takes an unbounded amount of time.

I posted a reply to Luke's article when it came up on reddit which included a Hinze-style generic implementation of his lyingSearch predicate, which you can see now is just Hilbert's epsilon for arbitrary recursive polynomial data types.

http://www.reddit.com/r/haskell/comments/e7nij/searchable_data_types/c15zs6l

Another space we can search is the Cantor space 2^N.

 
type Cantor = Integer -> Bool

With that we jump clear from countable infinity to uncountable infinity, but it can still be searched in finite time!

This is the space we'll be paying attention to for the rest of this article.

First we'll define how to "book a room in Hilbert's Hotel."

 
infixr 0 #
(#) :: Bool -> Cantor -> Cantor
(x # a) 0 = x
(x # a) i = a (i - 1)

Then this can be used to obtain the following implementation of Hilbert's epsilon for the Cantor space, attributed by Andrej to Ulrich Berger.

 
berger :: (Cantor -> Bool) -> Cantor
berger p =
  if ex $ \a -> p $ False # a
  then False # berger $ \a -> p $ False # a
  else True  # berger $ \a -> p $ True # a
  where ex q = q (berger q)

This version is particularly close in structure to the one for searching the LazyNats, but it is dreadfully slow!

It would be nice to be able to search the space faster and that is just what Martin Escardo's improved version does, through a more sophisticated divide and conquer technique.

 
escardo :: (Cantor -> Bool) -> Cantor
escardo p = go x l r where
  go x l r n =  case divMod n 2 of
    (0, 0) -> x
    (q, 1) -> l q
    (q, 0) -> r $ q-1
  x = ex $ \l -> ex $ \r -> p $ go True l r
  l = escardo $ \l -> ex $ \r -> p $ go x l r
  r = escardo $ \r -> p $ go x l r
  ex q = q (escardo q)

To proceed from here I'll need a State monad:

 
newtype S s a = S { runS :: s -> (a, s) }
 
instance Functor (S s) where
  fmap f (S m) = S $ \s -> case m s of
    (a, s') -> (f a, s')
 
instance Applicative (S s) where
  pure = return
  (< *>) = ap
 
instance Monad (S m) where
  return a = S $ \s -> (a, s)
  S m >>= k = S $ \s -> case m s of
    (a, s') -> runS (k a) s'

And now we've reached the point. From here, Andrej's pure code ends, and his side-effecting ocaml and custom programming language start. The first thing he does is compute the modulus of continuity by using a side-effect that writes to a reference cell which he very carefully ensures doesn't leak out of scope, so he doesn't have to concern himself with the proposition code editing the value of the reference.

 
let mu f a =
  let r = ref 0 in
  let b n = (r := max n ! r; a n) in
    ignore (f b);
    !r

To obtain the same effect we'll instead make a predicate using the state monad to model the single reference cell.

 
-- bad
modulus :: (Num b, Ord s) =>
  ((s -> S s a) -> S b c) -> (s -> a) -> b

We can mash b and s together, and try to make the ordering and number agree by claiming that it is instead Real and we'd get the slightly more reasonable looking type:

 
-- still bad
modulus :: Real a =>
  ((a -> S n b) -> S n c) -> (a -> b) -> a

In the imperative code, lexical scoping had ensured that no other code could edit the reference cell, but with this type we don't have that. The predicate is allowed to use arbitrary state actions to muck with the modulus of convergence even though the only thing that should be editing it is the wrapper beta that we placed around alpha.

But how can we ensure that the end user couldn't gain access to any of the additional functionality from the monad? Parametricity!

 
-- getting better
modulus :: Real a =>
  (forall f. Monad f => (a -> f b) -> f c) ->
  (a -> b) ->
  a

Here the only thing you are allowed to assume about f is that it forms a monad. This gives you access to return and >>=, but the predicate can't do anything interesting with them. All it can do is work with what is effectively the identity monad, since it knows no additional properties!

We can have mercy on the end user and give them a little bit more syntactic sugar, since it doesn't cost us anything to let them also have access to the Applicative instance.

 
-- good
modulus :: Real a =>
  (forall f. (Monad f, Applicative f) => (a -> f b) -> f c) ->
  (a -> b) ->
  a

With that we can show Andrej's version of the modulus of convergence calculation does not need side-effects!

>
> modulus (\a -> a 10 >>= a) (\n -> n * n)
100

Admittedly plumbing around the monadic values in our proposition is a bit inconvenient.

His next example was written in a custom ocaml-like programming language. For translating his effect type into Haskell using parametricity, we'll need a CPS'd state monad, so we can retry from the current continuation while we track a map of assigned values.

 
newtype K r s a = K { runK :: (a -> s -> r) -> s -> r }
 
instance Functor (K r s) where
  fmap = liftM
 
instance Applicative (K r s) where
  pure = return
  (< *>) = ap
 
instance Monad (K r s) where
  return a = K $ \k -> k a
  K m >>= f = K $ \k -> m $ \a -> runK (f a) k

For those of you who have been paying attention to my previous posts, K r s is just a Codensity monad!

 
neighborhood ::
  (forall f. (Applicative f, Monad f) => (Int -> f Bool) -> f Bool) ->
  IntMap Bool
neighborhood phi = snd $ runK (phi beta) (,) IntMap.empty where
  beta n = K $ \k s -> case IntMap.lookup n s of
    Just b -> k b s
    Nothing -> case k True (IntMap.insert n True s) of
      (False, _) -> k False (IntMap.insert n False s)
      r -> r

With that we can adapt the final version of Hilbert's epsilon for the Cantor space that Andrej provided to run in pure Haskell.

 
bauer ::
  (forall f. (Applicative f, Monad f) => (Int -> f Bool) -> f Bool) ->
  Cantor
bauer p = \n -> fromMaybe True $ IntMap.lookup n m where
  m = neighborhood p

With a little work you can implement a version of an exists and forAll predicate on top of that by running them through the identity monad.

 
exists ::
  (forall f. (Applicative f, Monad f) => (Int -> f Bool) -> f Bool) ->
  Bool
forAll ::
  (forall f. (Applicative f, Monad f) => (Int -> f Bool) -> f Bool) ->
  Bool

I've gone further in playing with this idea, using monad homomorphisms rather than simply relying on the canonical homomorphism from the identity monad. You can get the gist of it here:

https://gist.github.com/1518767

This permits the predicates themselves to embed some limited monadic side-effects, but then you get more extensional vs. intensional issues.

An obvious direction from here is to fiddle with a version of Martin Escardo's search monad that takes advantage of these techniques, but I'll leave the exploration of these ideas to the reader for now and go enjoy Christmas dinner.

Happy Holidays,
Edward Kmett

A Parsec Full of Rats, Part 2

Edward Kmett — Sat, 24 Sep 2011 03:07:32 +0000

Last time, I showed that we can build a small parsec clone with packrat support.

This time I intend to implement packrat directly on top of Parsec 3.

One of the main topics of discussion when it comes to packrat parsing since Bryan Ford's initial release of Pappy has been the fact that in general you shouldn't use packrat to memoize every rule, and that instead you should apply Amdahl's law to look for the cases where the lookup time is paid back in terms of repetitive evaluation, computation time and the hit rate. This is great news for us, since, we only want to memoize a handful of expensive combinators.

First, we'll need to import enough of Parsec to do something interesting.

 
{-# LANGUAGE RecordWildCards, ViewPatterns, FlexibleInstances, MultiParamTypeClasses #-}
 
import Text.Parsec
import qualified Text.Parsec.Token as T
import Text.Parsec.Token
    (GenLanguageDef(..), GenTokenParser(TokenParser))
import Text.Parsec.Pos (initialPos, updatePosChar)
import Data.Functor.Identity (Identity(..))
import Control.Applicative hiding ((< |>))
import Control.Monad.Fix (fix)

Then as before, we'll define PEG-style backtracking:

 
(< />) :: Monad m => ParsecT s u m a -> ParsecT s u m a ->
    ParsecT s u m a
p < /> q = try p < |> q
infixl 3 < />

Now we need an analogue to our Result type from last time, which recalled whether or not we had consumed input, and what the current cursor location is. Fortunately, we can recycle the definitions from Parsec to this end.

 
type Result d a = Consumed (Reply d () a)

We'll define a combinator to build a parser directly from a field accessor. Last time, this was just the use of the "Rat" constructor. Now it is a bit trickier, because we need to turn Consumed (Reply d () a) into m (Consumed (m (Reply d u a))) by wrapping it in the appropriate monad, and giving the user back his state unmolested.

 
rat :: Monad m => (d -> Result d a) -> ParsecT d u m a
rat f   = mkPT $ \s0 -> return $
    return . patch s0 < $> f (stateInput s0) where
  patch (State _ _ u) (Ok a (State s p _) err) = Ok a (State s p u) err
  patch _             (Error e)                = Error e

Last time we could go from a parser to a result just by applying the user stream type, but with parsec we also have to supply their notion of a position. This leads to the following combinator. By running in the Identity monad with no user state it should be obvious that we've duplicated the functionality of the previous 'Rat' parser (with the addition of a source position).

 
womp :: d -> SourcePos -> ParsecT d () Identity a -> Result d a
womp d pos p = fmap runIdentity . runIdentity $
    runParsecT p (State d pos ())

The combinator is so named because we needed a big space-rat rather than a little pack-rat to keep with the theme.

It's not impossible. I used to bullseye womp rats in my T-16 back home, they're not much bigger than two meters.

Now we'll write a bit of annoyingly verbose boilerplate to convince Parsec that we really want a LanguageDef for some monad other than Identity. (As an aside, why Text.Parsec.Language doesn't contain GenLanguageDefs that are parametric in their choice of Monad is beyond me.)

 
myLanguageDef :: Monad m => T.GenLanguageDef D u m
myLanguageDef = T.LanguageDef
  { commentStart    = "{-"
  , commentEnd      = "-}"
  , commentLine     = "--"
  , nestedComments  = True
  , identStart      = letter < |> char '_'
  , identLetter     = alphaNum < |> oneOf "_'"
  , opStart         = opLetter myLanguageDef
  , opLetter        = oneOf ":!#$%&*+./< =>?@\\^|-~"
  , reservedOpNames = []
  , reservedNames   = []
  , caseSensitive   = True
  }

As a shameless plug, trifecta offers a particularly nice solution to this problem, breaking up the monolithic Token type into separate concerns and letting you layer parser transformers that enrich the parser to deal with things like Haskell-style layout, literate comments, parsing comments in whitespace, etc.

And as one last bit of boilerplate, we'll abuse RecordWildcards once again to avoid the usual 20 lines of boilerplate that are expected of us, so we can get access to parsec's token parsers.

 
TokenParser {..} = T.makeTokenParser myLanguageDef

Now we're ready to define our incredibly straightforward stream type:

 
data D = D
  { _add        :: Result D Integer
  , _mult       :: Result D Integer
  , _primary    :: Result D Integer
  , _dec        :: Result D Integer
  , _uncons     :: Maybe (Char, D)
  }
 
instance Monad m => Stream D m Char where
  uncons = return . _uncons

And using the general purpose rat combinator from earlier, we can write some memoized parsers:

 
add, mult, primary, dec :: Parsec D u Integer
add     = rat _add
mult    = rat _mult
primary = rat _primary
dec     = rat _dec

And finally, we write the code to tie the knot and build the stream:

 
parse :: SourceName -> String -> D
parse n = go (initialPos n) where
  go p s = fix $ \d -> let
    (womp d p -> _add) =
            (+) < $> mult < * reservedOp "+" <*> add
        < /> mult < ?> "summand"
    (womp d p -> _mult) =
            (*) < $> primary < * reservedOp "*" <*> mult
        < /> primary < ?> "factor"
    (womp d p -> _primary) =
            parens add
        < /> dec < ?> "number"
    (womp d p -> _dec) = natural
    _uncons = case s of
      (x:xs) -> Just (x, go (updatePosChar p x) xs)
      []     -> Nothing
    in D { .. }
 
runD :: Parsec D u a -> u -> SourceName -> String -> Either ParseError a
runD p u fn s = runParser p u fn (prep fn s)

and finally, let it rip:

 
eval :: String -> Integer
eval s = either (error . show) id $
    runD (whiteSpace *> add < * eof) () "-" s

While this approach tends to encourage memoizing fewer combinators than libraries such as frisby, this is exactly what current research suggests you probably should do with packrat parsing!

The other purported advantage of packrat parsers is that they can deal with left recursion in the grammar. However, that is not the case, hidden left recursion in the presence of the algorithm used in the scala parsing combinator libraries leads to incorrect non-left-most parses as shown by Tratt.

I leave it as an exercise for the reader to extend this material with the parsec+iteratees approach from my original talk on trifecta to get packrat parsing of streaming input. Either that or you can wait until it is integrated into trifecta.

You can download the source to this (without the spurious spaces inserted by wordpress) here.

If I can find the time, I hope to spend some time addressing Scott and Johnstone's GLL parsers, which actually achieve the O(n^3) worst case bounds touted for Tomita's GLR algorithm (which is actually O(n^4) as it was originally defined despite the author's claims), and how to encode them in Haskell with an eye towards building a memoizing parser combinator library that can parse LL(1) fragments in O(1), deal with arbitrary context-free grammars in O(n^3), and degrade reasonably gracefully in the presence of context-sensitivity, while supporting hidden left recursion as long as such recursion passes through at least one memoized rule. This is important because CFGs are closed under extensions to the grammar, which is a nice property to have if you want to have a language where you can add new statement types easily without concerning yourself overmuch with the order in which you insert the rules or load the different extensions.

A Parsec Full of Rats, Part 1

Edward Kmett — Sat, 24 Sep 2011 02:10:06 +0000

You never heard of the Millenium Falcon? It's the ship that made the Kessel Run in 12 parsecs.

I've been working on a parser combinator library called trifecta, and so I decided I'd share some thoughts on parsing.

Packrat parsing (as provided by frisby, pappy, rats! and the Scala parsing combinators) and more traditional recursive descent parsers (like Parsec) are often held up as somehow different.

Today I'll show that you can add monadic parsing to a packrat parser, sacrificing asymptotic guarantees in exchange for the convenient context sensitivity, and conversely how you can easily add packrat parsing to a traditional monadic parser combinator library.

To keep this post self-contained, I'm going to start by defining a small packrat parsing library by hand, which acts rather like parsec in its backtracking behavior. First, some imports:

 
{-# LANGUAGE RecordWildCards, ViewPatterns, DeriveFunctor #-}
import Control.Applicative
import Control.Monad (MonadPlus(..), guard)
import Control.Monad.Fix (fix)
import Data.Char (isDigit, digitToInt, isSpace)

Second, we'll define a bog simple parser, which consumes an input stream of type d, yielding a possible answer and telling us whether or not it has actually consumed any input as it went.

 
newtype Rat d a = Rat { runRat :: d -> Result d a }
  deriving Functor
 
data Result d a
  = Pure a             -- didn't consume anything, can backtrack
  | Commit d a      -- consumed input
  | Fail String Bool -- failed, flagged if consumed
  deriving Functor

Now, we can finally implement some type classes:

 
instance Applicative (Rat d) where
  pure a = Rat $ \ _ -> Pure a
  Rat mf < *> Rat ma = Rat $ \ d -> case mf d of
    Pure f      -> fmap f (ma d)
    Fail s c    -> Fail s c
    Commit d' f -> case ma d' of
      Pure a       -> Commit d' (f a)
      Fail s _     -> Fail s True
      Commit d'' a -> Commit d'' (f a)

including an instance of Alternative that behaves like parsec, only backtracking on failure if no input was unconsumed.

 
instance Alternative (Rat d) where
  Rat ma < |> Rat mb = Rat $ \ d -> case ma d of
    Fail _ False -> mb d
    x            -> x
  empty = Rat $ \ _ -> Fail "empty" False

For those willing to forego the asymptotic guarantees of packrat, we'll offer a monad.

 
instance Monad (Rat d) where
  return a = Rat $ \_ -> Pure a
  Rat m >>= k = Rat $ \d -> case m d of
    Pure a -> runRat (k a) d
    Commit d' a -> case runRat (k a) d' of
      Pure b -> Commit d' b
      Fail s _ -> Fail s True
      commit -> commit
    Fail s c -> Fail s c
  fail s = Rat $ \ _ -> Fail s False
 
instance MonadPlus (Rat d) where
  mplus = (< |>)
  mzero = empty

and a Parsec-style "try", which rewinds on failure, so that < |> can try again.

 
try :: Rat d a -> Rat d a
try (Rat m) = Rat $ \d -> case m d of
  Fail s _ -> Fail s False
  x        -> x

Since we've consumed < |> with parsec semantics. Let's give a PEG-style backtracking (< />).

 
(< />) :: Rat d a -> Rat d a -> Rat d a
p < /> q = try p < |> q
infixl 3 < />

So far nothing we have done involves packrat at all. These are all general purpose recursive descent combinators.

We can define an input stream and a number of combinators to read input.

 
class Stream d where
  anyChar :: Rat d Char
 
whiteSpace :: Stream d => Rat d ()
whiteSpace = () < $ many (satisfy isSpace)
phrase :: Stream d => Rat d a -> Rat d a
phrase m = whiteSpace *> m < * eof
 
notFollowedBy :: Rat d a -> Rat d ()
notFollowedBy (Rat m) = Rat $ \d -> case m d of
  Fail{} -> Pure ()
  _      -> Fail "unexpected" False
 
eof :: Stream d => Rat d ()
eof = notFollowedBy anyChar
 
satisfy :: Stream d => (Char -> Bool) -> Rat d Char
satisfy p = try $ do
  x < - anyChar
  x <$ guard (p x)
 
char :: Stream d => Char -> Rat d Char
char c = satisfy (c ==)
 
lexeme :: Stream d => Rat d a -> Rat d a
lexeme m = m < * whiteSpace
 
symbol :: Stream d => Char -> Rat d Char
symbol c = lexeme (char c)
 
digit :: Stream d => Rat d Int
digit = digitToInt < $> satisfy isDigit

And we can of course use a string as our input stream:

 
instance Stream [Char] where
  anyChar = Rat $ \s -> case s of
    (x:xs) -> Commit xs x
    [] -> Fail "EOF" False

Now that we've built a poor man's Parsec, let's do something more interesting. Instead of just using String as out input stream, let's include slots for use in memoizing the results from our various parsers at each location. To keep things concrete, we'll memoize the ArithPackrat.hs example that Bryan Ford used in his initial packrat presentation enriched with some whitespace handling.

 
data D = D
  { _add        :: Result D Int
  , _mult       :: Result D Int
  , _primary    :: Result D Int
  , _decimal    :: Result D Int
  , anyCharD    :: Result D Char
  }

If you look at the type of each of those functions you'll see that _add :: D -> Result D Int, which is exactly our Rat newtype expects as its argument, we we can bundle them directly:

 
 
add, mult, primary, decimal :: Rat D Int
add     = Rat _add
mult    = Rat _mult
primary = Rat _primary
decimal = Rat _decimal

We can similarly juse use the character parse result.

 
instance Stream D where
  anyChar = Rat anyCharD

Now we just need to build a D from a String. I'm using view patterns and record wildcards to shrink the amount of repetitive naming.

 
parse :: String -> D
parse s = fix $ \d -> let
  Rat (dv d -> _add) =
        (+) < $> mult < * symbol '+' <*> add
     < /> mult
  Rat (dv d -> _mult) =
        (*) < $> primary < * symbol '*' <*> mult
    < /> primary
  Rat (dv d -> _primary) =
        symbol '(' *> add < * symbol ')'
     decimal
  Rat (dv d -> _decimal) =
     foldl' (\b a -> b * 10 + a) 0 < $> lexeme (some digit)
  anyCharD = case s of
    (x:xs) -> Commit (parse xs) x
    []     -> Fail "EOF" False
  in D { .. }
 
dv :: d -> (d -> b) -> b
dv d f = f d

Note that we didn't really bother factoring the grammar, since packrat will take care of memoizing the redundant calls!

And with that, we can define an evaluator.

 
eval :: String -> Int
eval s = case runRat (whiteSpace *> add < * eof) (parse s) of
  Pure a -> a
  Commit _ a -> a
  Fail s _ -> error s

Note that because the input stream D contains the result directly and parse is the only thing that ever generates a D, and it does so when we start up, it should be obvious that the parse results for each location can't depend on any additional information smuggled in via our monad.

Next time, we'll add a packratted Stream type directly to Parsec, which will necessitate some delicate handling of user state.

The small parser implemented here can be found on my github account, where it hasn't been adulterated with unnecessary spaces by my blog software.

P.S. To explain the quote, had I thought of it earlier, I could have named my parsing combinator library "Kessel Run" as by the time I'm done with it "it will contain at least 12 parsecs" between its different parser implementations.

Free Modules and Functional Linear Functionals

Edward Kmett — Mon, 11 Jul 2011 20:58:04 +0000

Today I hope to start a new series of posts exploring constructive abstract algebra in Haskell.

In particular, I want to talk about a novel encoding of linear functionals, polynomials and linear maps in Haskell, but first we're going to have to build up some common terminology.

Having obtained the blessing of Wolfgang Jeltsch, I replaced the algebra package on hackage with something... bigger, although still very much a work in progress.

(Infinite) Modules over Semirings

Recall that a vector space V over a field F is given by an additive Abelian group on V, and a scalar multiplication operator
(.*) :: F -> V -> V subject to distributivity laws

 
s .* (u + v) = s .* u + s .* v
(s + t) .* v = s .* v + t .* v

and associativity laws

 
   (s * t) .* v = s .* (t .* v)

and respect of the unit of the field.

 
   1 .* v = v

Since multiplication on a field is commutative, we can also add

 
  (*.) :: V -> F -> V
  v *. f = f .* v

with analogous rules.

But when F is only a Ring, we call the analogous structure a module, and in a ring, we can't rely on the commutativity of multiplication, so we may have to deal left-modules and right-modules, where only one of those products is available.

We can weaken the structure still further. If we lose the negation in our Ring we and go to a Rig (often called a Semiring), now our module is an additive moniod.

If we get rid of the additive and multiplicative unit on our Rig we get down to what some authors call a Ringoid, but which we'll call a Semiring here, because it makes the connection between semiring and semigroup clearer, and the -oid suffix is dangerously overloaded due to category theory.

First we'll define additive semigroups, because I'm going to need both additive and multiplicative monoids over the same types, and Data.Monoid has simultaneously too much and too little structure.

 
-- (a + b) + c = a + (b + c)
class Additive m where
  (+) :: m -> m -> m
  replicate1p :: Whole n => n -> m -> m -- (ignore this for now)
  -- ...

their Abelian cousins

 
-- a + b = b + a
class Additive m => Abelian m

and Multiplicative semigroups

 
-- (a * b) * c = a * (b * c)
class Multiplicative m where
  (*) :: m -> m -> m
  pow1p :: Whole n => m -> n -> m
  -- ...

Then we can define a semirings

 
-- a*(b + c) = a*b + a*c
-- (a + b)*c = a*c + b*c
class (Additive m, Abelian m, Multiplicative m) => Semiring

With that we can define modules over a semiring:

 
-- r .* (x + y) = r .* x + r .* y
-- (r + s) .* x = r .* x + s .* x
-- (r * s) .* x = r .* (s .* x)
class (Semiring r, Additive m) => LeftModule r m
   (.*) :: r -> m -> m

and analogously:

 
class (Semiring r, Additive m) => RightModule r m
   (*.) :: m -> r -> m

For instance every additive semigroup forms a semiring module over the positive natural numbers (1,2..) using replicate1p.

If we know that our addition forms a monoid, then we can form a module over the naturals as well

 
-- | zero + a = a = a + zero
class
    (LeftModule Natural m,
    RightModule Natural m
    ) => AdditiveMonoid m where
   zero :: m
   replicate :: Whole n => n -> m -> m

and if our addition forms a group, then we can form a module over the integers

 
-- | a + negate a = zero = negate a + a
class
    (LeftModule Integer m
    , RightModule Integer m
    ) => AdditiveGroup m where
  negate :: m -> m
  times :: Integral n => n -> m -> m
  -- ...

Free Modules over Semirings

A free module on a set E, is a module where the basis vectors are elements of E. Basically it is |E| copies of some (semi)ring.

In Haskell we can represent the free module of a ring directly by defining the action of the (semi)group pointwise.

 
instance Additive m => Additive (e -> m) where
   f + g = \x -> f x + g x
 
instance Abelian m => Abelian (e -> m)
 
instance AdditiveMonoid m => AdditiveMonoid (e -> m) where
   zero = const zero
 
instance AdditiveGroup m => AdditveGroup (e -> m) where
   f - g = \x -> f x - g x

We could define the following

 
instance Semiring r => LeftModule r (e -> m) where
   r .* f = \x -> r * f x

but then we'd have trouble dealing with the Natural and Integer constraints above, so instead we lift modules

 
instance LeftModule r m => LeftModule r (e -> m) where
   (.*) m f e = m .* f e
 
instance RightModule r m => RightModule r (e -> m) where
   (*.) f m e = f e *. m

We could go one step further and define multiplication pointwise, but while the direct product of |e| copies of a ring _does_ define a ring, and this ring is the one provided by the Conal Elliot's vector-space package, it isn't the most general ring we could construct. But we'll need to take a detour first.

Linear Functionals

A Linear functional f on a module M is a linear function from a M to its scalars R.

That is to say that, f : M -> R such that

 
f (a .* x + y) = a * f x + f y

Consequently linear functionals also form a module over R. We call this module the dual module M*.

Dan Piponi has blogged about these dual vectors (or covectors) in the context of trace diagrams.

If we limit our discussion to free modules, then M = E -> R, so a linear functional on M looks like (E -> R) -> R
subject to additional linearity constraints on the result arrow.

The main thing we're not allowed to do in our function is apply our function from E -> R to two different E's and then multiply the results together. Our pointwise definitions above satisfy those linearity constraints, but for example:

 
bad f = f 0 * f 0

does not.

We could capture this invariant in the type by saying that instead we want

 
newtype LinearM r e =
  LinearM {
    runLinearM :: forall r. LeftModule r m => (e -> m) -> m
  }

we'd have to make a new such type every time we subclassed Semiring. I'll leave further exploration of this more exotic type to another time. (Using some technically illegal module instances we can recover more structure that you'd expect.)

Now we can package up the type of covectors/linear functionals:

 
infixr 0 $*
newtype Linear r a = Linear { ($*) :: (a -> r) -> r }

The sufficiently observant may have already noticed that this type is the same as the Cont monad (subject to the linearity restriction on the result arrow).

In fact the Functor, Monad, Applicative instances for Cont all carry over, and preserve linearity.

(We lose callCC, but that is at least partially due to the fact that callCC has a less than ideal type signature.)

In addition we get a number of additional instances for Alternative, MonadPlus, by exploiting the knowledge that r is ring-like:

 
instance AdditiveMonoid r => Alternative (Linear r a) where
  Linear f < |> Linear g = Linear (f + g)
  empty = Linear zero

Note that the (+) and zero there are the ones defined on functions from our earlier free module construction!

Linear Maps

Since Linear r is a monad, Kleisli (Linear r) forms an Arrow:

 
b -> ((a -> r) ~> r)

where the ~> denotes the arrow that is constrained to be linear.

If we swap the order of the arguments so that

 
(a -> r) ~> (b -> r)

this arrow has a very nice meaning! (See Numeric.Map.Linear)

 
infixr 0 $#
newtype Map r b a = Map { ($#) :: (a -> r) -> (b -> r) }

Map r b a represents the type of linear maps from a -> b. Unfortunately due to contravariance the arguments wind up in the "wrong" order.

 
instance Category (Map r) where
  Map f . Map g = Map (g . f)
  id = Map id

So we can see that a linear map from a module A with basis a to a vector space with basis b effectively consists of |b| linear functionals on A.

Map r b a provides a lot of structure. It is a valid instance of an insanely large number of classes.

Vectors and Covectors

In physics, we sometimes call linear functionals covectors or covariant vectors, and if we're feeling particularly loquacious, we'll refer to vectors as contravariant vectors.

This has to do with the fact that when you change basis, you change map the change over covariant vectors covariantly, and map the change over vectors contravariantly. (This distinction is beautifully captured by Einstein's summation notation.)

We also have a notion of covariance and contravariance in computer science!

Functions vary covariantly in their result, and contravariant in their argument. E -> R is contravariant in E. But we chose this representation for our free modules, so the vectors in our free vector space (or module) are contravariant in E.

 
class Contravariant f where
  contramap :: (a -> b) -> f a -> f b
 
-- | Dual function arrows.
newtype Op a b = Op { getOp :: b -> a } 
 
instance Contravariant (Op a) where
  contramap f g = Op (getOp g . f)

On the other hand (E -> R) ~> R varies covariantly with the change of E.

as witnessed by the fact that it is a Functor.

 
instance Functor (Linear r) where
  fmap f m = Linear $ \k -> m $* k . f

We have lots of classes for manipulating covariant structures, and most of them apply to both (Linear r) and (Map r b).

Other Representations and Design Trade-offs

One common representation of vectors in a free vector space is as some kind of normalized list of scalars and basis vectors. In particular, David Amos's wonderful HaskellForMaths uses

 
newtype Vect r a = Vect { runVect :: [(r,a)] }

for free vector spaces, only considering them up to linearity, paying for normalization as it goes.

Given the insight above we can see that Vect isn't a representation of vectors in the free vector space, but instead represents the covectors of that space, quite simply because Vect r a varies covariantly with change of basis!

Now the price of using the Monad on Vect r is that the monad denormalizes the representation. In particular, you can have multiple copies of the same basis vector., so any function that uses Vect r a has to merge them together.

On the other hand with the directly encoded linear functionals we've described here, we've placed no obligations on the consumer of a linear functional. They can feed the directly encoded linear functional any vector they want!

In fact, it'll even be quite a bit more efficient to compute,

To see this, just consider:

 
instance MultiplicativeMonoid r => Monad (Vect r) where
   return a = Vect [(1,a)]
   Vect as >>= f = Vect
       [ (p*q, b) | (p,a) < - as, (q,b) <- runVect (f b) ]

Every >>= must pay for multiplication. Every return will multiply the element by one. On the other hand, the price of return and bind in Linear r is function application.

 
instance Monad (Linear r) where
  return a = Linear $ \k -> k a
  m >>= f = Linear $ \k -> m $* \a -> f a $* k

A Digression on Free Linear Functionals

To wax categorical for a moment, we can construct a forgetful functor U : Vect_F -> Set that takes a vector space over F to just its set of covectors.

 
F E = (E -> F, F,\f g x -> f x + g x ,\r f x -> r * f x)

using the pointwise constructions we built earlier.

Then in a classical setting, you can show that F is left adjoint to U.

In particular the witnesses of this adjunction provide the linear map from (E -> F) to V and the function E -> (V ~> F) giving a linear functional on V for each element of E.

In a classical setting you can go a lot farther, and show that all vector spaces (but not all modules) are free.

But in a constructive setting, such as Haskell, we need a fair bit to go back and forth, in particular we wind up need E to be finitely enumerable to go one way, and for it to have decidable equality to go in the other. The latter is fairly easy to see, because even going from E -> (E -> F) requires that we can define and partially apply something like Kronecker's delta:

 
delta :: (Rig r, Eq a) => e -> e -> r
delta i j | i == j = one
             | otherwise = zero

The Price of Power

The price we pay is that, given a Rig, we can go from Vect r a to Linear r a but going back requires a to be be finitely enumerable (or for our functional to satisfy other exotic side-conditions).

 
vectMap :: Rig r => Vect r a -> Linear r a
vectMap (Vect as) = Map $ \k -> sum [ r * k a | (r, a) < - as ]

You can still probe Linear r a for individual coefficients, or pass it a vector for polynomial evaluation very easily, but for instance determining a degree of a polynomial efficiently requires attaching more structure to your semiring, because the only value you can get out of Linear r a is an r.

Optimizing Linear Functionals

In both the Vect r and Linear r cases, excessive use of (>>=) without somehow normalizing or tabulating your data will cause a lot of repeated work.

This is perhaps easiest to see from the fact that Vect r never used the addition of r, so it distributed everything into a kind of disjunctive normal form. Linear r does the same thing.

If you look at the Kleisli arrows of Vect r or Linear r as linear mappings, then you can see that Kleisli composition is going to explode the number of terms.

So how can we collapse back down?

In the Kleisli (Vect r) case we usually build up a map as we walk through the list then spit the list back out in order having added up like terms.

In the Map r case, we can do better. My representable-tries package provides a readily instantiable HasTrie class, and the method:

 
memo :: HasTrie a => (a -> r) -> a -> r

which is responsible for providing a memoized version of the function from a -> r in a purely functional way. This is obviously a linear map!

 
memoMap :: HasTrie a => Map r a a
memoMap = Map memo

We can also flip memo around and memoize linear functionals.

 
memoLinear :: HasTrie a => a -> Linear r a
memoLinear = Linear . flip memo

Next time, (co)associative (co)algebras and the myriad means of multiplying (co)vectors!

Free Monads for Less (Part 3 of 3): Yielding IO

Edward Kmett — Fri, 24 Jun 2011 06:41:06 +0000

Last time, I said that I was going to put our cheap new free monad to work, so let's give it a shot.

Yield for Less

Last month at TPDC 2011, Roshan James and Amr Sabry presented Yield: Mainstream Delimited Continuations.

Without calling it such they worked with the free monad of the indexed store comonad. Ignoring the comonad, and just looking at the functor we can see that

 
data Store i o r = Store (i -> r) o
    deriving (Functor)

admits the operation

 
class Functor y => Yieldable y i o | y -> i o where
   yield :: o -> y i
 
instance Yieldable (Store i o) i o where
   yield = Store id

The free monad of Store i o is a nice model for asymmetric coroutines.

 
type Yield i o = Free (Store i o)
 
liftFree :: Functor f => f a -> Free f a
liftFree = Free . fmap Pure
 
instance Yieldable y i o => Yieldable (Free y) i o where
   yield = liftFree . yield

With its Monad, you can write computations like:

 
foo :: Num o => Yield () o ()
foo = do
   yield 1
   yield 2
   yield 3

or to streamline one of James and Sabry's examples

 
walk :: Traversable f => f o -> Yield i o (f i)
walk = traverse yield

is an asymmetric coroutine that yields each of the elements in a traversable container in turn, replacing them with the responses from whatever is driving the coroutine.

James and Sabry called this the naive frame grabbing implementation. It is inefficient for the same reasons that we discussed before about retraversing the common trunk in free monads in general. Note that the unchanging trunk here isn't the data structure that we're traversing, but instead the chain of Store i o actions we took to get to the current instruction.

James and Sabry then proceeded to optimize it by hitting it with Codensity.

 
type Iterator i o = Codensity (Yield i o)
 
instance (Monad y, Yieldable y i o) => Yieldable (Codensity y) i o where
   yield = liftCodensity . yield

But we've now seen that we can get away with something smaller and get the same benefits.

 
liftF :: Functor f => f a -> F f a
liftF f = F (\kp kf -> kf (fmap kp f))
 
instance Yieldable y i o => Yieldable (F y) i o where
   yield = liftF . yield

Flattened, and with the store untupled the new optimized representation looks like:

 
newtype Iterator i o a = Iterator
  { runIterator ::
    forall r. (a -> r) -> (o -> (i -> r) -> r) -> r)
  }

and provides the same performance improvements for asymmetric coroutines as the Codensity version, used by James and Sabry, which would flatten to the much larger and less satisfying:

 
newtype RSIterator i o a = RSIterator
    { runRSIterator :: forall r.
          (a -> (o -> (i -> r) -> r) -> r)
             -> (o -> (i -> r) -> r) -> r
    }

They proceed to give an encoding of delimited continuations into this type and vice versa, but the rest of their material is of no further use to us here.

As an aside the performance benefits of encoding Oleg's iteratees in continuation passing style arise for much the same reason. The resuting encoding is a right Kan extension!

Who Needs the RealWorld?

As Runar recently tweeted, we have put this to good use here at ClariFI. (Yes, we are hiring! If the contents of my blog make sense to you then email me and let's talk.)

At ClariFI have a strongly typed functional language that bears a strong resemblance to Haskell with rank-n types and a number of other interesting type system features that are particularly suited to our problem domain.

However, as with Haskell, we needed a story for how to deal with IO.

Now, GHC models IO with the type

 
newtype IO a =
   IO (State# RealWorld -> (# a, State# RealWorld #))

where they model IO by working in a strict state monad, passing around a real world that they promise not to mutate or copy. (In practice, the world is passed around as a 0-byte token.

This is somewhat problematic semantically, for a number of reasons.

First, There is always the risk of copying it or plumbing it through backwards, so we carefully hide the State# RealWorld from the end user. So this model really wants some notion of uniqueness or linear typing to render it perfectly safe. Heck, the entire Clean language arose from just trying to address this concern.

Second, you don't really get to pass the real world around! We have multiple cores working these days. Stuff is happening in the back end, and as much as you might want it to be, your program isn't responsible for everything that happens in the RealWorld!.

Third, if in some sense all bottoms are the same, then forever (putStrLn "Hello World") and undefined are the same in that sense, despite the slew of side-effects that arise from the first one. Now, in Haskell you are allowed to catch some bottoms in the IO monad, and thereby escape from certain doom, but it is still a reasonable objection.

One alternate model for talking about IO is to view it as a free monad of some set of operations. This approach was taken by Wouter Swierstra's Functional Pearl: Data Types a la Carte.

You can then supply some sort of external interpreter that pumps that tree structure, performing the individual actions.

This is unsatisfying because of two things:

First, the performance is abysmal using the common ADT encoding of a free monad. Janis Voigtländer of course showed, that this can be rectified by using the Codensity monad.

Second, the set of FFI operations is closed.

What we've done instead is to define our primitive IO actions externally as some FFI type:

 
type FFI o i -- external, side-effecting computation taking o, returning i

In practice, these are obtained by reflection by our foreign import statements since we run in the JVM.

Then we looked at the free monad of

 
newtype OI a = forall o i. OI (FFI o i) o (i -> a) deriving Functor

where OI is the indexed store comonad used as the building block above, yielding arguments to FFI of type o, and representing a computation that would resume with a value of type i to obtain a result of type a.

In some sense this yields a more useful notion than Richard Kieburtz's novel, but largely unimplementable, OI comonad from Codata and Comonads in Haskell.

Flattening Free OI would yield the naive

 
-- data FIO a where
--    Return :: a -> FIO a
--    FIO :: FFI o i -> o -> (i -> FIO a) -> FIO a

which would be interpreted by the runtime system.

But once we've converted to our Church-encoded Free monad and flattened we obtain:

 
newtype IO a = IO
    (forall r. (a -> r) ->
                 (forall i o. FFI o i -> o -> (i -> r) -> r) ->
                 r)

with the Functor and Monad instances defined above.

This then gives us a number of choices on how we implement the runtime system:

We can use the machinery described earlier to convert from IO a to Free OI a or FIO a, and then have the runtime system pattern match on that structure on our main method, taking the FFI actions and their arguments and passing the results in to the language, or we can invert control, and implement things more directly by just defining

 
FFI = (->)

while letting the FFI'd methods have side-effects, and then defining

 
unsafePerformIO :: IO a -> a
unsafePerformIO (IO m) = m id (\ oi o ir -> ir (oi o))

But regardless of how FFI is implemented, this model provides a clear structural difference between forever (putStrLn "Hello") and undefined and does not require us to believe the pleasant fiction that we can get our hands on the real world and pass it around.

Our actual IO representation is only slightly more complicated than the one presented here in order to deal with the plumbing of an extra continuation to deal with Java exceptions, but the substance of this approach isn't changed by this addition.

[Edit: incorporated a minor typographical fix into Iterator from Max Bolingbroke]
[Edit: fixed Store to be data, an liftM that should have been an fmap and added the missing Functor constraint that was present in my actual implementation but didn't make it to the web, and a couple of typos in the implementation of RSIterator, all noted by Clumsy.]

Free Monads for Less (Part 2 of 3): Yoneda

Edward Kmett — Fri, 24 Jun 2011 04:49:46 +0000

Last time, I started exploring whether or not Codensity was necessary to improve the asymptotic performance of free monads.

This time I'll show that the answer is no; we can get by with something smaller.

The Yoneda Lemma

Another form of right Kan extension arises from the Yoneda lemma.

I covered it briefly in my initial article on Kan extensions, but the inestimable Dan Piponi wrote a much nicer article on how it implies in Haskell that given a Functor instance on f, this type

 
newtype Yoneda f a = Yoneda (forall r. (a -> r) -> f r)

is isomorphic to f a, witnessed by these natural transformations:

 
liftYoneda :: Functor f => f a -> Yoneda f a
liftYoneda a = Yoneda (\f -> fmap f a)
 
lowerYoneda :: Yoneda f a -> f a
lowerYoneda (Yoneda f) = f id

That said, you are not limited to applying Yoneda to types that have Functor instances.

This type and these functions are provided by Data.Functor.Yoneda from the kan-extensions package.

Codensity vs. Yoneda

Note, Yoneda f is in some sense smaller than Codensity f, as Codensity f a is somewhat 'bigger' than f a, despite providing an embedding, while Yoneda f a is isomorphic.

For example, Codensity ((->) s) a is isomorphic to State s a, not to s -> a as shown by:

 
instance MonadState s (Codensity ((->) s)) where
   get = Codensity (\k s -> k s s)
   put s = Codensity (\k _ -> k () s)

Now, Codensity is a particular form of right Kan extension, which always yields a Monad, without needing anything from f.

Here we aren't so fortunate, but we do have the fact that Yoneda f is always a Functor, regardless of what f is, as shown by:

 
instance Functor (Yoneda f) where
  fmap f (Yoneda m) = Yoneda (\k -> m (k . f))

which was obtained just by cutting and pasting the appropriate definition from Codensity or ContT, and comes about because Yoneda is a right Kan extension, like all of those.

To get a Monad instance for Yoneda f we need to lean on f somehow.

One way is to just borrow a Monad instance from f, since f a is isomorphic to Yoneda f a, if we have a Functor for f, and if we have a Monad, we can definitely have a Functor.

 
instance Monad m => Monad (Yoneda m) where
  return a = Yoneda (\f -> return (f a))
  Yoneda m >>= k = Yoneda (\f -> m id >>= \a -> runYoneda (k a) f)

Map Fusion and Reassociating Binds

Unlike Codensity the monad instance above isn't very satisfying, because it uses the >>= of the underlying monad, and as a result the >>=s will wind up in the same order they started.

On the other hand, the Functor instance for Yoneda f is still pretty nice because the (a -> r) part of the type acts as an accumulating parameter fusing together uses of fmap.

This is apparent if you expand lowerYoneda . fmap f . fmap g . liftYoneda , whereupon you can see we only call fmap on the underlying Functor once.

Intuitively, you can view Yoneda as a type level construction that ensures that you get fmap fusion, while Codensity is a type level construction that ensures that you right associate binds. It is important to note that Codensity also effectively accumulates fmaps, as it uses the same definition for fmap as Yoneda!

With this in mind, it doesn't usually make much sense to use Codensity (Codensity m) or Yoneda (Yoneda m) because the purpose being served is redundant.

Less obviously, Codensity (Yoneda m) is also redundant, because as noted above, Codensity also does fmap accumulation.

Other Yoneda-transformed Monads

Now, I said one way to define a Monad for Yoneda f was to borrow an underlying Monad instance for f, but this isn't the only way.

Consider Yoneda Endo. Recall that Endo from Data.Monoid is given by

 
newtype Endo a = Endo { appEndo :: a -> a }

Clearly Endo is not a Monad, it can't even be a Functor, because a occurs in both positive and negative position.

Nevertheless Yoneda Endo can be made into a monad -- the continuation passing style version of the Maybe monad!

 
newtype YMaybe a = YMaybe (forall r. (a -> r) -> r -> r)

I leave the rather straightforward derivation of this Monad for the reader. A version of it is present in monad-ran.

This lack of care for capital-F Functoriality also holds for Codensity, Codensity Endo can be used as a two-continuation list monad. It is isomorphic to the non-transformer version of Oleg et al.'s LogicT, which is available on hackage as logict from my coworker, Dan Doel.

The Functor, Applicative, Monad, MonadPlus and many other instances for LogicT can be rederived in their full glory from Codensity (GEndo m) automatically, where

 
newtype GEndo m r = GEndo (m r -> m r)

without any need for conscious thought about how the continuations are plumbed through in the Monad.

Bananas in Space

One last digression,

 
newtype Rec f r = (f r -> r) -> r

came up once previously on this blog in Rotten Bananas. In that post, I talked about how Fegaras and Sheard used a free monad (somewhat obliquely) in "Revisiting catamorphisms over datatypes with embedded functions" to extend catamorphisms to deal with strong HOAS, and then talked further about how Stephanie Weirich and Geoffrey Washburn used Rec to replace the free monad used by Fegaras and Sheard. That said, they did so in a more restricted context, where any mapping was done by giving us both an embedding and a projection pair.

Going to Church

We can't just use Rec f a instead of Free f a here, because Free f a is a functor, while Rec f a is emphatically not.

However, if we apply Yoneda to Rec f, we obtain a Church-encoded continuation-passing-style version of Free!

 
newtype F f a = F { runF :: forall r. (a -> r) -> (f r -> r) -> r }

Since this is of the form of Yoneda (Rec f), it is clearly a Functor:

 
instance Functor (F f) where
   fmap f (F g) = F (\kp -> g (kp . f))

And nicely, without knowing anything about f, we also get a Monad!

 
instance Monad (F f) where
   return a = F (\kp _ -> kp a)
   F m >>= f = F (\kp kf -> m (\a -> runF (f a) kp kf) kf)

But when we >>= all we do is change the continuation for (a -> r), leaving the f-algebra, (f r -> r), untouched.

Now, F is a monad transformer:

 
instance MonadTrans F where
   lift f = F (\kp kf -> kf (liftM kp f))

which is unsurprisingly, effectively performing the same operation as lifting did in Free.

Heretofore, we've ignored everything about f entirely.

This has pushed the need for the Functor on f into the wrapping operation:

 
instance Functor f => MonadFree f (F f) where
   wrap f = F (\kp kf -> kf (fmap (\ (F m) -> m kp kf) f))

Now, we can clearly transform from our representation to any other free monad representation:

 
fromF :: MonadFree f m => F f a -> m a
fromF (F m) = m return wrap

or to it from our original canonical ADT-based free monad representation:

 
toF :: Functor f => Free f a -> F f a
toF xs = F (\kp kf -> go kp kf xs) where
  go kp _  (Pure a) = kp a
  go kp kf (Free fma) = kf (fmap (go kp kf) fma)

So, F f a is isomorphic to Free f a.

So, looking at Codensity (F f) a as Codensity (Yoneda (Rec f)), it just seems silly.

As we mentioned before, we should be able to go from Codensity (Yoneda (Rec f)) a to Codensity (Rec f) a, since Yoneda was just fusing uses of fmap, while Codensity was fusing fmap while right-associating (>>=)'s.

Swallowing the Bigger Fish

So, the obvious choice is to try to optimize to Codensity (Rec f) a. If you go through the motions of encoding that you get:

 
newtype CF f a = CF (forall r. (a -> (f r -> r) -> r) -> (f r -> r) -> r)

which is in some sense larger than F f a, because the first continuation gets both an a and an f-algebra (f r -> r).

But tellingly, once you write the code, the first continuation never uses the extra f-algebra you supplied it!

So Codensity (Yoneda (Rec f)) a gives us nothing of interest that we don't already have in Yoneda (Rec f) a.

Consequently, in this special case rather than letting Codensity (Yoneda x) a swallow the Yoneda to get Codensity x a we can actually let the Yoneda swallow the surrounding Codensity obtaining Yoneda (Rec f) a, the representation we started with.

Scott Free

Finally, you might ask if a Church encoding is as simple as we could go. After all a Scott encoding

 
newtype ScottFree f a = ScottFree
    { runScottFree :: forall r.
       (a -> r) -> (f (ScottFree f a) -> r) -> r
    }

would admit easier pattern matching, and a nice pun, and seems somewhat conceptually simpler, while remaining isomorphic.

But the Monad instance:

 
instance Functor f => Monad (ScottFree f) where
   return a = ScottFree (\kp _ -> kp a)
   ScottFree m >>= f = ScottFree
       (\kb kf -> m (\a -> runScottFree (f a) kb kf) (kf . fmap (>>= f)))

needs to rely on the underlying bind, and you can show that it won't do the right thing with regards to reassociating.

So, alas, we cannot get away with ScottFree.

Nobody Sells for Less

So, now we can rebuild Voigtländer's improve using our Church-encoded / Yoneda-based free monad F, which is precisely isomorphic to Free, by using

 
lowerF :: F f a -> Free f a
lowerF (F f) = f Pure Free

to obtain

 
improve :: (forall a. MonadFree f m => m a) -> Free f a
improve m = lowerF m

And since our Church-encoded free monad is isomorphic to the simple ADT encoding, our new solution is as small as it can get.

Next time, we'll see this construction in action!

Free Monads for Less (Part 1 of 3): Codensity

Edward Kmett — Fri, 24 Jun 2011 03:59:58 +0000

A couple of years back Janis Voigtländer wrote a nice paper on how one can use the codensity monad to improve the asymptotic complexity of algorithms using the free monads. He didn't use the name Codensity in the paper, but this is essentially the meaning of his type C.

I just returned from running a workshop on domain-specific languages at McMaster University with the more than able assistance of Wren Ng Thornton. Among the many topics covered, I spent a lot of time talking about how to use free monads to build up term languages for various DSLs with simple evaluators, and then made them efficient by using Codensity.

This has been shown to be a sufficient tool for this task, but is it necessary?

First, some context:

Monads for Free

Given that f is a Functor, we get that

 
data Free f a = Pure a | Free (f (Free f a))

is a Monad for free:

 
instance Functor f => Functor (Free f) where
   fmap f (Pure a) = Pure (f a)
   fmap f (Free as) = Free (fmap (fmap f) as)
 
instance Functor f => Monad (Free f) where
   return = Pure
   Pure a >>= f = f a -- the first monad law!
   Free as >>= f = Free (fmap (>>= f) as)

The definition is also free in a particular categorical sense, that if f is a monad, then, and you have a forgetful functor that forgets that it is a monad and just yields the functor, then the the free construction above is left adjoint to it.

This type and much of the code below is actually provided by Control.Monad.Trans.Free in the comonad-transformers package on hackage.

For a while, Free lived in a separate, now defunct, package named free with its dual Cofree, but it was merged into comonad-transformers due to complications involving comonads-fd, the comonadic equivalent of the mtl. Arguably, a better home would be transformers, to keep symmetry.

Free is a Monad Transformer

 
instance MonadTrans Free where
    lift = Free . liftM Pure

and there exists a retraction for lift

 
retract :: Monad f => Free f a -> f a
retract (Pure a) = return a
retract (Free as) = as >>= retract

such that retract . lift = id. I've spoken about this on Stack Overflow, including the rather trivial proof, previously.

This lets us work in Free m a, then flatten back down to a single layer of m.

This digression will be useful in a subsequent post.

MonadFree

What Janis encapsulated in his paper is the notion that we can abstract out the extra power granted by a free monad to add layers of f to some monad m, and then use a better representation to improve the asymptotic performance of the monad.

The names below have been changed slightly from his presentation.

 
class (Monad m, Functor f) => MonadFree f m | m -> f where
    wrap :: f (m a) -> m a
 
instance Functor f => MonadFree f (Free f) where
    wrap = Free

instances can easily be supplied to lift MonadFree over the common monad transformers. For instance:

 
instance (Functor f, MonadFree f m) => MonadFree f (ReaderT e m) where
    wrap fs = ReaderT $ \e -> wrap $ fmap (`runReaderT` e) fs

This functionality is provided by Control.Monad.Free.Class.

Janis then proceeded to define the aforementioned type C, which is effectively identical to

 
newtype Codensity f a = Codensity (forall r. (a -> f r) -> f r)

This type is supplied by Control.Monad.Codensity from my kan-extensions package on hackage.

I have spoken about this type (and another that will arise in a subsequent post) on this blog previously, in a series of posts on Kan Extensions. [ 1, 2, 3]

Codensity f is a Monad, regardless of what f is!

In fact, you can quite literally cut and paste much of the definitions for return, fmap, and (>>=) from the code for the ContT monad transformer! Compare

 
instance Functor (Codensity k) where
  fmap f (Codensity m) = Codensity (\k -> m (k . f))
 
instance Monad (Codensity f) where
  return x = Codensity (\k -> k x)
  m >>= k = Codensity (\c -> runCodensity m (\a -> runCodensity (k a) c))
 
instance MonadTrans Codensity where
   lift m = Codensity (m >>=)

from Control.Monad.Codensity in kan-extensions with

 
instance Functor (ContT r m) where
    fmap f m = ContT $ \c -> runContT m (c . f)
 
instance Monad (ContT r m) where
    return a = ContT ($ a)
    m >>= k  = ContT $ \c -> runContT m (\a -> runContT (k a) c)
 
instance MonadTrans (ContT r) where
    lift m = ContT (m >>=)

from the Control.Monad.Trans.Cont in transformers.

Codensity m a is effectively forall r. ContT r m a. This turns out to be just enough of a restriction to rule out the use of callCC, while leaving the very powerful fact that when you lower them back down using

 
lowerCodensity :: Monad m => Codensity m a -> m a
lowerCodensity (Codensity m) = m return

 
runContT :: ContT r m a -> m r
runContT (ContT m) = m return

ContT and Codensity both yield a result in which all of the uses of the underlying monad's (>>=) are right associated.

This can be convenient for two reasons:

First, some almost-monads are not associative, and converting to ContT or Codensity can be used to fix this fact.

Second, in many monads, when you build a big structure using left associated binds, like:

 
    (f >>= g) >>= h

rather than use right associated binds like

 
   f >>= \x -> g x >>= h

then you wind up building a structure, then tearing it down and building up a whole new structure. This can compromise the productivity of the result, and it can also affect the asymptotic performance of your code.

Even though the monad laws say these two yield the same answer.

The dual of substitution is redecoration

To see that, first, it is worth noting that about ten years back, Tarmo Uustalu and Varmo Vene wrote "The dual of substitition is redecoration", which among other things quite eloquently described how monads are effectively about substituting new tree-like structures, and then renormalizing.

This can be seen in terms of the more categorical viewpoint, where we define a monad in terms of return, fmap and join, rather than return and (>>=). In that presentation:

 
m >>= f = join (fmap f m)

fmap is performing substitution. and join is dealing with any renormalization.

Done this way, (m >>= f) on the Maybe monad would first fmap to obtain Just (Just a), Just Nothing or Nothing before flattening.

In the Maybe a case, the association of your binds is largely immaterial, the normalization pass fixes things up to basically the same size, but in the special case of a free monad the monad is purely defined in terms of substitution, since:

 
-- join :: Functor f => Free f (Free f a) -> Free f a
-- join (Pure a) = a
-- join (Free as) = Free (fmap join as)

This means that every time you >>= a free monad you are accumulating structure -- structure that you have traverse past to deal with subsequent left-associated invocations of >>=! Free monads never shrink after a bind and the main body of the tree never changes.

More concretely, you could build a binary tree with

 
-- data Tree a = Tip a | Bin (Tree a) (Tree a)

and make a monad out of it, writing out your return and (>>=), etc. by hand

The same monad could be had 'for free' by taking the free monad of

 
data Bin a = Bin a a
    deriving (Functor, Foldable, Traversable)
    -- using LANGUAGE DeriveFunctor, DeriveFoldable, DeriveTraversable

yielding the admittedly slightly less convenient type signature

 
type Tree = Free Bin

Now you can use return for Tip, and

 
bin :: MonadFree Bin m => m a -> m a -> m a
bin l r = wrap (Bin l r)

to construct a binary tree node, using any free monad representation.

Now, if that representation is Free Bin (or the original more direct Tree type above) then code that looks like f >>= \x -> g x >>= h performs fine, but (f >>= g) >>= h will retraverse the unchanging 'trunk' of the tree structure twice. That isn't so bad, but given n uses of >>= we'll traverse an ever-growing trunk over and over n times!

Putting Codensity to Work

But, we have a tool that can fix this, Codensity!

 
instance MonadFree f m => MonadFree f (Codensity m) where
  wrap t = Codensity (\h -> wrap (fmap (\p -> runCodensity p h) t))

Janis packaged up the use of Codensity into a nice combinator that you can sprinkle through your code, so that your users never need know it exists. Moreover, it prevents them from accidentally using any of the extra power of the intermediate representation. If your code typechecks before you use improve somewhere within it, and it type checks after, then it will yield the same answer.

 
improve :: Functor f => (forall m. MonadFree f m => m a) -> Free f a
improve m = lowerCodensity m

By now, it should be clear that the power of Codensity is sufficient to the task, but is it necessary?

More Soon.

[Edit; Fixed minor typographical errors pointed out by ShinNoNoir, ivanm, and Josef Svenningsson, including a whole bunch of them found by Noah Easterly]

Introducing Speculation

Edward Kmett — Thu, 22 Jul 2010 19:53:03 +0000

A couple of days ago, I gave a talk at Boston Haskell about a shiny new speculative evaluation library, speculation on hackage, that I have implemented in Haskell. The implementation is based on the material presented as "Safe Programmable Speculative Parallelism" by Prakash Prabhu, G Ramalingam, and Kapil Vaswani at last month's PLDI.

I've uploaded a copy of my slides here:

* Introducing Speculation [PowerPoint | PDF]

This package provides speculative function application and speculative folds. Speculative STM transactions take the place of the transactional rollback machinery from the paper, but transactions are not always required in pure code. To get a feel for the shape of the library, here is an excerpt from the documentation for one of the combinators:

spec :: Eq a => a -> (a -> b) -> a -> b

spec g f a evaluates f g while forcing a, if g == a then f g is returned, otherwise f a is evaluated and returned. Furthermore, if the argument has already been evaluated, we skip the f g computation entirely. If a good guess at the value of a is available, this is one way to induce parallelism in an otherwise sequential task. However, if the guess isn't available more cheaply than the actual answer, then this saves no work and if the guess is wrong, you risk evaluating the function twice. Under high load, since f g is computed via the spark queue, the speculation will be skipped and you will obtain the same answer as f $! a.

ASCII art time-lines of how this can speed up evaluation are available in both the slides and the documentation linked to above, but assuming an otherwise serial problem, you effectively wager otherwise idle CPU time and the time to generate your guess on the quality of your guess.

Note that numSparks# feature request that was mentioned in the slides has already been implemented in GHC HEAD, and support shall be added to improve the performance of the speculative STM transactions under high load as mentioned in the slides.