December 30, 2024

Overloading the lambda abstraction in Haskell

About two years ago, I started working on a little embedded
domain-specific language (EDSL) called achille, using Haskell. Because this EDSL
has its own notion of morphisms Recipe m a b from
a to b, I was looking for a way to let users
write such morphisms using regular Haskell functions and the syntax for
lambda abstraction, \x -> ....

As recently as 5 days ago, I was still convinced that there was no
way to do such a thing without requiring a lot of engineering effort,
either through GHC compiler plugins or with Template Haskell — two
things I’d rather stay far away from.

Indeed,

  • Haskell’s
    Arrow abstraction
    and the related proc
    notation
    are supposed to enable just that, but they don’t work as
    intended. It’s a difficult thing to implement
    properly
    and if
    proposals to fix it
    have been discussed at large, nothing made it
    upstream.

    But even if the notation worked properly (and was convenient to use),
    it is quite common to work with morphisms k a b that cannot
    possibly embed all Haskell functions a -> b,
    preventing us to use this notation as we would like, simply because we
    cannot define a suitable instance of Arrow k.

    There are solutions to this, like the great overloaded
    package, that fixes the proc notation and crucially makes
    it available to many categories that are not Arrows. This
    may very well be exactly what you — the reader — need to resolve this
    problem.

    However, I want to avoid making my library rely on compiler plugins
    at all cost, and think the proc notation, while way better
    than raw Arrow combinators, is still too verbose and
    restrictive. I want to write lambdas, and I want to apply
    my morphisms just like any other Haskell function
    . So no
    luck.

  • There is also this wonderful paper from Conal Elliott: “Compiling to
    Categories”
    .
    In it, he recalls that any function in the simply typed lambda calculus
    corresponds to an arrow in any cartesian closed category (CCC). What he
    demonstrated with the concat GHC
    plugin is that this well-known result can be used to lift any
    monomorphic Haskell function into a Haskell arrow of any user-defined
    Haskell CCC.

    This is very cool, but also an experimental research project. It’s
    very unstable and unreliable. It’s not packaged on Hackage so fairly
    difficult to install as a library, and looks hard to maintain precisely
    because it’s a compiler plugin.

    And again, not every category is cartesian-closed, especially if
    exponentials have to be plain Haskell functions — which concat
    enforces.

However, 5 days ago I stumbled upon some paper by
sheer luck. And well, I’m happy to report that “overloading” the
lambda abstraction is completely doable
. Not only
that: it is actually very easy and doesn’t require any advanced
Haskell feature, nor any kind of metaprogramming. No. library.
needed.

I cannot emphasize enough how powerful the approach appears to
be:

  • You can define your very own proc notation for
    any category you desire.

  • That is, as soon as you can provide an instance for
    Category k, you can already overload the lambda
    abstraction to write your morphisms. Even if your category isn’t an
    instance of Arrow because of this arr
    function. You do not have to be able to embed pure Haskell functions
    in your category
    .

  • The resulting syntax is way more intuitive than the
    proc notation, simply because you can manipulate morphims
    k a b of your category as plain old Haskell
    functions
    , and therefore compose them and apply them to variables
    just as any other Haskell function, using the same syntax.

  • Implementing the primitives for overloading the lambda
    abstraction is a matter of about 10 lines of Haskell.

It’s in fact so simple that I suspect it must already be documented
somewhere, but for the life of me I couldn’t find anything. So
here you go. I think this will be very useful for EDSL designers out
here.

A toy DSL for flow diagrams

So let’s say our EDSL is supposed to encode flow diagrams, with boxes
and wires. Boxes have distinguished inputs and outputs, and wires flow
from outputs of boxes to inputs of other boxes.

We can encode any flow diagram using the following
operations:

Evaluating Linear Functions to
Symmetric Monoidal Categories
(SMCs).

In this paper, they explain that if CCCs are models of the simply
typed lambda calculus, it is “well-known” that SMCs are models of the
linear simply-typed lambda calculus. And thus, they show how
they are able to evaluate linear functions (as in, Linear
Haskell linear functions) into arrows in any target SMC of your
choice.

They even released a library for that: linear-smc.
I implore you to go and take a look at both the paper and the library,
it’s very very smart. Sadly, it seems to have gone mostly
unnoticed.

This paper was the tipping point. Because my target category
is cartesian (ergo, I can duplicate values), I suspected that I could
remove almost all the complex machinery they had to employ to go from a
free cartesian category over a SMC to the underlying SMC. I was hopeful
that I could ditch Linear Haskell, too.

And, relieved as I am, I can tell you that yes: not only can all of
this be simplified (if you don’t care about SMCs or Linear Haskell), but
everything can be implemented in a handful lines of Haskell code.

Interface

So, the first thing we’re gonna look at is the interface
exposed to the users of your EDSL. These are the magic primitives that
we will have to implement.

First there is this (abstract) type Port r a. It is
meant to represent the output of a box (in our flow
diagrams) carrying information of type a.

Because the definition of Port r a is not
exported, there is crucially no way for the user to retrieve a
value of type a from it. Therefore, to use a “port
variable” in any meaningful way, they can only use the
operations on ports that you — the library designer —
export.

And now, the two other necessary primitives:


encode and decode can be defined for any
Category k. But if k is also
cartesian, we can provide the following additional primitives:

achille. You could imagine changing the
Embed constructor of Flow to the
following:

please tell me!

Expressive power

I’m very curious to see which kind of categories you can “compile”
Haskell functions to using this technique. I didn’t expect converting to
cartesian category’s morphisms to be so straightforward, and I’m looking
forward to see whether there is any fundamental limitation preventing us
to do the same for CCCs.

Recursive morphisms
and infinite structures

Someone
asked on Reddit
whether recursive morphisms can be written out using
this syntax, wihtout making decode loop forever.
I’ve thought about it for a bit, and I think precisely because this is
not CPS and we construct morphisms k a b rather
than Haskell functions, decode should not be a problem. It
should produce recursive morphisms where recursive occurences
are guarded by constructors. Because of laziness, looping is not a
concern.

But I haven’t tried this out yet. Since in achille recursive recipes can be
quite useful, I want to support them, and so I will definitely
investigate further.


Thank you for reading! I hope this was or will be useful to some of
you. I am also very grateful for Bernardy and Spiwack’s fascinating
paper and library, it quite literally made my week.

Feel free to go on the
r/haskell thread
related to this post.

Till next time, perhaps when I manage to release the next version of
achille, using this very trick that I
had been desperately looking for in the past 2 years.

2024-12-28 13:18:34

Leave a Reply

Your email address will not be published. Required fields are marked *