Munted monad

Time to repeat an annual ritual.

  1. Begin work on an audio program
  2. Get frustrated parsing input with C
  3. Bring in Scheme
  4. Get annoyed by eager evaluation and runtime exceptions
  5. Introduce Haskell concepts
  6. Can't fit them on one line
  7. Turn everything into combinators with APL symbols
  8. Never finish audio program

This time around I am writing a MIDI control shell for an Oberheim Matrix 1000 after fitting it with custom firmware. Last year I attempted C interop with Chicken Scheme for realtime audio, but ultimately found the approach too slow. There's very little performance overhead for MIDI though, so I'm taking another swing at it.

Because I'm coming at you from step 7, I won't spill any more ink on the C FFI. These notes concern themselves with approximating monadic control flow over MIDI command parsing, in a uniquely Schemey way.

The m word

It's worth remembering that monads are structures observed in nature, not language features. There may be a higher kinded type Monad that allows some languages to provide a generic interface over this structure, but you can still run them in dynamic langs. It's all about behavior; without any particular rigor, a monad is anything that implements the signature

(a → F b) → F a → F b

where F is a structure like a List, an Option, and Either, etc. Most dumb languages only have the first of these, but the humble flatMap function is indeed this monadic binding operation.

Since Scheme does not have this generic type, I defined everything exclusively for Either. That's usually all you actually need.

Why

I really hate exceptions. They were a mistake; simple as. This is a positive entry so I'll leave it there. Since MIDI is cheap, I think I can afford to wank a bit here.

The code

Most of the imports are for FFI reasons, but I left them.

(import (chicken bitwise) (chicken condition) (chicken file posix) 
        (chicken foreign) (chicken io) (chicken string) (srfi 4))

(define-syntax λ (syntax-rules () ((_ . ω) (lambda . ω))))
(define-syntax ? (syntax-rules () ((_ . ω) (if . ω))))
(define-syntax ← (syntax-rules () ((_ . ω) (define . ω))))
(define-syntax ∃ (syntax-rules () ((_ . ω) (let* . ω))))
(define-syntax either
  (syntax-rules ()
    ((_ f ...)
     (∃ ((ω (condition-case ((λ () f ...))
              (e (exn) (left (get-condition-property e 'exn 'message))))))
       (? (left? ω) ω (right ω))))))
(define-syntax for
  (syntax-rules (←)
    ((_ (← α β) ω Ω ...) (>>= (λ (α) (for ω Ω ...)) β))
    ((_ (  α β) ω Ω ...) (>>= (λ (α) (for ω Ω ...)) (right β)))
    ((_               ω) (right ω))))

(← NRPN-CMDS '((p1 0   0 63 64)
               (l1 1 -63 63 64)
               (y  2   0  3 64)
               (q1 3   0 63 64)))

(← ↑ car) (← ↓ cdr) (← ↑↓ cadr) (← ∘ compose) (← ≡ equal?) (← ∅ '()) 
(← ∅? null?) (← ρ length) (← ◇ conc) (← ⊂ cons) (← ∀ map) (← $ apply)
(← (∧ ω α) (and ω α))
(← (∨ ω α) (or ω α))
(← (I ω) ω)
(← (K ω) (λ (α) ω))
(← (C f) (λ (ω) (λ (α) (f ω α))))
(← (S g f) (λ (ω) (g ω (f ω))))
(← (J h g f) (λ (ω) (h (g ω) (f ω))))
(← (right ω) `(R ,ω))
(← (left ω) `(L ,ω))
(← (right? ω) (? (list? ω) (? (≡ 2 (ρ ω)) (≡ 'R (↑ ω)) #f) #f))
(← (left? ω) (? (list? ω) (? (≡ 2 (ρ ω)) (≡ 'L (↑ ω)) #f) #f))
(← (either? ω) (? ((J ∨ left? right?) ω) ω (left (◇ "not an either: " ω))))
(← (get Fω) (? (right? Fω) (↑↓ Fω) (error (↑↓ Fω))))
(← (⊙ f Fω) (∃ ((α (either? Fω))) (? (right? α) (right (f (↑↓ α))) α)))
(← (_⊙ f Fω) (∃ ((α (either? Fω))) (? (left? α) (left (f (↑↓ α))) α)))
(← (⊙⊙ f g Fω) (⊙ f (_⊙ g Fω)))
(← (⊥ Fω) (either (get (get Fω))))
(← (>>= f Fω) (⊥ (⊙ f Fω)))
(← (● Ff Fω) (>>= (λ (ω) (⊙ (λ (f) (f ω)) Ff)) Fω))
(← (*> Fω Fα) (>>= (K Fα) Fω))
(← ($> Fω α) (*> Fω (right α)))
(← (lift2 f Fα Fω) (● (⊙ (C f) Fα) Fω))
(← (sequence Fω) (foldr (λ (x acc) (lift2 ⊂ x acc)) (right ∅) Fω))
(← (break △ Fω) (? (left? Fω) (△ Fω) Fω))
(← (traverse f Fω) (call/cc (λ (△) (sequence (∀ (∘ ((C break) △) f) Fω)))))
(← (ι n Fω) (either (list-ref Fω n)))
(← (ensure p e ω) (? p (right ω) (left e)))
(← (s⊥ f e ω) (>>= (λ (α) (ensure α (◇ e ": " ω) α)) (either (f ω))))
(← (s⊥n ω) (s⊥ string->number "not number" ω))
(← (s⊥x ω) (s⊥ string->symbol "not symbol" ω))
(← (∈ ω k) (∃ ((α (assoc k ω))) (⊙ ↓ (ensure α (◇ "cmd not found: " k) α))))
(← ∈-nrpn ((C ∈) NRPN-CMDS))
(← (nrpn ω) (for (← _ (ensure (≡ 2 (ρ ω)) ($ ◇ `("invalid arg count: " ,ω)) ∅))
                 (← cmd-str (ι 0 ω))
                 (← val-str (ι 1 ω))
                 (← cmd-sym (s⊥x cmd-str))
                 (← val (s⊥n val-str))
                 (← cmd (∈-nrpn cmd-sym))
                 (← cmd-byte (ι 0 cmd))
                 (← min (ι 1 cmd))
                 (← max (ι 2 cmd))
                 (← offset (ι 2 cmd))
                 (← _ (ensure (>= val min) (◇ val " < " min) ∅)) 
                 (← _ (ensure (<= val max) (◇ val " > " max) ∅))
                 `(,cmd-byte ,(+ offset val))))

Some of that exists only because I was having too much fun. I'll focus on the important bits.

Syntax improvements

It's slightly different every time, but I often start with something like this.

(define-syntax λ (syntax-rules () ((_ . ω) (lambda . ω))))
(define-syntax ? (syntax-rules () ((_ . ω) (if . ω))))
(define-syntax ← (syntax-rules () ((_ . ω) (define . ω))))
(define-syntax ∃ (syntax-rules () ((_ . ω) (let* . ω))))
(← ↑ car) (← ↓ cdr) (← ↑↓ cadr) (← ∘ compose) (← ≡ equal?) (← ∅ '()) 
(← ∅? null?) (← ρ length) (← ◇ conc) (← ⊂ cons) (← ∀ map) (← $ apply)
(← (∧ ω α) (and ω α))
(← (∨ ω α) (or ω α))

And some combinators.

(← (I ω) ω)
(← (K ω) (λ (α) ω))
(← (C f) (λ (ω) (λ (α) (f ω α))))
(← (S g f) (λ (ω) (g ω (f ω))))
(← (J h g f) (λ (ω) (h (g ω) (f ω))))

Don't worry. I didn't go crazy with point free stuff this time.

Either

Either encodes a binary possibility—usually success or failure—as Right or Left, with Right generally thought of as the "good state" and Left the "error". Chaining together Eithers is biased towards Right, where one Right action can be composed against another Right, but any Left will "short circuit" the computation with the failure reason. As such Either is good for all-or-nothing operations like parsing, where you don't want to continue after encountering garbage input.¹

Since this is Scheme, I've just implemented Right and Left as lists prefixed with R and L.

(← (right ω) `(R ,ω))
(← (left ω) `(L ,ω))

Since this is Scheme, I'm also forced to do a bit of runtime validation. I don't have types to fall back on.

(← (right? ω) (? (list? ω) (? (≡ 2 (ρ ω)) (≡ 'R (↑ ω)) #f) #f))
(← (left? ω) (? (list? ω) (? (≡ 2 (ρ ω)) (≡ 'L (↑ ω)) #f) #f))
(← (either? ω) (? ((J ∨ left? right?) ω) ω (left (◇ "not an either: " ω))))

The first two return booleans, but (either? ω) will check if ω is indeed an Either, then return Left if it's not. Running this once is enough to ensure you're always working in this context.

This is fine for pure data, but I've declared war on exceptions. My real weapon is the (either) macro.

(define-syntax either
  (syntax-rules ()
    ((_ f ...)
     (∃ ((ω (condition-case ((λ () f ...))
              (e (exn) (left (get-condition-property e 'exn 'message))))))
       (? (left? ω) ω (right ω))))))

The pattern matching syntax is odd to me, but this catches any exceptions into a Left instance with the error string inside. Successful computations will be Right.

(either (+ 1 1))
  (R 2)

(either (+ 1 "a"))
  (L "bad argument type - not a number")

Now errors are values like anything else. I'd be satisfied even if I stopped here, but I didn't.

Composing contexts

You rarely want to unwrap an Either; it almost always makes sense to map more functions into it, and have the outer Right or Left represent the health of your computation. You can map a function f into context 's Right or Left states with (⊙) and (_⊙) respectively. (⊙⊙) lets you cover both cases at once.

(← (⊙ f Fω) (∃ ((α (either? Fω))) (? (right? α) (right (f (↑↓ α))) α)))
(← (_⊙ f Fω) (∃ ((α (either? Fω))) (? (left? α) (left (f (↑↓ α))) α)))
(← (⊙⊙ f g Fω) (⊙ f (_⊙ g Fω)))

(⊙ (λ (ω) (+ ω 1)) (right 1))
  (R 2)

(_⊙ (K "generic err") (left "overly long stack trace here"))
  (L "generic err")

If you do want to unwrap the Either for whatever reason, then you'll get a naked value back from a Right or an actual error from a Left. This is obviously unsafe, but it has its uses.

(← (get Fω) (? (right? Fω) (↑↓ Fω) (error (↑↓ Fω))))

What happens if you map an f into an Either, but f itself returns Either?

(⊙ s⊥n (right "123"))
  (R (R 123))

This isn't very useful. The structure should remain flat, or else the number of mapping operations will continue to grow.

(← (⊥ Fω) (either (get (get Fω))))

(⊥ (⊙ s⊥n (right "123")))
  (R 123)

(⊥ (right (left "This is nonsensical state")))
  (L "This is nonsensical state")

Wrapping the hitherto dangerous (get) into another (either) catches inner and outer errors and crams them into one Left.

Mapping then flattening is of course flatMap: the fundamental monadic sequencing operation. So it's usually one function.

(← (>>= f Fω) (⊥ (⊙ f Fω)))

This is what I was after: the ability to chain multiple fallible procedures without stopping to catch everything—just trucking along optimistically.

(>>= (λ (ω) (ensure (even? ω) "not even" ω)) (s⊥n "a"))
  (L "not number: a")

(>>= (λ (ω) (ensure (even? ω) "not even" ω)) (s⊥n "123"))
  (L "not even")

(>>= (λ (ω) (ensure (even? ω) "not even" ω)) (s⊥n "122"))
  (R 122)

for comprehensions

Most of my parsing would just be a chain of (⊙) and (>>=) and that'd be well and good, but even I'll admit that legibility can suffer, especially with the extra noise of all these parentheses. There's also a more serious limitaton; naïve binding assumes that you'll only need each stage of computation once—you can't reference it again elsewhere. That's why Haskell and Scala provide syntax around (>>=) called do or for.

for α ← f
    β ← g α
    γ ← h α β
    γ + γ

This all compiles to (⊙) and (>>=) under the hood, but notice how a pseudo-variable α declared from a monadic operation f can be referenced later in γ; it has transcended its location in the chain. That's because this is one massive closure.

Think to the definition of (let) in Lisp. Under the hood it's just a macro that compiles

(let ((α 1) (ω 2)) (+ α ω))

to

((λ (α ω) (+ α ω)) 1 2)

and its recursive cousin (let*) goes from

(let ((α 1) (ω (+ α 1))) (+ α ω))

to

((λ (α) ((λ (ω) (+ α ω)) (+ α 1))) 1)

So α remains accessible to inner functions. (for) is almost identically defined, but with an extra (>>=).

(define-syntax for
  (syntax-rules (←)
    ((_ (← α β) ω Ω ...) (>>= (λ (α) (for ω Ω ...)) β))
    ((_ (  α β) ω Ω ...) (>>= (λ (α) (for ω Ω ...)) (right β)))
    ((_               ω) (right ω))))

Using a translates the action to a full monadic bind, while naked actions are simply mapped into the existing Either. Like anything else, this will short circuit on Left.

I haven't explained everything, but check out how easy this makes turning a MIDI NRPN command (cmd val) into its proper bytes.

(← (nrpn ω) (for (← _ (ensure (≡ 2 (ρ ω)) ($ ◇ `("invalid arg count: " ,ω)) ∅))
                 (← cmd-str (ι 0 ω))
                 (← val-str (ι 1 ω))
                 (← cmd-sym (s⊥x cmd-str))
                 (← val (s⊥n val-str))
                 (← cmd (∈-nrpn cmd-sym))
                 (← cmd-byte (ι 0 cmd))
                 (← min (ι 1 cmd))
                 (← max (ι 2 cmd))
                 (← offset (ι 2 cmd))
                 (← _ (ensure (>= val min) (◇ val " < " min) ∅)) 
                 (← _ (ensure (<= val max) (◇ val " > " max) ∅))
                 `(,cmd-byte ,(+ offset val))))

If any one of these steps fails, the parser will break with the reason, but I'm not throwing or catching anything.

Taking it further

(sequence) and (traverse) apply this composition as a (fold) across a list. This is defined in terms of the O(n) (map) operation, but it will truly short circuit on a Left thanks to (call/cc). This could be an entire different article though. Basically everything is a traversal at the end of the day.

All that's left is the actual program.

Footnotes

  1. Compare this to form validation, where you do want a list of everything that is incorrect, not just the first offender. The short circuiting behavior of monads makes this impossible. Figuring out an alternative (closely related) structure is an exercise to the reader.