Advent of Code 2022: Day 3

Published on

Let’s use some type-level “magic” to tidy up a Haskell function!

The first version of my solution contained this function (with some definitions elided):

parseElfGroups :: String -> Maybe [ElfGroup]
parseElfGroups = go . lines
  where
    go :: [String] -> Maybe [ElfGroup]
    go (a : b : c : rest) = do
      a' <- traverse mkItem a
      b' <- traverse mkItem b
      c' <- traverse mkItem c
      rest' <- go rest
      pure $ ElfGroup (a', b', c') : rest'
    go _ = pure []

data Item
mkItem :: Char -> Maybe Item
newtype ElfGroup = ElfGroup ([Item], [Item], [Item])

go uses explicit recursion to iterate over a list, extracting groups of 3 elements and using them to create an ElfGroup. Notice the connection between the size of its input list and the size of the constructed tuple: go needs at least 3 elements. Let’s try to encode this fact in the types, to avoid silly mistakes when calling it!

First, here’s how I’d write it, in the land of magic:

parseElfGroups :: String -> Maybe [ElfGroup]
parseElfGroups =
  traverse parseElfGroup
    . splitInto 3
    . lines
  where
    parseElfGroup :: Vector 3 String -> Maybe ElfGroup
    parseElfGroup (Vector [a, b, c]) = do
      a' <- traverse mkItem a
      b' <- traverse mkItem b
      c' <- traverse mkItem c
      pure $ ElfGroup (a', b', c')

Reasons why this is obviously nonsense:

  1. splitInto seems to accept the number 3 at run time, and then magically use that number in its result type at compile time.
  2. parseElfGroup thinks the compiler is OK with us omitting a fallback case just because it has the number 3 in its type1.

But hey, some of the best things start out as nonsense, so let’s continue!

An approximation: splitInto3

Accepting some strategic compromises will get us part-way there. We know it’s impossible to use a run-time value at compile-time (because we cannot time-travel into the past), so let’s make the number 3 available at compile-time using a tuple:

splitInto3 :: [a] -> [(a, a, a)]
splitInto3 = \case
  (x : y : z : rest) -> (x, y, z) : splitInto3 rest
  _ -> []

parseElfGroups :: String -> Maybe [ElfGroup]
parseElfGroups =
  traverse parseElfGroup
    . splitInto3
    . lines
  where
    parseElfGroup ::
      (String, String, String) ->
      Maybe ElfGroup
    parseElfGroup (a, b, c) = do
      a' <- traverse mkItem a
      b' <- traverse mkItem b
      c' <- traverse mkItem c
      pure $ ElfGroup (a', b', c')

This is an improvement on the original, but consider what will happen when the elves form groups of 4: we would need to write splitInto4 and use it in an updated version of parseElfGroups. That sounds like a lot of work for what is meant to be a lazy language!

Developing a polymorphic splitInto

Let’s make a function that splits a list into a generic length known at compile time. What sort of thing might this function return? A quick internet search for “Haskell fixed size vector” revealed the package fixed-vector that gives us precisely what we want: VecList is a fixed-length cons-list.2

The phrase “generic length known at compile time” implies that our function needs some sort of type-level “length” parameter. So let’s try writing splitInto using explicit forall and a kind signature to make things clearer (the Arity constraint is required for us to use fromListM):

splitInto ::
  forall (n :: Nat) a.
  Arity n =>
  [a] ->
  [VecList n a]
splitInto l =
  let splitSize = fromEnum $ natVal (Proxy :: Proxy n)
      (front, back) = splitAt splitSize l
   in case fromListM front of
        Nothing -> []
        Just frontVec -> frontVec : splitInto back

Notice that:

And now, finally, we can write parseElfGroups with maximum satisfaction:

parseElfGroups :: String -> Maybe [ElfGroup]
parseElfGroups =
  traverse parseElfGroup
    . splitInto
    . lines
  where
    parseElfGroup :: VecList 3 String -> Maybe ElfGroup
    parseElfGroup vl = case convert vl of
      (a, b, c) -> do
        a' <- traverse mkItem a
        b' <- traverse mkItem b
        c' <- traverse mkItem c
        pure $ ElfGroup (a', b', c')

It is almost unbelievable, but this is even less code than the “magic land” version I originally proposed.


  1. The compiler actually is OK with this, because Haskell is a depraved language. It lets us off with a warning.↩︎

  2. There are in fact many libraries that provide such a type, but fixed-vector has a more ergonomic interface for our problem.↩︎

  3. Why does natVal require a Proxy argument? This is so the compiler can always infer the value of p from the call site. If we enable AllowAmbiguousTypes, then we could write a zero-argument version of natVal that requires a type application or type signature at the call site:

    natValZero ::
      forall (n :: Nat).
      KnownNat n =>
      Integer
    natValZero = natVal @n Proxy
    ↩︎