Representing DSL expressions in Haskell When you want to embed a DSL in Haskell you will almost certainly be faced with having some kind of expressions in your DSL. If you are lucky you can get away with using the Haskell types as the DSL type (i.e., use the meta types as object types). E.g., if you DSL needs integer expressions you can use the Haskell type Integer. But sometimes you're not that lucky, so you need to represent the DSL expressions as a Haskell data type. What are the options for representing expressions in Haskell? A normal data type Let's start with the most obvious one, an ordinary data type with constructors for the different kinds of expressions. As an example I'll have a simple expression language that has constants (of type Int), addition, less-or-equal, and conditional.
data Exp = ConI Int | Add Exp Exp | LE Exp Exp | Cond Exp Exp Exp deriving (Show)Just to exemplify, here's an evaluator for these expressions. We need a type of values that the evaluator can return as well.
data Val = ValI Int | ValB Bool deriving (Show) eval :: Exp -> Val eval (ConI i) = ValI i eval (Add x y) = case (eval x, eval y) of (ValI vx, ValI vy) -> ValI (vx + vy) _ -> error "Bad arguments to Add" eval (LE x y) = case (eval x, eval y) of (ValI vx, ValI vy) -> ValB (vx <= vy) _ -> error "Bad arguments to LE" eval (Cond x y z) = case (eval x) of ValB b -> if b then eval y else eval z _ -> error "Bad arguments to Cond"And a test run:
main = print $ eval e where e = Cond (LE (ConI 3) (ConI 5)) (Add (ConI 1) (ConI 2)) (ConI 0)prints ValI 3 This is OK, but look at all the error cases in the evaluator. Remember that these are expressions in a DSL, so we are going to have expression fragments in our Haskell code. If we make a mistake, like Cond (ConI 1) (ConI 1) (ConI 1) this is not going to be caught by the Haskell type checker since everything is of type Exp. Instead it is going to cause some error when the program is run. Being advocates of static type checking (why else use Haskell?), this is rather disgusting. GADTs So let's try a new and shiny feature in GHC, namely GADTs. (GADTs are new in ghc, but the idea is old; it's been well known in constructive type theory for ages, Kent Petersson and I suggested it as an addition to Haskell almost 15 years ago.) With a GADT you can specify more precise types for the constructors.
data Exp a where ConI :: Int -> Exp Int Add :: Exp Int -> Exp Int -> Exp Int LE :: Exp Int -> Exp Int -> Exp Bool Cond :: Exp Bool -> Exp a -> Exp a -> Exp aThese are the types we want. It's now impossible to construct ill typed value of type Exp t; it will be caught by the Haskell type checker. The evaluator looks very neat and natural with GADTs
eval :: Exp a -> a eval (ConI i) = i eval (Add x y) = eval x + eval y eval (LE x y) = eval x <= eval y eval (Cond x y z) = if eval x then eval y else eval zGADTs is really the way to go, but it has the disadvantage of not being standard Haskell. It can also get somewhat cumbersome when we have variables; the evaluator now needs a typed environment. It's certainly doable, but starting to look less nice. Phantom types Let's explore a third way that in some sense combines the previous two. The idea is to have an untyped representation, like Exp, but to use an abstract data type on top of it that only allows constructing well typed expressions. We start with the original type, but rename it to Exp'
data Exp' = ConI Int | Add Exp' Exp' | LE Exp' Exp' | Cond Exp' Exp' Exp' deriving (Show)On top of this we provide the type the user will see.
newtype Exp a = E Exp' deriving (Show) conI :: Int -> Exp Int conI = E . ConI add :: Exp Int -> Exp Int -> Exp Int add (E x) (E y) = E $ Add x y le :: Exp Int -> Exp Int -> Exp Bool le (E x) (E y) = E $ LE x y cond :: Exp Bool -> Exp a -> Exp a -> Exp a cond (E x) (E y) (E z) = E $ Cond x y zThe functions conI, add, le, and cond are the only ones the DSL user will see. Note that they have the same types as the constructors in the GADT. To ensure the DSL user can only use these we need to put it all in a module and export the right things.
module Exp(Exp, conI, add, le, cond) where ...We can write an evaluator again (inside the Exp module).
eval :: Exp a -> Val eval (E x) = eval' x eval' :: Exp' -> Val eval' (ConI i) = ValI i eval' (Add x y) = case (eval' x, eval' y) of (ValI vx, ValI vy) -> ValI (vx + vy) _ -> error "Bad arguments to Add" eval' (LE x y) = case (eval' x, eval' y) of (ValI vx, ValI vy) -> ValB (vx <= vy) _ -> error "Bad arguments to LE" eval' (Cond x y z) = case (eval' x) of ValB b -> if b then eval' y else eval' z _ -> error "Bad arguments to Cond"Not as elegant as the GADT evaluator, but at least the error cases are impossible when expressions are constructed using the exported interface. And the test would now look like
main = print $ eval e where e = cond (le (conI 3) (conI 5)) (add (conI 1) (conI 2)) (conI 0)Phantom types gets its name from the fact that the type variable in the definition of the Exp type doesn't occur anywhere in the constructors; so it's a phantom. Now and then people suggest implementing a dynamically typed version of Haskell. Phantom types is one of the things that makes such an implementation difficult. There are no values that carry the type information at runtime, the types only exist at compile time.