Deep and shallow embedding are terms associated with Domain Specific Languages (DSL). A DSL is a language geared toward a specific domain. The dot language is an example of such a DSL for describing Graphs. Conceptually, a shallow embedding captures the semantics of the data of the domain in a data type and provides a fixed interpretation of the data, whereas a deep embedding goes beyond this and captures the semantics of the operations on the domain enabling variable interpretations.
We will illustrate this difference by embedding a simple expression language with summation, multiplication and constants in Haskell. Haskell is especially wellsuited for and often used as a host language for embedded DSLs.
We express our language with the following interface. A type synonym Exp
for
normal Int
s and three separate functions representing summation,
multiplication, and constants.
1 2 3 4 5 

We embedded the data of the domain in Haskell and provided functions for construction of the model and we can easily represent the calculation of an expression as $4 + 6 * 8$ with the following lines of Haskell:
1


The advantage of this embedding that calculating the value of our expression is very fast. Other than the value we cannot determine anything else regarding our expression. This becomes more problematic when we add variables to our language.
We change our type to contain binding information and add two functions to represent the assignment and usage of variables.
1 2 3 4 

And in our naivity we can write the expression $x + 6 * 8$ as follows:
1


Obviously, evaluating this creates havoc! What is the value of x
? We should,
of course, have introduced it first:
1


Now we have assigned a value to x
and we can safely use it in our
expression.
Had we used a deep embedding we could have prevented the cataclysmic error by first checking whether each variable is assigned before it is used. We create a deep embedding of our expression by using a Haskell data type.
1 2 3 4 5 6 

Note that we do not specify how the bindings should be stored, only that such a thing exists. We now define a function that checks whether we use a variable before it is defined.^{1}
1 2 3 4 5 6 7 8 9 

With the function above we can check whether an expression is wellformed. With our deep embedding we can even define transformations of our expression; e.g. differentiate with respect to a variable.
1 2 3 4 5 6 7 

Deep embedding allows us to utilize the semantics of our model by defining
multiple interpretations of our DSL. The downside is that just calculating the
value of our expression has become slower due to the added overhead of the
constructors, whereas the shallow embedding can be evaluated by only using
Int
s.
In short:
 Shallow embedding should be used when you only need a single interpretation or when you are in a hurry.
 Deep embedding should be used in all other cases.
More reading material on this subject:
 This presentation by Josef Svenningsson.
 Combining Deep and Shallow Embedding for EDSL (Josef Svenningsson and Emil Axelsson, 2012)
 Certifying Machine Code Safety: Shallow versus Deep Embedding (Martin Wildmoser and Tobias Nipkow, 2004)
 Deep versus Shallow embeddings in Coq