# The Difference Between Shallow and Deep Embedding

Deep and shallow embedding are terms associated with Domain Specific Languages (DSLs). A DSL is a language geared toward a specific domain. The dot language, for example, is a DSL for describing graphs. Conceptually, a shallow embedding captures the semantics of the data of the domain in a data type and provides a fixed interpretation of the data, whereas a deep embedding goes beyond this and captures the semantics of the operations on the domain, enabling variable interpretations.

We will illustrate this difference by embedding a simple expression language with summation, multiplication and constants in Haskell. Haskell is especially well-suited for and often used as a host language for embedded DSLs.

We express our language with an interface consisting of a type synonym Exp for plain Ints and three separate functions representing summation, multiplication, and constants.
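A minimal sketch of such an interface could look as follows. The function names are assumptions made for illustration; in particular `constant` is used instead of `const` so that it does not clash with the Prelude.

```haskell
-- Shallow embedding: an expression *is* its value.
type Exp = Int

-- Summation of two expressions.
plus :: Exp -> Exp -> Exp
plus = (+)

-- Multiplication of two expressions.
times :: Exp -> Exp -> Exp
times = (*)

-- A constant. The name 'constant' is an assumption of this sketch,
-- chosen so that Prelude's 'const' is not shadowed.
constant :: Int -> Exp
constant = id
```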

We have embedded the data of the domain in Haskell and provided functions for constructing the model, so we can easily represent the calculation of an expression such as $4 + 6 * 8$ in a few lines of Haskell.
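As an illustration, the expression could be written like this. The interface is restated in compact form so the snippet stands on its own; the names `plus`, `times`, `constant` and `example` are assumptions of this sketch.

```haskell
type Exp = Int

plus, times :: Exp -> Exp -> Exp
plus  = (+)
times = (*)

constant :: Int -> Exp
constant = id

-- 4 + 6 * 8, built only from the functions of the DSL.
example :: Exp
example = constant 4 `plus` (constant 6 `times` constant 8)
```

Because `Exp` is just an `Int`, `example` already is the result of the calculation, namely 52.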

The advantage of this embedding is that calculating the value of our expression is very fast. The drawback is that, other than its value, we cannot determine anything about our expression. This becomes more problematic when we add variables to our language.

We change our type to contain binding information and add two functions to represent the assignment and usage of variables.
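One possible way to do this, sketched under the assumption that bindings are kept in a simple association list, is to let an expression be a function from an environment to a value. The names `Env`, `assign` and `var` are hypothetical.

```haskell
-- Binding information: variable names paired with their values.
type Env = [(String, Int)]

-- An expression now needs an environment before it can produce a value.
type Exp = Env -> Int

constant :: Int -> Exp
constant n _ = n

plus, times :: Exp -> Exp -> Exp
plus  a b env = a env + b env
times a b env = a env * b env

-- Bind a variable to a value for the duration of 'body'.
assign :: String -> Int -> Exp -> Exp
assign name value body env = body ((name, value) : env)

-- Look a variable up; an unbound variable is a runtime error.
var :: String -> Exp
var name env = case lookup name env of
  Just value -> value
  Nothing    -> error ("unbound variable: " ++ name)
```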

In our naivety we can now write the expression $x + 6 * 8$ directly, using x without ever introducing it.

Obviously, evaluating this creates havoc! What is the value of x? We should, of course, have introduced it first:
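Both versions can be sketched as follows, assuming an environment-passing shallow embedding with the hypothetical functions `assign` and `var`. The value 4 bound to x is an arbitrary choice for this example.

```haskell
type Env = [(String, Int)]
type Exp = Env -> Int

constant :: Int -> Exp
constant n _ = n

plus, times :: Exp -> Exp -> Exp
plus  a b env = a env + b env
times a b env = a env * b env

assign :: String -> Int -> Exp -> Exp
assign name value body env = body ((name, value) : env)

var :: String -> Exp
var name env = case lookup name env of
  Just value -> value
  Nothing    -> error ("unbound variable: " ++ name)

-- x + 6 * 8 with x never introduced: evaluating 'naive []'
-- only fails at runtime, inside 'var'.
naive :: Exp
naive = var "x" `plus` (constant 6 `times` constant 8)

-- The same expression with x introduced first.
fixed :: Exp
fixed = assign "x" 4 naive
```

Nothing in the types distinguishes `naive` from `fixed`; the mistake only shows up when `naive` is evaluated in an environment that lacks x.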

Now we have assigned a value to x and we can safely use it in our expression.

Had we used a deep embedding we could have prevented the cataclysmic error by first checking whether each variable is assigned before it is used. We create a deep embedding of our expression by using a Haskell data type.
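A sketch of such a data type is shown below, with one constructor per operation of the language. The constructor names are assumptions; they mirror the functions of the shallow embedding.

```haskell
data Exp
  = Const Int              -- a constant
  | Plus Exp Exp           -- summation
  | Times Exp Exp          -- multiplication
  | Assign String Int Exp  -- bind a variable to a value within a sub-expression
  | Var String             -- use of a variable
  deriving (Show, Eq)

-- x + 6 * 8 with x bound to 4 (an arbitrary value), now as plain data.
example :: Exp
example = Assign "x" 4 (Plus (Var "x") (Times (Const 6) (Const 8)))
```

Building `example` computes nothing yet: the value is only a description of the expression, which different functions can later inspect.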

Note that we do not specify how the bindings should be stored, only that such a thing exists. We now define a function that checks whether we use a variable before it is defined (see footnote 1).
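Such a check could be sketched like this, using direct recursion over an assumed `Exp` data type. The function name `wellFormed` is hypothetical.

```haskell
data Exp
  = Const Int
  | Plus Exp Exp
  | Times Exp Exp
  | Assign String Int Exp
  | Var String

-- True when every variable is bound by an enclosing Assign before it is used.
wellFormed :: Exp -> Bool
wellFormed = go []
  where
    -- The first argument collects the names that are in scope.
    go :: [String] -> Exp -> Bool
    go _     (Const _)         = True
    go bound (Plus a b)        = go bound a && go bound b
    go bound (Times a b)       = go bound a && go bound b
    go bound (Assign x _ body) = go (x : bound) body
    go bound (Var x)           = x `elem` bound
```

The check never evaluates the expression; it only walks over its structure, which is exactly what the shallow embedding could not offer.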

With such a function we can check whether an expression is well-formed. With our deep embedding we can even define transformations of our expression, for example differentiating it with respect to a variable.
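As a sketch of such a transformation, the following function applies the sum and product rules symbolically. The `Exp` type and the name `diff` are assumptions, and the result is deliberately left unsimplified.

```haskell
data Exp
  = Const Int
  | Plus Exp Exp
  | Times Exp Exp
  | Assign String Int Exp
  | Var String
  deriving (Show, Eq)

-- Symbolic derivative of an expression with respect to the variable x.
diff :: String -> Exp -> Exp
diff _ (Const _) = Const 0
diff x (Var y)
  | x == y    = Const 1
  | otherwise = Const 0
diff x (Plus a b)  = Plus (diff x a) (diff x b)                      -- sum rule
diff x (Times a b) = Plus (Times (diff x a) b) (Times a (diff x b))  -- product rule
diff x (Assign y v body)
  | x == y    = Const 0                    -- inside body, x is the constant v
  | otherwise = Assign y v (diff x body)
```

The output is again an `Exp`, so it can be checked, evaluated or differentiated once more like any other expression.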

Deep embedding allows us to utilize the semantics of our model by defining multiple interpretations of our DSL. The downside is that just calculating the value of our expression has become slower due to the added overhead of the constructors, whereas the shallow embedding can be evaluated by only using Ints.
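Evaluation then becomes just one more interpretation. The sketch below assumes the same kind of `Exp` data type and returns a `Maybe`, so that an unbound variable yields `Nothing` instead of a crash; that is a design choice of this example, not something the embedding prescribes.

```haskell
data Exp
  = Const Int
  | Plus Exp Exp
  | Times Exp Exp
  | Assign String Int Exp
  | Var String

-- Evaluate an expression; Nothing signals the use of an unbound variable.
eval :: Exp -> Maybe Int
eval = go []
  where
    -- The first argument holds the bindings that are currently in scope.
    go _   (Const n)         = Just n
    go env (Plus a b)        = (+) <$> go env a <*> go env b
    go env (Times a b)       = (*) <$> go env a <*> go env b
    go env (Assign x v body) = go ((x, v) : env) body
    go env (Var x)           = lookup x env
```

Every step pattern-matches on a constructor before doing any arithmetic, which is the overhead that the shallow embedding avoids.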

In short:

• Shallow embedding should be used when you only need a single interpretation or when you are in a hurry.
• Deep embedding should be used in all other cases.

More reading material on this subject:

1. Most often you should use folds (2) instead of this direct recursion.
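To illustrate the footnote, here is a sketch of a fold for an assumed `Exp` data type, together with a well-formedness check expressed through it. The names `foldExp` and `wellFormed` are hypothetical.

```haskell
data Exp
  = Const Int
  | Plus Exp Exp
  | Times Exp Exp
  | Assign String Int Exp
  | Var String

-- The fold takes one function per constructor and handles the recursion itself.
foldExp :: (Int -> r)                 -- Const
        -> (r -> r -> r)              -- Plus
        -> (r -> r -> r)              -- Times
        -> (String -> Int -> r -> r)  -- Assign
        -> (String -> r)              -- Var
        -> Exp -> r
foldExp con pl ti asg va = go
  where
    go (Const n)      = con n
    go (Plus a b)     = pl (go a) (go b)
    go (Times a b)    = ti (go a) (go b)
    go (Assign x v e) = asg x v (go e)
    go (Var x)        = va x

-- Well-formedness as a fold: the result is a function awaiting the names in scope.
wellFormed :: Exp -> Bool
wellFormed e = foldExp
  (\_ _ -> True)
  (\a b bound -> a bound && b bound)
  (\a b bound -> a bound && b bound)
  (\x _ body bound -> body (x : bound))
  (\x bound -> x `elem` bound)
  e []
```

The recursion pattern is written once in `foldExp`, so each new interpretation only has to say what happens at each constructor.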

# Combining Graphviz (Dot) and TikZ With Dot2tex

We all want to create good-looking documents, and good-looking documents need good-looking images. Because we want consistency and because we are lazy, we want to do this as automatically as possible. That is why we use LaTeX: it creates beautifully typeset documents without much manual effort.

Similarly, we use Graphviz to generate our graphs for us. Its automatic layout is the best in the field and the (declarative) dot language is easy to understand and compact to write. We can either include the PDFs that dot generates in our document with \includegraphics or, remembering that we are lazy, use the LaTeX graphviz package. Either way, we can easily get the image from our first example into our PDF.

There is a downside to using Graphviz/dot as well, in the form of two problems. Firstly, the image just looks a bit out of place next to the nicely smoothed text in a PDF. Secondly, we lack the ability to use TeX code in our graph. This means we are limited to the formatting offered by dot, and the graphs can therefore appear out of style with the other figures in our document.

No worries: with TikZ it is possible to create very fancy graphs, and images in general, but you have to do all the positioning manually! Imagine inserting a node and having to reorder everything!

Enter dot2tex: it brings all the love of Graphviz/dot to TeX/TikZ. Using dot2tex has many advantages:

1. Write your graphs in familiar dot syntax;
2. Let dot – or whichever layout engine you prefer – determine the placement of your nodes and arrows;
3. Style your nodes however you want by using TikZ styles;
4. Optionally, fine-tune the graph by adding extra TikZ drawings.

Rather than manually calling dot2tex for every dot file you have, use the dot2texi package. It is the LaTeX interface to dot2tex and, when used as follows, generates the image displayed in Figure 2.

For more TikZ goodness check out the example site.

Happy writing!

# Why You Should Switch to Declarative Programming

We are reaching the limits of what is feasible with imperative languages and we should move to declarative languages.

When applications written in imperative languages grow, the code becomes convoluted. Why? Imperatively programmed applications contain statements such as if X do Y else do Z. As Y and Z contain invisible side effects, the correctness of the program relies on some implicit invariant. This invariant has to be maintained by the programmer, or else the code will break. Thus, each time a new feature is added to an application or a bug is fixed, the code gets more complex, as keeping the invariant intact becomes harder. After a while the code turns into spaghetti code and bugs are introduced as the programmer fails to maintain the invariant. This is going to happen despite the best intentions of the programmer to keep things clean. Why is this?

# Software and Building Architecture Compared

People tend to only understand what they can see. For most people it is difficult to grasp more abstract matters without somehow visualizing them. Software is an example of such an abstract matter. Let us visit the process of developing software through a comparison with developing a building.

# JCU App Installation Script

With this script it should be possible to install and build the JCU app in a local directory. It does not build the UHC for you. If that is wanted, the option could of course be built in!

# Getting Rid of Programming JavaScript With Haskell

For my Experimentation Project at Utrecht University I ported the “JCU” application to Haskell. The JCU application is used to give Dutch high school students the opportunity to get a taste of Prolog.

The project uses the Utrecht Haskell Compiler and its JavaScript backend. The UHC translates Haskell to Core and then translates this Core language to JavaScript. For more information on this see the blog of the creator of the UHC JavaScript backend.

Please read my report on this project. The project is hosted on GitHub in the following repositories:

Update 28-01-2012: The keyword jscript in the UHC has been changed to js in order to avoid association with Microsoft’s JScript. Also, new Object syntax is now available in the foreign import directives.

# A Transition to Static Site Generation

Today I’ve launched my new blog. It is based on Octopress and works by statically generating the pages and then syncing them with the server.

If you are, for example, on OS X Lion with Xcode 4.2 installed and you run into weird errors such as a missing gcc-4.2, and Homebrew throws errors like this:

Error: The linking step did not complete successfully The formula built, but is not symlinked into /usr/local

Please install the gcc package from this nice fellow: osx-gcc-installer

And if you are getting nagged by rb-fsevent, change

to

Update: The comments have been exported to Disqus with the WordPress plugin. I’m currently looking at how to highlight code within Disqus comments.

# Caching Hackage

On several occasions I noticed that, when performing a cabal update, the index was being downloaded at roughly 300 KB/s. I finally got around to doing something about this: I’ve set up a caching server located in Utrecht, The Netherlands. It is a caching proxy for the Hackage repository. If you want to use it, add the following to your ~/.cabal/config file (or its equivalent on Windows).

Be sure to comment out the already existing remote-repo. Otherwise, cabal will download both indexes and merge them, and we don’t want this.

## The funny bit

Apparently this only helps if your machine is fast enough to process the index (untarring and all extra administration cabal performs).

Plainly getting the file from the cache:

And running cabal update with my cache as source:

And then finally, with the original repository:

So here we see that the user time is roughly the same, but you spend almost three times as long waiting for your coffee to get cold. Any further speed improvements for cabal update will probably require optimisation of the code.

## The caching server

I’m using Varnish to cache the requests to Hackage. Here is my config file; please let me know if you see any possible improvements.