Writing a WHILE interpreter in C++ templates
As the theoretical computer science lecture reached the topic of primitive recursive and μ-recursive functions, eventually LOOP and WHILE were introduced.
For those who don’t know, LOOP is a simple programming language which is able to compute all primitive recursive functions (and all LOOP programs can be written as primitive recursive functions) and WHILE is a bit more sophisticated, representing the μ-recursive functions and thus being turing complete.
For solving the tasks, it is handy to have a LOOP and WHILE interpreter around, so the first thing to do was to write such an interpreter in python. However, this is trivial.
At some point I found out that the C++ template language is turing complete. So I took the challenge of writing my own WHILE interpreter in C++ templates, being evaluated at compile time.
The language
Astonishingly, there is no wikipedia article for WHILE which I could link, so I have to describe the language for you, so we both are talking about the same thing.
Here’s an example of a WHILE program:
x0 := x2 LOOP x1 DO x0 := x0 + 1 END x3 := 1 WHILE x3 ≠ 0 DO x0 := x0 - 1 END
This is a pretty stupid program, but it illustrates the point. First, we have an arbitrary amount of variables, numbered x0 through xn, with n being a natural number. Assignments between variables are possible, even offset assignments, but we cannot add two variables together in one statement. Instead, we have to use a LOOP for that. LOOP xn DO loops as many times as the value in the xn variable at the time the loop starts. So changing the value of xn inside the loop does not have any effect on the count of iterations.
This is different for the WHILE xn ≠ 0 DO construct, which checks on each iteration whether the value in xn is nonzero before executing the body. So the result of this program (stored in x0) is x1+x2-1.
Variables can only store natural numbers, which means that the - operation is not an actual minus, but, in terms of C++, a-b is std::max(0, a-b). So if the result of the difference would be less than zero, zero is used instead.
Keeping state
The most difficult part was to keep state. This might be very subjective, because most of the time I do metaprogramming in C++, I use the templates to create code, not storage structures.
In my implementation, I keep state in a very inefficient manner—although I’m not entirely sure whether there is a more efficient way at all.
For storing the state, I use a recursive structure keeping track of the values assigned to the different variables:
typedef unsigned long int value_t;
typedef unsigned long int index_t;
typedef void NullState;
template <typename state, index_t _idx, value_t _v>
struct SetValue
{
typedef state upper_state;
static constexpr index_t idx = _idx;
static constexpr value_t v = _v;
};
template <typename state, index_t idx>
struct GetValue
{
static constexpr value_t v = (state::idx == idx ?
state::v :
GetValue<typename state::upper_state, idx>::v);
};
template <index_t idx>
struct GetValue<void, idx>
{
static constexpr value_t v = 0;
};
As you can see, for setting a value, another level of recursion is inserted at the start of the template chain, and for getting, we walk along the chain until we find the first node whose index matches the index of the variable we want to fetch.
This is probably horribly inefficient, just as promised. But this is a proof-of-concept, so I don’t care much.
To pass initial state to the program, which is per definition kept in the first k (with k being the number of parameters) variables starting with x1, I have a separate template constructing the state type:
template <index_t curri, value_t... vs>
struct RecursiveInitialState;
template <index_t curri, value_t _v, value_t... vs>
struct RecursiveInitialState<curri, _v, vs...>
{
typedef RecursiveInitialState<curri+1, vs...> upper_state;
static constexpr index_t idx = curri;
static constexpr value_t v = _v;
};
template <index_t curri, value_t _v>
struct RecursiveInitialState<curri, _v>
{
typedef void upper_state;
static constexpr index_t idx = curri;
static constexpr value_t v = _v;
};
template <value_t... initvs>
using InitialState = RecursiveInitialState<1, initvs...>;
This is not more efficient, but handier than using a whole lot of nested SetValue invocations.
Implementing the instructions
I decided to implement the instructions using nested templated structs. The outer struct defines the parameters of the instruction (e.g. the operands of the assignment) and the inner struct typedefs the results of the operation. This has the advantage that we can declare programs (types) without running them immediately.
The most trivial instruction is the assignment of a constant to a value, so we use it to illustrate the point:
template <index_t destindex, value_t cvalue>
struct ASSIGN_CONST
{
template <typename curr_state>
struct run
{
typedef SetValue<
curr_state,
destindex,
cvalue> state;
};
};
The outer, ASSIGN_CONST struct represents the instruction and the inner is used to run the actual operation, with the state type typedef’d to the state after the operation has executed, and the curr_state template argument being the state before the operation is executed.
The other assignment templates are similarily trivial, just using different operations:
template <index_t destindex, index_t srcindex, value_t offset>
struct ASSIGN_PLUS
{
template <typename curr_state>
struct run
{
typedef SetValue<
curr_state,
destindex,
GetValue<curr_state, srcindex>::v + offset> state;
};
};
template <index_t destindex, index_t srcindex, value_t noffset>
struct ASSIGN_MIN
{
template <typename curr_state>
struct run
{
private:
static constexpr value_t currvalue = GetValue<curr_state, srcindex>::v;
static constexpr value_t newvalue = (noffset > currvalue
? 0
: currvalue - noffset);
public:
typedef SetValue<
curr_state,
destindex,
newvalue> state;
};
};
Another operation which is often overlooked when looking at programming languages is the chaining of instructions, because it is so natural that we don’t see it as something separate.
template <typename... stmts>
struct CHAIN;
template <typename currstmt, typename... stmts>
struct CHAIN<currstmt, stmts...>
{
template <typename curr_state>
struct run
{
typedef typename CHAIN<stmts...>::template run<
typename currstmt::template run<curr_state>::state>::state state;
};
};
template <>
struct CHAIN<>
{
template <typename curr_state>
struct run
{
typedef curr_state state;
};
};
So a simple WHILE program resulting in x1+1 would be:
typedef CHAIN<
ASSIGN_PLUS<0, 1, 2>
> prgm1;
The more interesting part is implementing the looping constructs. It is obvious that we can implement LOOP in terms of WHILE, so I’ll only show the implementation of WHILE for the sake of sanity.
Implementing WHILE takes some tricky recursion, and it’s important to stop the compiler from running in an infinite recursion here. The only way I found to do that was to do partial specialization triggering on the current value of the variable:
template <typename curr_state, index_t condvar, typename body, value_t curr_condvar>
struct StepWhile
{
typedef typename StepWhile<
typename body::template run<curr_state>::state,
condvar,
body,
GetValue<curr_state, condvar>::v>::state state;
};
template <typename curr_state, index_t condvar, typename body>
struct StepWhile<curr_state, condvar, body, 0>
{
typedef curr_state state;
};
template <index_t condvar, typename body>
struct WHILE
{
template <typename curr_state>
struct run
{
typedef typename StepWhile<
curr_state,
condvar,
body,
GetValue<curr_state, condvar>::v>::state state;
};
};
What we do here, before evaluating the WHILE step, we already evaluate the value of the variable and pass the value to the template invocation. The template is then partially specialized to do nothing if the value is zero, so we have a breaking condition for the recursion.
Given that, we can run any WHILE program which does not hit the compilers template recursion limit. As stated, the evaluation takes place at compile time, which is shown by the following example:
typedef CHAIN<
ASSIGN_PLUS<0, 1, 0>,
ASSIGN_MIN<2, 2, 1>,
WHILE<2, CHAIN<
ASSIGN_MIN<2, 2, 1>,
ASSIGN_PLUS<0, 0, 1>
>
>
> prgm2;
static_assert(GetValue<prgm2::run<NullState>::state, 0>::v == 0, "");
static_assert(GetValue<prgm2::run<InitialState<10, 20>>::state, 0>::v == 30, "");
So there we have a nice WHILE interpreter written in C++ templates. And a constructive proof for the turing completeness of C++ templates