Version française
Home     About     Download     Resources     Contact us    

This site is updated infrequently. For up-to-date information, please visit the new OCaml website at

Browse thread
High-performance bytecode interpreter in OCaml
[ Home ] [ Index: by date | by threads ]
[ Search: ]

[ Message by date: previous | next ] [ Message in thread: previous | next ] [ Thread: previous | next ]
Date: 2007-08-15 (13:13)
From: Jon Harrop <jon@f...>
Subject: Re: [Caml-list] High-performance bytecode interpreter in OCaml
On Wednesday 15 August 2007 12:49:31 Joel Reymont wrote:
> Folks,
> I would like to write a high-performance VM in OCaml. I understand
> that OCaml itself uses either a threaded interpreter or a switch-
> style one.

The performance of interpreters is heavily dependent upon what exactly you're 
evaluating (both language and program properties). If you start with a naive 
term-level interpreter or rewriter then you can get an order of magnitude in 
performance by optimizing the interpreter without leaving OCaml or moving to 
(real) bytecode compilation.

> What's the most efficient way to write a bytecode interpreter in
> OCaml itself, though?

I would start with a simple term-level interpreter, optimize that by avoiding 
lookups as much as possible. Then maybe switch to arrays of instructions. 
Finally, maybe something that generated OCaml bytecode and dynamically loaded 
it. However, you get diminishing returns and the latter is difficult and 
might well be no faster.

The main advantage of a real bytecode is locality: the instructions and data 
are stored together in a contiguous array. This is ideally suited to a simple 
C interpreter. A bytecode in OCaml would be a string and, IMHO, OCaml is not 
very efficient at string munging.

> Would CPS style be inlined if I were to write a threaded interpreter
> this way?

Quite a bit can be usefully inlined if you crank up the command line argument 
and write in a certain style. However, this is of limited use in interpreters 
of general purpose languages as their evaluation requires loops and so forth. 
Also, increasing global inlining often degrades the performance of symbolic 

Depending upon your requirements, my recommendations are probably:

1. Optimize your term-level interpreter first and then consider compiling to a 
lower-level instruction array.

2. Don't bother compiling to OCaml bytecode.

3. Consider compiling to native code or doing something more sophisticated 
with your interpreter, like gathering statistics on the most common sequences 
of instructions and making the interpreter optimize itself.

If you send me the source I'll have a look over it and tell you any obvious 
optimizations I can think of.

Dr Jon D Harrop, Flying Frog Consultancy Ltd.
OCaml for Scientists