To Learn More
The techniques for compiling to abstract machines were used
in the first generation of Smalltalk, then in the functional
languages Lisp and ML. The argument that the use of abstract machines
hinders performance cast a shadow on this technique for a long time. The Java language has since shown
that the opposite is true. An abstract machine provides several advantages. The first is to facilitate
the porting of a compiler to different architectures: the part of the compiler that has to be ported
is well defined (the abstract machine interpreter and part of the runtime library). Another benefit of this technique
is portable code: an application can be compiled on one architecture and executed on
another. Finally, this technique simplifies compiler construction by adding instructions specific to the kind of language
being compiled. In the case of functional languages, abstract machines make it easy to create closures (which pack
environment and code together) by adding the notion of execution environment to the abstract machine.
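To make the notion of an execution environment concrete, here is a minimal sketch of a stack-based abstract machine with a closure instruction. The instruction set and the names (instr, value, run, program) are hypothetical and chosen for illustration only; they are not the instructions of Zinc, of the Objective CAML bytecode, or of the Java virtual machine.

```ocaml
(* A hypothetical, minimal instruction set; not that of any real machine. *)
type instr =
  | Const of int            (* push an integer constant *)
  | Access of int           (* push the i-th value of the environment *)
  | Add                     (* pop two integers, push their sum *)
  | Closure of instr list   (* pack code with the current environment *)
  | Apply                   (* pop an argument and a closure, call it *)

type value =
  | Int of int
  | Clos of instr list * value list   (* code + environment *)

(* Interpreter: executes code with an environment and an operand stack.
   The result is the value on top of the stack when the code runs out. *)
let rec run code env stack =
  match code, stack with
  | [], v :: _ -> v
  | Const n :: rest, _ -> run rest env (Int n :: stack)
  | Access i :: rest, _ -> run rest env (List.nth env i :: stack)
  | Add :: rest, Int b :: Int a :: s -> run rest env (Int (a + b) :: s)
  | Closure body :: rest, _ -> run rest env (Clos (body, env) :: stack)
  | Apply :: rest, arg :: Clos (body, cenv) :: s ->
      (* the body runs in the closure's own environment, extended
         with the argument *)
      let result = run body (arg :: cenv) [] in
      run rest env (result :: s)
  | _ -> failwith "machine stuck"

(* (fun x -> fun y -> x + y) 3 4, written by hand in this instruction set *)
let program =
  [ Closure [ Closure [ Access 1; Access 0; Add ] ];
    Const 3; Apply;
    Const 4; Apply ]

let result = run program [] []
```

The only part that would have to be rewritten for a new architecture is the run function (in a real system, an interpreter written in a portable language such as C); the program value itself is architecture-independent.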
To compensate for the loss in efficiency caused by the use of the bytecode interpreter, one can
expand the abstract machine instructions into those of a real
machine at runtime. This type of expansion is found in implementations of Lisp (llm3) and Java (JIT).
Performance increases, but does not reach the level of a native C compiler.
One difficulty of functional language compilation comes from closures.
They contain both the executable code and the execution environment (see page ??).
The choice of representation for the environment, and of the way values are accessed in it, has a significant influence on
the performance of the code produced. An important property of the environment is to give access
to values in constant time; variables are then viewed as indexes into an array containing their values. This requires
preprocessing the functional expressions. An example can be found in L. Cardelli's book, Functional Abstract Machine.
Zinc uses this technique. Another crucial optimization is to avoid constructing useless closures. Although all
functions in ML can be viewed as functions of a single argument, intermediate closures should not be created
when a function is applied to several arguments. For example, when
the function add is applied to two integers, it is not useful to create the first closure, which corresponds to the
partial application of add to its first argument.
Note that creating a closure allocates memory for the environment, and that this
memory has to be recovered later (9). Automatic memory recovery is the second
major performance concern, along with the environment.
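The preprocessing described above can be sketched as follows: variable names are replaced by their position in the environment, so that at runtime a variable access is a single array lookup. This is a simplified illustration, not the scheme used by Cardelli's machine or by Zinc; the types and names (expr, term, preprocess, eval) are assumptions made for the example.

```ocaml
(* Source expressions, with named variables. *)
type expr =
  | Var of string
  | Lam of string * expr
  | App of expr * expr
  | Num of int
  | Plus of expr * expr

(* Preprocessed expressions: names are replaced by indexes. *)
type term =
  | Index of int
  | Abs of term
  | Ap of term * term
  | Lit of int
  | Sum of term * term

(* Position of a name in the compile-time list of bound names. *)
let rec index_of x = function
  | [] -> failwith ("unbound variable " ^ x)
  | y :: rest -> if x = y then 0 else 1 + index_of x rest

(* The name lookup happens once, here, and not at runtime. *)
let rec preprocess names = function
  | Var x -> Index (index_of x names)
  | Lam (x, body) -> Abs (preprocess (x :: names) body)
  | App (f, a) -> Ap (preprocess names f, preprocess names a)
  | Num n -> Lit n
  | Plus (a, b) -> Sum (preprocess names a, preprocess names b)

type value =
  | Int of int
  | Closure of term * value array   (* code + environment *)

let rec eval env = function
  | Index i -> env.(i)                  (* constant-time access *)
  | Abs body -> Closure (body, env)
  | Ap (f, a) ->
      (match eval env f with
       | Closure (body, cenv) ->
           (* extending the environment copies it: this is the price
              paid here for constant-time access *)
           eval (Array.append [| eval env a |] cenv) body
       | Int _ -> failwith "not a function")
  | Lit n -> Int n
  | Sum (a, b) ->
      (match eval env a, eval env b with
       | Int x, Int y -> Int (x + y)
       | _ -> failwith "not integers")

let add_named = Lam ("x", Lam ("y", Plus (Var "x", Var "y")))
let add_indexed = preprocess [] add_named
let result = eval [||] (Ap (Ap (add_indexed, Lit 3), Lit 4))
```

Note that this naive evaluator still builds the intermediate closure for add applied to 3 before applying it to 4, and allocates a new environment at each application. Avoiding that allocation is precisely the optimization discussed above, and it requires treating application to several arguments specially.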
Finally, bootstrapping allows us to write most of a compiler in the
same language that it is going to compile. For this reason, as with the chicken and the egg, it is necessary to define a
minimal part of the language which can be expanded later. In fact, this property is difficult to assess when classifying
languages and their implementations. It is also used as a measure of the capability of a language to be used
in the implementation of a compiler. A compiler is a large program, and bootstrapping is a good test of its correctness and
performance. The following are links to the references:
http://caml.inria.fr/camlstone.txt
At that time, Caml had been compiled on over fifty machines; these were earlier
versions of Objective CAML. This gives an idea of how much the present Objective CAML has
been improved since then.