Information on the compiler internals
Date: 2007-10-12 (14:49)
From: Massimiliano Brocchini <ebrocchini@v...>
Subject: Information on the compiler internals

I've been looking for information on OCaml compiler internals this week.
Gordon's reply is a nice starting point, but is there any
documentation on the compiler internals?
I don't need a very detailed description (though it would be
really interesting to study!); access to the compiler front end,
i.e. lexing, parsing, name resolution (environment) and typing,
would be sufficient.

A simple hint on how to use the above-mentioned phases as black boxes
would be nice.

Thanks in advance,
Massimiliano Brocchini

P.S. I'm collaborating on the development of a snippet manager to
facilitate library distribution and code reuse, and to reduce binary
size by linking only the minimum amount of code your application
depends on rather than the whole file (e.g. if a toy program uses
only map, your application needs to be linked with map's code
and its dependencies, not the whole List module).

We already have a working prototype that uses ocamldoc, but we need to
access the parser to build a proper tool.

On 10/12/07, Gordon Henriksen wrote:
> On Oct 12, 2007, at 09:17, Christoph Sieghart wrote:
> > Is there any documentation for adding a new architecture to ocamlopt? I
> > would like to do a cross-compiler from one of the existing architectures to
> > an embedded microcontroller.
> > I have searched the mailing list archives and the documentation, but have not
> > found anything. Any pointers are welcome. Is my assumption correct that the
> > major code generation work is done by the code in $caml/asmcomp?
> Christoph,
> Yes, asmcomp contains both the middle-end and the back-end code generators.
> Note that the architecture-specific features are injected by configure
> creating various symlinks of the form asmcomp/<foo>.ml ->
> asmcomp/<arch>/<foo>.ml. On one hand, this means you should be able to clone
> the contents of one of the asmcomp/<arch> subdirectories and get your
> project off to a start pretty quickly. On the other, ocamlopt is not a
> cross-compiler, so you may have a bit of a challenge just getting the paths
> to the cross tools into the right places without breaking ocamlc.
> I'm sure you'll get more detailed pointers, but here's a quick overview...
> ocamlc and ocamlopt share code through the "Lambda" representation
> (bytecomp/lambda.mli). After this point, ocamlopt transfers control into
> asmcomp/, which has a fairly straightforward pass pipeline in
> Asmgen.compile_implementation.
> The Lambda representation is first translated into Closed Lambda
> (asmcomp/clambda.mli), which is similar except that closures are explicit.
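
To illustrate what "closures are explicit" means, here is a minimal sketch of closure conversion on a toy language. The types lam and clam and the functions free_vars and close are hypothetical names made up for this example; they are not the actual definitions in bytecomp/lambda.mli or asmcomp/clambda.mli, which are considerably richer.

```ocaml
(* Toy source language: functions capture variables implicitly. *)
type lam =
  | Var of string
  | Int of int
  | Add of lam * lam
  | Fun of string * lam              (* fun x -> body *)
  | App of lam * lam

(* Toy closed language: each function lists the variables it captures. *)
type clam =
  | CVar of string
  | CInt of int
  | CAdd of clam * clam
  | CClosure of string list * string * clam  (* captured, parameter, body *)
  | CApp of clam * clam

module S = Set.Make (String)

(* Variables used in a term but not bound inside it. *)
let rec free_vars = function
  | Var x -> S.singleton x
  | Int _ -> S.empty
  | Add (a, b) | App (a, b) -> S.union (free_vars a) (free_vars b)
  | Fun (x, body) -> S.remove x (free_vars body)

(* Rewrite every function so that its captured variables are spelled out. *)
let rec close = function
  | Var x -> CVar x
  | Int n -> CInt n
  | Add (a, b) -> CAdd (close a, close b)
  | App (f, a) -> CApp (close f, close a)
  | Fun (x, body) as f -> CClosure (S.elements (free_vars f), x, close body)

(* fun x -> fun y -> x + y : the inner function captures x. *)
let example = close (Fun ("x", Fun ("y", Add (Var "x", Var "y"))))
```

In this sketch the outer function captures nothing, while the inner one records that it needs x from its environment. That is the kind of information a later pass needs in order to allocate closure blocks.
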
> Next, ocamlopt transforms Clambda into its middle-end representation, C--.
> This form is somewhat well documented at and in
> various academic papers. The C-- representation is architecture-neutral in
> form, but not content. Target dependencies are injected through the Arch
> module, which specifies address sizes, endianness, etc. This is the point
> where displacement calculations are performed, etc.
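
As a rough illustration of how target parameters such as address size and endianness can feed into displacement calculations, here is a small sketch. It uses a functor over a hypothetical ARCH signature as a stand-in; as described above, ocamlopt itself selects the Arch module through files linked in by configure, not through a functor. The names ARCH, Toy32, Toy64, Layout, field_offset and word_bytes are all invented for this example.

```ocaml
(* Hypothetical target description: only the two parameters named above. *)
module type ARCH = sig
  val size_addr : int      (* size of an address, in bytes *)
  val big_endian : bool
end

(* Two made-up targets. *)
module Toy32 : ARCH = struct
  let size_addr = 4
  let big_endian = true
end

module Toy64 : ARCH = struct
  let size_addr = 8
  let big_endian = false
end

(* Code that is architecture-neutral in form but not in content:
   the same functions give different results for each target. *)
module Layout (A : ARCH) = struct
  (* Byte displacement of field [n] in a block of address-sized fields. *)
  let field_offset n = n * A.size_addr

  (* Bytes of an address-sized word, listed in memory order. *)
  let word_bytes (v : int) =
    let little = List.init A.size_addr (fun i -> (v lsr (8 * i)) land 0xff) in
    if A.big_endian then List.rev little else little
end

module L32 = Layout (Toy32)
module L64 = Layout (Toy64)
```

With these definitions, field 3 sits at byte 12 on the toy 32-bit target and at byte 24 on the toy 64-bit one, which is the sort of target-specific constant that ends up baked into the C-- code.
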
> The C-- representation is the input to the architecture-specific back-end
> code generators, which are driven by the architecture-neutral
> Asmgen.compile_phrase and Asmgen.compile_fundecl. In particular, this
> pipeline is pleasantly self-documenting:
> let (++) x f = f x
> let compile_fundecl (ppf : formatter) fd_cmm =
>   Reg.reset();
>   fd_cmm (* <-- The C-- representation for the function *)
>   ++ Selection.fundecl
>   ++ pass_dump_if ppf dump_selection "After instruction selection"
>   ++ Comballoc.fundecl
>   ++ pass_dump_if ppf dump_combine "After allocation combining"
>   ++ liveness ppf
>   ++ pass_dump_if ppf dump_live "Liveness analysis"
>   ++ Spill.fundecl
>   ++ liveness ppf
>   ++ pass_dump_if ppf dump_spill "After spilling"
>   ++ Split.fundecl
>   ++ pass_dump_if ppf dump_split "After live range splitting"
>   ++ liveness ppf
>   ++ regalloc ppf 1
>   ++ Linearize.fundecl
>   ++ pass_dump_linear_if ppf dump_linear "Linearized code"
>   ++ Scheduling.fundecl
>   ++ pass_dump_linear_if ppf dump_scheduling "After instruction scheduling"
>   ++ Emit.fundecl
> You can identify the target-dependent phases by correlating the passes with
> the contents of a target subdirectory.  Have fun!
> — Gordon
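
For readers unfamiliar with the (++) idiom in compile_fundecl, here is a minimal self-contained sketch of the same pattern on a toy pipeline. The pass_dump_if below is a simplified, hypothetical stand-in that records a message in a list instead of printing to a formatter, and the passes are ordinary list functions, not compiler passes.

```ocaml
(* Reverse application: feed the value on the left to the function on the right. *)
let ( ++ ) x f = f x

(* Messages recorded by the toy dump pass, most recent first. *)
let log : string list ref = ref []

(* Simplified stand-in for pass_dump_if: optionally record a message,
   then return the value unchanged so the pipeline can continue. *)
let pass_dump_if flag msg v =
  if flag then log := msg :: !log;
  v

(* A toy three-stage pipeline over integer lists. *)
let run dump xs =
  xs
  ++ List.map (fun x -> x + 1)
  ++ pass_dump_if dump "After increment"
  ++ List.filter (fun x -> x mod 2 = 0)
  ++ pass_dump_if dump "After filtering"
  ++ List.fold_left ( + ) 0
```

Because (++) is left-associative, each line receives the result of everything above it, so the pipeline reads top to bottom in execution order. Since OCaml 4.01 the standard library provides the |> operator with the same meaning.
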