Chapter 11 Native-code compilation (ocamlopt)
This chapter describes the Objective Caml high-performance
native-code compiler ocamlopt, which compiles Caml source files to
native code object files and links these object files to produce
standalone executables.
The native-code compiler is only available on certain platforms.
It produces code that runs faster than the bytecode produced by
ocamlc, at the cost of increased compilation time and executable code
size. Compatibility with the bytecode compiler is extremely high: the
same source code should run identically when compiled with ocamlc and
ocamlopt.
It is not possible to mix native-code object files produced by ocamlopt
with bytecode object files produced by ocamlc: a program must be
compiled entirely with ocamlopt or entirely with ocamlc. Native-code
object files produced by ocamlopt cannot be loaded in the toplevel system.
11.1 Overview of the compiler
The ocamlopt command has a command-line interface very close to that
of ocamlc. It accepts the same types of arguments, and processes them
sequentially.
The output of the linking phase is a regular Unix or Windows
executable file. It does not need ocamlrun to run.
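As a minimal sketch of the compile and link phases described above, the following shell session compiles one module and links it into a native executable. The file name hello.ml, its contents and the scratch directory are illustrative assumptions, and the .o extension assumes a Unix system.

```shell
set -e
# Skip gracefully on platforms where the native-code compiler is not installed.
command -v ocamlopt >/dev/null 2>&1 || { echo "ocamlopt not found; skipping"; exit 0; }

workdir=$(mktemp -d)   # scratch directory (illustrative)
cd "$workdir"

# A one-line program; the file name hello.ml is an assumption of this sketch.
cat > hello.ml <<'EOF'
let () = print_endline "hello from native code"
EOF

ocamlopt -c hello.ml          # compile only: writes hello.cmi, hello.cmx and hello.o
ocamlopt -o hello hello.cmx   # link the object file into an executable
out=$(./hello)                # the executable runs directly, without ocamlrun
echo "$out"
```

The two steps can also be collapsed into a single invocation, ocamlopt -o hello hello.ml, which compiles and links in one go.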
11.2 Options
The following command-line options are recognized by ocamlopt.
The options -pack, -a, -shared, -c and -output-obj are mutually exclusive.
- -a
Build a library (.cmxa and .a/.lib files) with the object files
(.cmx and .o/.obj files) given on the command line, instead of
linking them into an executable file. The name of the library must be
set with the -o option.
If -cclib or -ccopt options are passed on the command
line, these options are stored in the resulting .cmxa library. Then,
linking with this library automatically adds back the
-cclib and -ccopt options as if they had been provided on the
command line, unless the -noautolink option is given.
- -annot
Dump detailed information about the compilation (types, bindings,
tail-calls, etc). The information for file src.ml
is put into file src.annot. In case of a type error, dump
all the information inferred by the type-checker before the error.
The src.annot file can be used with the emacs commands given in
emacs/caml-types.el to display types and other annotations.
- -c
Compile only. Suppress the linking phase of the
compilation. Source code files are turned into compiled files, but no
executable file is produced. This option is useful to
compile modules separately.
- -cc ccomp
Use ccomp as the C linker called to build the final executable
and as the C compiler for compiling .c source files.
- -cclib -llibname
Pass the -llibname option to the linker. This causes the given
C library to be linked with the program.
- -ccopt option
Pass the given option to the C compiler and linker. For instance,
-ccopt -Ldir causes the C linker to search for C libraries in
directory dir.
- -compact
Optimize the produced code for space rather than for time. This
results in slightly smaller but slightly slower programs. The default is to
optimize for speed.
- -config
Print the version number of ocamlopt and a detailed summary of its
configuration, then exit.
- -for-pack module-path
Generate an object file (.cmx and .o/.obj files) that can later be included
as a sub-module (with the given access path) of a compilation unit
constructed with -pack. For instance, ocamlopt -for-pack P -c A.ml
will generate A.cmx and A.o files that can later be used with
ocamlopt -pack -o P.cmx A.cmx.
- -g
Add debugging information while compiling and linking. This option is
required in order to produce stack backtraces when
the program terminates on an uncaught exception.
- -i
Cause the compiler to print all defined names (with their inferred
types or their definitions) when compiling an implementation (.ml
file). No compiled files (.cmx, .o/.obj and .cmi files) are produced.
This can be useful to check the types inferred by the
compiler. Also, since the output follows the syntax of interfaces, it
can help in writing an explicit interface (.mli file) for a file:
just redirect the standard output of the compiler to a .mli file,
and edit that file to remove all declarations of unexported names.
- -I directory
Add the given directory to the list of directories searched for
compiled interface files (.cmi), compiled object code files
(.cmx), and libraries (.cmxa). By default, the current directory
is searched first, then the standard library directory. Directories
added with -I are searched after the current directory, in the order
in which they were given on the command line, but before the standard
library directory.
If the given directory starts with +, it is taken relative to the
standard library directory. For instance, -I +labltk adds the
subdirectory labltk of the standard library to the search path.
- -inline n
Set the aggressiveness of inlining to n, where n is a non-negative
integer. Specifying -inline 0 prevents all functions from being
inlined, except those whose body is smaller than the call site. Thus,
inlining causes no expansion in code size. The default aggressiveness,
-inline 1, allows slightly larger functions to be inlined, resulting
in a slight expansion in code size. Higher values for the -inline
option cause larger and larger functions to become candidates for
inlining, but can result in a serious increase in code size.
- -intf filename
Compile the file filename as an interface file, even if its
extension is not .mli.
- -intf-suffix string
Recognize file names ending with string as interface files
(instead of the default .mli).
- -labels
Labels are not ignored in types; labels may be used in applications,
and labelled parameters can be given in any order. This is the default.
- -linkall
Force all modules contained in libraries to be linked in. If this
flag is not given, unreferenced modules are not linked in. When
building a library (-a flag), setting the -linkall flag forces all
subsequent links of programs involving that library to link all the
modules contained in the library.
- -noassert
Do not compile assertion checks. Note that the special form
assert false is always compiled because it is typed specially.
This flag has no effect when linking already-compiled files.
- -noautolink
When linking .cmxa libraries, ignore -cclib and -ccopt
options potentially contained in the libraries (if these options were
given when building the libraries). This can be useful if a library
contains incorrect specifications of C libraries or C options; in this
case, during linking, set -noautolink and pass the correct C
libraries and options on the command line.
- -nodynlink
Allow the compiler to use some optimizations that are valid only for code
that is never dynlinked.
- -nolabels
Ignore non-optional labels in types. Labels cannot be used in
applications, and parameter order becomes strict.
- -o exec-file
Specify the name of the output file produced by the linker. The
default output name is a.out under Unix and camlprog.exe under
Windows. If the -a option is given, specify the name of the library
produced. If the -pack option is given, specify the name of the
packed object file produced. If the -output-obj option is given,
specify the name of the output file produced. If the -shared option
is given, specify the name of the plugin file produced.
- -output-obj
Cause the linker to produce a C object file instead of an executable
file. This is useful to wrap Caml code as a C library,
callable from any C program. See chapter 18,
section 18.7.5. The name of the output object file is
camlprog.o by default; it can be set with the -o option.
This option can also be used to produce a compiled shared/dynamic
library (.so extension, .dll under Windows).
- -p
Generate extra code to write profile information when the program is
executed. The profile information can then be examined with the
analysis program gprof. (See chapter 17 for more
information on profiling.) The -p option must be given both at
compile-time and at link-time. Linking object files not compiled with
-p is possible, but results in less precise profiling.
Unix: See the Unix manual page for gprof(1) for more
information about the profiles.
Full support for gprof is only available for certain platforms
(currently: Intel x86/Linux and Alpha/Digital Unix).
On other platforms, the -p option will result in a less precise
profile (no call graph information, only a time profile).
The -p option does not work under Windows.
- -pack
Build an object file (.cmx and .o/.obj files) and its associated compiled
interface (.cmi) that combines the .cmx object
files given on the command line, making them appear as sub-modules of
the output .cmx file. The name of the output .cmx file must be
given with the -o option. For instance,
ocamlopt -pack -o P.cmx A.cmx B.cmx C.cmx
generates compiled files P.cmx, P.o and P.cmi describing a
compilation unit having three sub-modules A, B and C,
corresponding to the contents of the object files A.cmx, B.cmx and
C.cmx. These contents can be referenced as P.A, P.B and P.C
in the remainder of the program.
The .cmx object files being combined must have been compiled with
the appropriate -for-pack option. In the example above,
A.cmx, B.cmx and C.cmx must have been compiled with
ocamlopt -for-pack P.
Multiple levels of packing can be achieved by combining -pack with
-for-pack. Consider the following example:
ocamlopt -for-pack P.Q -c A.ml
ocamlopt -pack -o Q.cmx -for-pack P A.cmx
ocamlopt -for-pack P -c B.ml
ocamlopt -pack -o P.cmx Q.cmx B.cmx
The resulting P.cmx object file has sub-modules P.Q, P.Q.A
and P.B.
- -pp command
Cause the compiler to call the given command as a preprocessor
for each source file. The output of command is redirected to
an intermediate file, which is compiled. If there are no compilation
errors, the intermediate file is deleted afterwards.
- -principal
Check information path during type-checking, to make sure that all
types are derived in a principal way. All programs accepted in
-principal mode are also accepted in default mode with equivalent
types, but different binary signatures.
- -rectypes
Allow arbitrary recursive types during type-checking. By default,
only recursive types where the recursion goes through an object type
are supported. Note that once you have created an interface using this
flag, you must use it again for all dependencies.
- -S
Keep the assembly code produced during the compilation. The assembly
code for the source file x.ml is saved in the file x.s.
- -shared
Build a plugin (usually .cmxs) that can be dynamically loaded with
the Dynlink module. The name of the plugin must be
set with the -o option. A plugin can include a number of Caml
modules and libraries, and extra native objects (.o, .obj, .a,
.lib files). Building native plugins is only supported for some
operating systems. Under some systems (currently,
only Linux AMD 64), all the Caml code linked in a plugin must have
been compiled without the -nodynlink flag. Some constraints might also
apply to the way the extra native objects have been compiled (under
Linux AMD 64, they must contain only position-independent code).
- -thread
Compile or link multithreaded programs, in combination with the
system threads library described in chapter 24.
- -unsafe
Turn bound checking off for array and string accesses (the v.(i) and
s.[i] constructs). Programs compiled with -unsafe are therefore
faster, but unsafe: anything can happen if the program accesses an
array or string outside of its bounds. Additionally, turn off the
check for zero divisor in integer division and modulus operations.
With -unsafe, an integer division (or modulus) by zero can halt the
program or continue with an unspecified result instead of raising a
Division_by_zero exception.
- -v
Print the version number of the compiler and the location of the
standard library directory, then exit.
- -verbose
Print all external commands before they are executed, in particular
invocations of the assembler, C compiler, and linker.
- -version
Print the version number of the compiler in short form (e.g. 3.11.0),
then exit.
- -w warning-list
Enable or disable warnings according to the argument warning-list.
The argument is a set of letters. If a letter is
uppercase, it enables the corresponding warnings; lowercase disables
the warnings. The correspondence is the following:
- A/a: all warnings
- C/c: start of comments that look like mistakes
- D/d: use of deprecated features
- E/e: fragile pattern matchings (matchings that will remain
complete even if additional constructors are added to one of the
variant types matched)
- F/f: partially applied functions (expressions whose result has
function type and is ignored)
- L/l: omission of labels in applications
- M/m: overriding of methods
- P/p: missing cases in pattern matchings (i.e. partial matchings)
- S/s: expressions in the left-hand side of a sequence that don't
have type unit (and that are not functions, see F above)
- U/u: redundant cases in pattern matching (unused cases)
- V/v: overriding of instance variables
- Y/y: unused variables that are bound with let or as, and don't
start with an underscore (_) character
- Z/z: all other cases of unused variables that don't start with
an underscore (_) character
- X/x: warnings that don't fit in the above categories (except A)
The default setting is Aelz, enabling all warnings except fragile
pattern matchings, omitted labels, and innocuous unused variables.
Note that warnings F and S are not always triggered, depending on
the internals of the type checker.
- -warn-error warning-list
Turn the warnings indicated in the argument warning-list into
errors. The compiler will stop with an error when one of these
warnings is emitted. The warning-list has the same meaning as for
the -w option: an uppercase character turns the corresponding
warning into an error, a lowercase character leaves it as a warning.
The default setting is -warn-error a (none of the warnings is treated
as an error).
- -where
Print the location of the standard library, then exit.
- - file
Process file as a file name, even if it starts with a dash (-).
- -help or --help
Display a short usage summary and exit.
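Several of the options above are typically combined for separate compilation. The following sketch, under stated assumptions, uses -i (which prints the inferred interface) to draft an interface file, -c to compile separately, -a to build a library, and -o to name the outputs. The module names Greet and Main, the file contents and the scratch directory are hypothetical.

```shell
set -e
# Skip gracefully on platforms where the native-code compiler is not installed.
command -v ocamlopt >/dev/null 2>&1 || { echo "ocamlopt not found; skipping"; exit 0; }

workdir=$(mktemp -d)   # scratch directory (illustrative)
cd "$workdir"

# Two hypothetical modules: a small library module and a main program using it.
cat > greet.ml <<'EOF'
let hello name = "hello, " ^ name
EOF
cat > main.ml <<'EOF'
let () = print_endline (Greet.hello "world")
EOF

ocamlopt -i greet.ml > greet.mli     # print the inferred interface and keep it as greet.mli
ocamlopt -c greet.mli                # compile the interface to greet.cmi
ocamlopt -c greet.ml                 # compile the implementation to greet.cmx and greet.o
ocamlopt -a -o greet.cmxa greet.cmx  # build a library instead of an executable
ocamlopt -o main greet.cmxa main.ml  # compile main.ml and link it against the library

cat greet.mli                        # show the generated interface
out=$(./main)
echo "$out"
```

In a real project the generated greet.mli would be edited by hand to remove the declarations that should not be exported before it is compiled.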
Options for the IA32 architecture
The IA32 code generator (Intel Pentium, AMD Athlon) supports the
following additional option:
- -ffast-math
Use the IA32 instructions to compute
trigonometric and exponential functions, instead of calling the
corresponding library routines. The functions affected are:
atan, atan2, cos, log, log10, sin, sqrt and tan.
The resulting code runs faster, but the range of supported arguments
and the precision of the result can be reduced. In particular,
trigonometric operations cos, sin, tan have their range reduced to
Options for the AMD64 architecture
The AMD64 code generator (64-bit versions of Intel Pentium and AMD
Athlon) supports the following additional options:
- -fPIC
Generate position-independent machine code. This is
the default.
- -fno-PIC
Generate position-dependent machine code.
Options for the Sparc architecture
The Sparc code generator supports the following additional options:
- -march=v8
Generate SPARC version 8 code.
- -march=v9
Generate SPARC version 9 code.
The default is to generate code for SPARC version 7, which runs on all
SPARC processors.
11.3 Common errors
The error messages are almost identical to those of ocamlc.
See section 8.4.
11.4 Running executables produced by ocamlopt
Executables generated by ocamlopt are native, statically-linked,
stand-alone executable files that can be invoked directly. They do
not depend on the ocamlrun bytecode runtime system.
During execution of an ocamlopt-generated executable,
the following environment variables are also consulted:
- OCAMLRUNPARAM
Same usage as in ocamlrun
(see section 10.2), except that option l
is ignored (the operating system's stack size limit
is used instead).
- CAMLRUNPARAM
If OCAMLRUNPARAM is not found in the
environment, then CAMLRUNPARAM will be used instead. If
CAMLRUNPARAM is not found, then the default values will be used.
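As a hedged illustration of how these variables interact with a native executable, the sketch below builds a program with -g and runs it with OCAMLRUNPARAM set to b, the flag that the ocamlrun documentation describes as requesting a backtrace on an uncaught exception. The file name boom.ml and its contents are assumptions of this example, and the exact backtrace text depends on the platform and compiler version.

```shell
set -e
# Skip gracefully on platforms where the native-code compiler is not installed.
command -v ocamlopt >/dev/null 2>&1 || { echo "ocamlopt not found; skipping"; exit 0; }

workdir=$(mktemp -d)   # scratch directory (illustrative)
cd "$workdir"

# A hypothetical program that terminates on an uncaught exception.
cat > boom.ml <<'EOF'
let () = failwith "boom"
EOF

ocamlopt -g -o boom boom.ml   # -g records the debugging information used for backtraces

status=0
# The variable is set for this single invocation; stderr is captured for display.
err=$(OCAMLRUNPARAM=b ./boom 2>&1) || status=$?
echo "$err"
echo "exit status: $status"
```

Running the same executable without OCAMLRUNPARAM (and without CAMLRUNPARAM) in the environment uses the default settings, so only the uncaught-exception message is expected.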
11.5 Compatibility with the bytecode compiler
This section lists the known incompatibilities between the bytecode
compiler and the native-code compiler. Except on those points, the two
compilers should generate code that behaves identically.
- Signals are detected only when the program performs an
allocation in the heap. That is, if a signal is delivered while in a
piece of code that does not allocate, its handler will not be called
until the next heap allocation.
- Stack overflow, typically caused by excessively deep recursion,
is handled in one of the following ways, depending on the
platform used:
- By raising a Stack_overflow exception, like the bytecode
compiler does. (IA32/Linux, AMD64/Linux, PowerPC/MacOSX, MS Windows
32 bits.)
- By aborting the program on a “segmentation fault” signal.
(All other Unix systems.)
- By terminating the program silently.
(MS Windows 64 bits).
- On IA32 processors only (Intel Pentium, AMD Athlon, etc., in
32-bit mode), some intermediate results in floating-point computations
are kept in extended precision rather than being rounded to double
precision as the bytecode compiler always does. Floating-point
results can therefore differ between bytecode and native code; in
general, the results obtained with native code are “more exact”
(less affected by rounding errors and loss of precision).
- On the Alpha processor only, floating-point operations involving
infinite or denormalized numbers can abort the program on a
“floating-point exception” signal.
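Outside the points listed above, a simple way to spot-check that both compilers agree on a given program is to build it twice and compare the outputs. The sketch below is one such check under stated assumptions: the file name sum.ml and the output names sum.byte and sum.opt are hypothetical, and the program deliberately uses only integer arithmetic and shallow tail recursion so that none of the listed incompatibilities applies.

```shell
set -e
# Skip gracefully if any of the required tools is not installed.
for tool in ocamlc ocamlrun ocamlopt; do
  command -v "$tool" >/dev/null 2>&1 || { echo "$tool not found; skipping"; exit 0; }
done

workdir=$(mktemp -d)   # scratch directory (illustrative)
cd "$workdir"

# A hypothetical test program: sums the integers from 1 to 1000.
cat > sum.ml <<'EOF'
let rec sum n acc = if n = 0 then acc else sum (n - 1) (acc + n)
let () = Printf.printf "%d\n" (sum 1000 0)
EOF

ocamlc -o sum.byte sum.ml     # bytecode executable, run through ocamlrun
ocamlopt -o sum.opt sum.ml    # native executable, run directly

byte_out=$(ocamlrun ./sum.byte)
native_out=$(./sum.opt)

if [ "$byte_out" = "$native_out" ]; then
  echo "same output: $native_out"
else
  echo "outputs differ: $byte_out vs $native_out"
fi
```

A check like this only compares observable output for one input; it does not establish equivalence in general, in particular for programs that depend on signals, deep recursion or floating-point rounding.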