Frequently asked Questions about Caml

Contact the author Pierre.Weis@inria.fr

Created in October 1995.

Table of contents:

* Type checking

* Semantics

* Syntax

* Libraries

* Efficiency

Type checking

Error message: a type is not compatible with itself ?

You may obtain the message:
This expression has type ``some type'' but is used with type ``the same ``some type''''. This may occur very often when using the interactive system.

Explanation: you have defined two types with the same name. The compiler does not confuse the two types, but they are printed the same way. Consider for instance:

type counter = Counter of int;;

let x = Counter 1;;

Now suppose that the type counter is redefined, and that we then define a function that manipulates the new counter type. You cannot apply this function to x, since x has the old counter type:

type counter = Counter of int;;

let incr_counter c = match c with Counter x -> Counter (x + 1);;
incr_counter : counter -> counter = <fun>

incr_counter x;;
Interactive input:
>incr_counter x;;
>              ^
This expression has type counter,
but is used with type counter.

This phenomenon typically appears when you load the same file into the interactive system several times, since each reloading redefines the types. When using the batch compiler, the same message appears when dependencies between compilation units are not properly handled and an implementation that ought to be recompiled is left unchanged.

Solution: quit your interactive system and reload your files in a new session. If the message appears when compiling a file, recompile all the files that your file depends upon.

How to introduce polymorphism ?

Polymorphism appears when we introduce type schemes, that is, types with universally quantified type variables. These type parameters are written with a ' prefix (for instance 'a). All type parameters are universally quantified at the head of the scheme: 'a -> 'a means ``for all types 'a, 'a -> 'a''. This quantification of type parameters is implicit in Caml:

#map;;
- : ('a -> 'b) -> 'a list -> 'b list = <fun>

The type scheme of map means: for all types 'a and 'b, ('a -> 'b) -> 'a list -> 'b list.

Polymorphism is only introduced by the definition of names with a let construct (either local or global).

The problem and its solution:

The original rule for the local definition

let name = expr1 in expr2

states that you have to type-check expr1 first, then give name a type scheme (a polymorphic type) according to the type obtained for expr1, then type-check expr2, taking a copy of the type scheme of name for each occurrence of name. The scheme of name is obtained by finding its type parameters, that is, all the type variables that have been introduced during the type-checking of expr1 and that remain unconstrained at the end of this type-checking. For instance:

let id = function x -> x in ...

id gets the type scheme: For all 'a. 'a -> 'a, and can be used with several incompatible types in the rest of the program, for instance with types int -> int and bool -> bool, if the in part is id 1; id true.

Unfortunately this simple rule is not correct in the presence of mutable values (such as references, arrays, or records with mutable fields). Consider:

let bad_ref = ref [] in ...
With the original let rule, bad_ref has the type scheme ``For all 'a. 'a list ref''. So the reference can be used with different incompatible types in the rest of the program, which is erroneous (lists must be homogeneous in Caml). For instance, if you use bad_ref with types int list ref and bool list ref, then you can write:
let bad_ref = ref [] in
    bad_ref := [1];
    if hd (!bad_ref) then ...

(bad_ref := [1] is correctly type-checked, since bad_ref can be used with type int list ref; hd (!bad_ref) is also correctly type-checked, since bad_ref can be used with type bool list ref.) But this program leads to a fatal error at run time, since after the assignment bad_ref contains an integer list, not a boolean list.

The new rule to define names:

In short, a secure type system for Caml must forbid the existence of polymorphic mutable values at run-time. This has been extensively studied, and many complex systems have been proposed. It appears now that a very simple modification of the original rule is almost completely satisfactory. When type-checking

let name = expr1 ...

The type of expr1 is generalized when expr1 is a function, an identifier or a constant. Otherwise the identifier name is not polymorphic (type variables are not generalized).

Which programs are now monomorphic ?

The new rule implies that if expr1 is a function application, then the identifier name is monomorphic. For instance

let bad_ref = ref [] in ...

bad_ref is not polymorphic, and that's what we want. In contrast:

let map_id = map (function x -> x) in ...

Since map_id is defined by an application, its type is not generalized and map_id cannot be used with several different types. So,

#let map_id = map (function x -> x) in
map_id [1]; map_id [true];;
Toplevel input:
>map_id [1]; map_id [true];;
>                   ^^^^^^
This expression has type bool list,
but is used with type int list.

is ill-typed (the first use of map_id, that is map_id [1] implies that map_id has type int list -> int list, and map_id [true] is ill-typed). The next section explains how to get a polymorphic definition for map_id, and how to get a polymorphic result when a polymorphic function is partially applied.

My function is not polymorphic (enough) ?

How to allow polymorphic definitions obtained via partial application ?

The most common case of a ``not polymorphic enough'' definition is a function defined via partial application of a general polymorphic function. In this case, you recover a fully polymorphic definition by exhibiting the functionality explicitly to the type-checker: define the function with an explicit functional abstraction, that is, add a function construct or an extra parameter (this rewriting is known as eta-expansion):

let map_id l = map (function x -> x) l in ...
or alternatively
let map_id = function l -> map (function x -> x) l in ...
The new definition is semantically equivalent and can be assigned a polymorphic type scheme, since it is no longer a function application.

A type variable starts with _ ?

When an identifier is defined by an expression whose type cannot be generalized, the identifier can have a type with unknowns (that is, type variables which cannot be generalized) that will be determined (that is, assigned a type value) by the uses of the identifier in the rest of the program. These unknown types are special type variables, prefixed by an underscore (e.g. '_a). Beware: these variables stand for unknown types to be determined by the type-checking process; that is why they may disappear, precisely when their type value has been discovered.

let map_id = map (function x -> x) in
map_id [1];;

After its definition, map_id has type '_a list -> '_a list, where '_a means ``some unknown type''. When typing map_id [1], this unknown type gets bound to int (and map_id has type int list -> int list for the rest of the program).

This phenomenon appears even more clearly if we break the let ... in construct into two phrases:

#let map_id = map (function x -> x);;
map_id : '_a list -> '_a list = <fun>
#map_id [1];;
- : int list = [1]

Now the unknown '_a in the type of map_id is determined, and map_id has a type without unknowns:

#map_id;;
- : int list -> int list = <fun>

My program is not polymorphic (enough)?

Semantically, the definition of an identifier by a let construct is equivalent to a function application. More exactly, let ident = e1 in e2 is equivalent to (function ident -> e2) e1. This means that the two programs produce the same results and the same effects.

However, the two programs are not completely equivalent as far as type checking is concerned: the let version may introduce some polymorphism, while the function application version does not, and forces ident to be completely monomorphic when typing the expression e2. Thus:

#let id = function x -> x in id id;;
- : '_a -> '_a = <fun>
But:
#(function id -> id id) (function x -> x);;
Toplevel input:
>(function id -> id id) (function x -> x);;
>                   ^^
This expression has type 'a -> 'b,
but is used with type 'a.

How to write a function with polymorphic arguments ?

The argument of a function cannot be polymorphic inside the body of the function: every occurrence of the argument constrains its type. Thus one cannot write a functional that takes a function as argument and uses it with two different types. For instance, you can write a function that applies a polymorphic identity function:

#let id x = x;;
id : 'a -> 'a = <fun>
#let phi () = id 1, id true;;
phi : unit -> int * bool = <fun>
But you cannot abstract this identifier as an argument of the phi function:
#let phi id () = id 1, id true;;
Toplevel input:
>let phi id () = id 1, id true;;
>                         ^^^^
This expression has type bool,
but is used with type int.
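In recent versions of Objective Caml, one partial workaround is to wrap the polymorphic function in a record field with an explicitly polymorphic type; here is a minimal sketch (the id_wrapper type and its field name are purely illustrative):

type id_wrapper = {id : 'a. 'a -> 'a};;

let phi w () = (w.id 1, w.id true);;

phi {id = fun x -> x} ();;   (* evaluates to (1, true) *)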

My function is not polymorphic (enough) ?

A recursive function cannot be used polymorphically in the body of its definition. Every monomorphic use of the function has an impact on the final type of the defined function:

#let rec black_hole x = black_hole 0;;
black_hole : int -> 'a = <fun>
The black_hole function obviously loops, but the type that Caml infers is surprising: the argument x must be of type int, even though it is not used in the body of the function. Applying black_hole to an integer argument inside its own body constrains its type.

This is also clear for mutually recursive functions. For instance:

#let rec identity x = x
and identity_int x = (identity x : int);;
identity : int -> int = <fun>
identity_int : int -> int = <fun>
The definition of identity_int uses identity with type int -> int, and thus identity ends up monomorphic.

How to overcome this limitation ?

There is no completely satisfactory answer to this problem, but there are several partial solutions; one of them is sketched below.
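One simple partial workaround, sketched here under the assumption that the functions need not really be mutually recursive, is to define them in sequence: the first function is then generalized before the second one uses it.

let identity x = x;;
let identity_int x = (identity x : int);;
(* identity keeps the polymorphic type 'a -> 'a; identity_int has type int -> int *)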

How to share a label between two different record types ?

When you define two types sharing a label name, the last defined type hides the labels of the first type. For instance:

type point_3d = {x : float; y : float; z : float};;
type point_2d = {x : float; y : float};;

# {x = 10.; y = 20.; z = 30.};;
The record field label z belongs to the type point_3d
but is here mixed with labels of type point_2d

The simplest way to overcome this problem is simply ... to use different names! For instance

type point3d = {x3d : float; y3d : float; z3d : float};;
type point2d = {x2d : float; y2d : float};;

Another solution is to use modules:

module D3 = struct
 type point = {x : float; y : float; z : float}
end;;

module D2 = struct
 type point = {x : float; y : float}
end;;

This way, labels can be fully qualified, as in D3.x or D2.x:

# {D3.x = 10.; D3.y = 20.; D3.z = 30.};;
- : D3.point = {D3.x = 10.000000; D3.y = 20.000000; D3.z = 30.000000}
# {D2.x = 10.; D2.y = 20.};;
- : D2.point = {D2.x = 10.000000; D2.y = 20.000000}

You can also use objects that provide overloading on method names:

class point_3d ~x ~y ~z = object
  method x : float = x
  method y : float = y
  method z : float = z
end;;

class point_2d ~x ~y = object
  method x : float = x
  method y : float = y
end;;

Note that objects give you more than overloading: you can define truly polymorphic functions working on both point_3d and point_2d, and you can even coerce a point_3d to a point_2d.

How to define two types that share constructor names ?

Generally speaking you cannot. As for all other names, you must use distinct constructor names. However, you can define the two types in two different name spaces, i.e. in two different modules. As with the labels discussed above, you then obtain constructors that can be qualified by their module names.
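For instance, in Objective Caml, a minimal sketch with two modules:

module A = struct type t = Name of string end;;
module B = struct type t = Name of int end;;

let x = A.Name "x";;
let y = B.Name 0;;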

Alternatively you can use polymorphic variants, i.e. constructors that are, in some sense, predefined, since they are not defined by a type definition. For instance:

type ids = [ `Name | `Val ];;
type person = [ `Name of string ];;
# let (f : person -> string) = function `Name s -> s;;
val f : person -> string = <fun>
# let (is_name : ids -> bool) = function `Name -> true | _ -> false;;
val is_name : ids -> bool = <fun>

Semantics

Problems with data structures ?

Is it possible to manipulate bits ?

Yes, Caml offers functions that manipulate bits of the representation of integer numbers:

Reading the bits of a number

You can write an integer directly in binary by prefixing the sequence of its bits with 0b:

# 0b101;;
- : int = 5

Accessing the bits of a number

Using bitwise operators, you can access and set the bits of a number:

let nth_bit n i = (n lsr i) land 1;;
val nth_bit : int -> int -> int = <fun>
let set_nth_bit n i = n lor (1 lsl i);;
val set_nth_bit : int -> int -> int = <fun>
let clear_nth_bit n i = n land (lnot ((1 lsl i)));;
val clear_nth_bit : int -> int -> int = <fun>

Using those operators and character strings, you can also define bit arrays.
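For instance, here is a minimal sketch of a bit array in recent versions of Objective Caml, using the Bytes module as the byte array (the names make_bits, get_bit and set_bit are illustrative, not library functions):

(* A bit array packed into bytes; bit i lives in byte i / 8. *)
let make_bits n = Bytes.make ((n + 7) / 8) '\000';;

let get_bit b i =
  (Char.code (Bytes.get b (i / 8)) lsr (i mod 8)) land 1;;

let set_bit b i =
  let j = i / 8 in
  Bytes.set b j (Char.chr (Char.code (Bytes.get b j) lor (1 lsl (i mod 8))));;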

Is it possible to define data structures with fields that are mere bits ?

In the C language, we can declare fields whose size is a number of bits.
For example,
unsigned int a:1 ;,
unsigned int b:4 ; ....
The symbol ":" is used to set the number of bits.
Can I define OCaml variables in the same way?

No, you can't. The smallest unit for data representations is the machine word. You can also work at the level of bytes by using strings as byte arrays.

This is because I want to build a data header using as few bits as possible.
The style that I imagine is:
       type hd = { flag : int_1 ; on_off : int_1 ; seq_num : int_4 } ;;
       let header = { flag = 1 ; on_off = 0 ; seq_num = 0101 } ;;

The best you can do is to represent hd as an integer and define the constructor and accessor functions yourself:

type hd = int;;

let make_hd flag on_off seq_num =
 flag lor (on_off lsl 1) lor (seq_num lsl 2);;

let flag_hd h = h land 1;;

let on_off_hd h = (h lsr 1) land 1;;

let seq_num_hd h = (h lsr 2) land 0xF;;

let header = make_hd 1 0 0b0101;;

Is the equality function polymorphic ?

There are two equality functions in Caml, = and ==, both polymorphic with type 'a -> 'a -> bool. These two functions are different and must not be confused.

In short, == is the physical equality: it tests whether its two arguments occupy the same memory location (they are the same ``object'').

In contrast, = is the structural equality: it recursively compares the contents of its two arguments.

Examples:
Strings are allocated in the memory, so the semantic difference between = and == is evident:

#let s = "ok";;
s : string = "ok"
#s == s;;
- : bool = true
#s == "ok";;
- : bool = false
#s = "ok";;
- : bool = true
#"ok" == "ok";;
- : bool = false
#"ok" = "ok";;
- : bool = true

Comparing the behavior of the two predicates over mutable values is also interesting: == is more accurate than =, since = does not always give the same answer for the same arguments, as the following session shows.

#let x = ref 1 and y = ref 1;;
x : int ref = ref 1
y : int ref = ref 1
#x = y;;
- : bool = true
#x == y;;
- : bool = false
#x := 2;;
- : unit = ()
#x = y;;
- : bool = false
#x == y;;
- : bool = false

For floating point numbers == is not appropriate:

1.0 == 1.0;;
- : bool = false

This means that the two floating-point numbers are allocated separately in this case, hence not ==. For floating-point numbers, you must use =.

For lists == comparison is also strange:

#[1] == [1];;
- : bool = false
#[1] = [1];;
- : bool = true

Roughly speaking, it is always the same for values of immutable data structures: comparison with the = predicate leads to the intuitively expected results.

Equality and abstract data types

Values that belong to an abstract data type cannot be compared with the polymorphic predicates = or ==, but must be compared with a specific equality function (which should be given with the definition of the abstract data type): indeed, the results returned by the predefined equality predicates have no meaning for abstract data types, since the representation of values of an abstract data type has no necessary relationship with their semantic meaning. For instance, consider the equality test for rational numbers implemented as a product type; the rational numbers 1/1 and 2/2 are semantically equal, but the predicate = detects that their representations are different:

#type ratio = {numerator : int; denominator : int};;
Type ratio defined.
#{numerator = 1; denominator = 1} =
 {numerator = 2; denominator = 2};;
- : bool = false

Use the comparison defined by the module that defines the abstract data type (example). If this comparison is not provided by the module, there is no reliable means to compare values of this type.

How to compare streams ?

A common error is to compare streams with the = predicate: since the stream type is an abstract data type, comparison with = is meaningless (see details).

Invalid_argument "equal: functional value" ?

Functional values belong to an abstract data type: as a consequence you cannot compare functions with the predicates = or == (see details). Moreover, there is no predefined predicate to compare functional values, since there is no way to prove that two functions are equal (this is an undecidable problem). In some cases = detects such functional values and fails, raising the exception Invalid_argument.

#let id = function x -> x;;
id : 'a -> 'a = <fun>
#id = (function z ->
        let x = 1 in
        if x = 2 then z else id z);;
Uncaught exception: Invalid_argument "equal: functional value"

When faced with functional values, the == predicate tests the identity of the representations of the values (as usual): if == returns true, the functions are indeed equal; however, if == returns false, this only demonstrates the physical inequality of the two functions, and is by no means a proof that the functions are semantically different.

#(function x -> x) == (function x -> x);;
- : bool = false
#let f = function x -> x in f == f;;
- : bool = true

Equality defined for values of an abstract data type

The predefined equality predicate = fails as soon as it encounters a functional value during its recursive descent into the values being compared. For values of an abstract data type, this error can be surprising if you are not aware that the implementation of the type at hand uses functions to represent the data. For instance, the set module from the standard Caml Light library provides an implementation of sets through the abstract data type set__t. The representation of a set as a value of type set__t embodies the function used to compare the elements of the set. That's why any comparison of sets using the predicate = leads to an error:

#let rec set_of_list = function
  | x :: l -> set__add x (set_of_list l)
  | [] -> set__empty (prefix -);;
set_of_list : int list -> int set__t = <fun>
#set_of_list [0] = set_of_list [0];;
Uncaught exception: Invalid_argument "equal: functional value"

The solution is to use the corresponding predicate for sets provided by the set module:

#set__equal (set_of_list [0]) (set_of_list [0]);;
- : bool = true

Problems about pattern matching

Is it possible to have several patterns for a single clause (``or'' patterns) ?

Yes, you just list the patterns, separated by |. The clause is selected if one of the patterns matches (hence the name ``or'' patterns). For example:

#let rec fib = function
 | 0 | 1 -> 1
 | n -> fib (n - 1) + fib (n - 2);;

``or'' patterns are restricted: they cannot bind variables. You may only use patterns that bind no variables: constants, constructors (applied to such patterns), or the underscore wildcard. That way, the variables used in the expression part of the clause are always defined, whichever pattern is selected.

Are there interval patterns ?

Yes, but such patterns only exist for character intervals.

let is_upper = function
 | `A` .. `Z` -> true
 | _ -> false;;

The first pattern applies to any character from `A` to `Z`.
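In Objective Caml syntax, character constants are written with single quotes, so the same function reads:

let is_upper = function
  | 'A' .. 'Z' -> true
  | _ -> false;;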

How to write guards in patterns ?

Guards let you mix structural pattern matching with conditional tests. A pattern-matching clause with a guard applies only if the boolean condition in the guard evaluates to true. This kind of test cannot be expressed with pure pattern matching, since it may involve arbitrary computation on the values manipulated by the program.

A typical example is to test some condition before applying a clause, or to test the equality of two variables bound in the pattern (which allows you to test the equality of some components of the value at hand during the pattern matching).
Guards are introduced by the keyword when, followed by an expression (which must evaluate to a boolean) to be tested.

For example:

let rec boolean_or = function
 | x, y when x = y -> x
 | true, _ -> true
 | x, y -> boolean_or (y, x);;
The first clause applies only if the pair (x, y) has two equal components (that is x = y).

Partial match with guards ?

Guards add an extra subtlety to the pattern-matching exhaustiveness test, since the compiler treats guarded clauses as if they could always fail, because the guard may evaluate to false.
Here is a typical example:

I'm defining a simple function : two cases, one for n >= 0 and the other for n < 0. Yet the compiler warns for a ``non exhaustive pattern matching''. Why ? Is there some case where the pattern matching indeed fails ?
let f = function
 | n when n >= 0 -> n
 | n when n < 0 -> n + 1;;
Toplevel input:
>........function
> | n when n >= 0 -> n
> | n when n < 0 -> n + 1..
Warning: this matching is not exhaustive.
f : int -> int = <fun>

The pattern matching is exhaustive, but the compiler is not able to (statically) prove it. Since there is no restriction on the complexity of the expressions in guards, there is no way to figure out which results they may produce. In the example at hand, the compiler would have to prove that the two guards are mutually exclusive, and this is not possible in general.

Is it mandatory to add an extraneous (useless) clause as a catch-all ? Something like:
let f = function
 | n when n >= 0 -> n
 | n when n < 0 -> n + 1
 | _ -> failwith "impossible";;

No. To help the compiler, you just have to make explicit in your code that the second clause always applies if the first one fails. The best way is simply to remove the second guard, writing:

let f = function
 | n when n >= 0 -> n
 | n -> n + 1;;

A more elegant solution is to comment out the guard (the human reader may be happy to have it):

let f = function
 | n when n >= 0 -> n
 | n (* n < 0 *) -> n + 1;;

For such simple cases, one might think of adding a special case to the compiler to detect exhaustiveness in the presence of guards. Unfortunately this cannot be generalized to complex cases, such as this one:

let rec f = function
 | n when n >= 0 -> n
 | n when - f (- n) < 0 -> n + 1;;

Here, the pattern matching is exhaustive, but this is really (a bit) tricky to prove!

Note that the problem is intrinsically complex: even a rule as simple as ``if the same expression is tested with two mutually exclusive predicates, then the second guard always applies'' is false. Consider for instance:

let rec f = function
 | n when g (n + 1) >= 0 -> n
 | n when g (n + 1) < 0 -> n + 1

and g =
 let alea = ref true in
 (function n ->
    alea := not !alea;
    if !alea then -1 else 1);;

Even though the same expression g (n + 1) is tested to be greater than or equal to 0 in the first rule and less than 0 in the second rule, the pattern matching of f is far from exhaustive; it is even almost completely non-exhaustive!

How to define a sentinel in my array ?

An alternative way of asking the same question:
I need an ``impossible'' value, that can be used as a ``sentinel'', since my function may compute results that have no meaning, in case of strange arguments. Unfortunately this is not elegant and suppresses polymorphism ...

First of all, there is no ``impossible value'' in Caml. The Caml way to solve the problem is to explicitly define the impossible value, using a new type with two alternatives:

type 'a guarded_value =
   | Impossible
   | Possible of 'a;;

This way you can explicitly test whether a value is ``impossible'' or not. Note that the guarded_value type is isomorphic to the predefined ``option'' type, used for optional values:

type 'a option =
   | None
   | Some of 'a;;

This type is typically used to initialize data that contains parts which cannot yet be computed when the values are defined: the corresponding part (typically a mutable field in a record) is initialized with None and updated with the right value when it becomes available.
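For instance, here is a minimal sketch of such a deferred initialization (the node type is purely illustrative):

type node = {name : string; mutable parent : node option};;

let root = {name = "root"; parent = None};;
let child = {name = "child"; parent = None};;

(* The parent link becomes known only after both records exist: *)
child.parent <- Some root;;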

How to define a polymorphic sentinel ?

Another variant of the same question is:
``I thought to define a sum type 'a | Failed, and then every function could return an "union" of its normal result and Failed ?''

Caml provides a clean way to obtain this behavior: the exception mechanism. A function may return a regular result when possible, and raise an exception otherwise. Many predefined functions behave like this, and fail by raising either the exception Not_found (for functions that search for something), or the exception Failure when nothing more specific is appropriate.

For instance, suppose we are working with maps (finite mappings from one set to another). Consider the predicate that, given two keys cle1 and cle2, returns true if they have the same image in the map and false otherwise. We use the predefined function map__find and carefully catch its failures:

let map_compare cle1 cle2 table =
  try
    let image_cle1 = map__find cle1 table in
    let image_cle2 = map__find cle2 table in
    image_cle1 = image_cle2
  with Not_found -> false;;

Is there lazy operators in Caml ?

The first lazy operator is obviously the conditional, the ``if then else'' construct, since the expressions in the ``then'' and ``else'' parts are not systematically evaluated.

Furthermore, boolean operators are defined using ``if then else'':

e1 && e2 means if e1 then e2 else false
e1 || e2 means if e1 then true else e2

thus, these operators inherit a lazy behavior from the conditional: if e1 is false then e1 && e2 evaluates to false, and e2 is not evaluated. There is no Caml function having the same lazy behavior, since the arguments of a Caml function are always fully evaluated before the function is called (call by value).

So, there is no Caml function f such that f e1 e2 is equivalent to e1 && e2. However, the prefix operator corresponding to && still exists. In effect, as for all other predefined infixes, the prefix function f corresponding to an operator op is defined as:

let f e1 e2 = e1 op e2;;
For boolean operators this rule still applies, and we get:
let (prefix &&) e1 e2 = e1 && e2;;
which evidently means:
let (prefix &&) e1 e2 = if e1 then e2 else false;;
So, be aware that, due to this lazy behavior of boolean operators, (prefix &&) e1 e2 is not absolutely equivalent to e1 && e2: (prefix &&) e1 e2 always evaluates its two arguments, and it is the strict version of boolean &&.

This discussion vanishes if the predefined boolean operators have been rebound to a regular Caml function: in this case infix and prefix versions of the operators are completely equivalent.
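If you really need the lazy behavior as an ordinary function, you can delay evaluation yourself by passing thunks (functions of type unit -> bool); a minimal sketch, with illustrative names:

let lazy_and e1 e2 = if e1 () then e2 () else false;;

(* The second thunk is never evaluated here, so no exception is raised: *)
lazy_and (fun () -> false) (fun () -> failwith "not evaluated");;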

How to define lazy lists ?

(* Lazy lists *)
type 't llist =
   | Nil
   | Cons of 't cell
and 't cell = { hd : 't ; tl : unit -> 't llist}
;;

let rec lmap f = function
  | Nil -> Nil
  | Cons {hd = x; tl = t} ->
     Cons {hd = f x; tl = (fun () -> lmap f (t ()))};;

let rec ldo_list f n = function
  | Nil -> ()
  | Cons {hd = x; tl = t} ->
     if n > 0 then begin f x; ldo_list f (n - 1) (t ()) end;;

let rec nat = Cons {hd = 0; tl = (fun () -> lmap succ nat)};;

ldo_list print_int 10 nat;;

nat;;

(* With sharing of the computation of the tail of the list *)
type 't llist =
   | Nil
   | Cons of 't cell
and 't cell = { hd : 't ; mutable tl : unit -> 't llist}
;;

let rec lmap f = function
  | Nil -> Nil
  | Cons ({hd = x; tl = t} as cell) ->
     Cons {hd = f x;
           tl = (function () ->
                  let tail = t () in
                  cell.tl <- (fun () -> tail);
                  lmap f tail)};;

let rec ldo_list f n = function
  | Nil -> ()
  | Cons ({hd = x; tl = t} as cell) ->
     if n > 0 then begin
      f x;
      let tail = t () in
      cell.tl <- (fun () -> tail);
      ldo_list f (n - 1) tail end;;

let rec nat = Cons {hd = 0; tl = (fun () -> lmap succ nat)};;

ldo_list print_int 5 nat;;
nat;;

With explicit thunks:

type 'a frozen =
   | Freeze of (unit -> 'a)
   | Val of 'a;;

type 'a llist =
   | Nil
   | Cons of 'a cell

and 'a cell = { hd : 'a ; mutable tl : 'a llist frozen}
;;

let rec lmap f = function
  | Nil -> Nil
  | Cons {hd = x; tl = Val l} ->
     Cons {hd = f x; tl = Freeze (function () -> lmap f l)}
  | Cons ({hd = x; tl = Freeze th} as cell) ->
     let thunk () =  
       let tail = th () in cell.tl <- Val tail; lmap f tail in
     Cons {hd = f x; tl = Freeze thunk};;

let rec nat = Cons {hd = 0; tl = Freeze (fun () -> lmap succ nat)};;

let rec ldo_list f n = function
  | Nil -> ()
  | Cons {hd = x; tl = Val l} ->
     if n > 0 then begin
      f x;
      ldo_list f (n - 1) l end
  | Cons ({hd = x; tl = Freeze th} as cell) ->
     if n > 0 then begin
      f x;
      let tail = th () in
      cell.tl <- Val tail;
      ldo_list f (n - 1) tail end;;

(* Example *)
#let rec nat = Cons {hd = 0; tl = Freeze (fun () -> lmap succ nat)};;
nat : int llist = Cons {hd=0; tl=Freeze <fun>}
#ldo_list print_int 5 nat;;
01234- : unit = ()
#nat;;
- : int llist =
 Cons
  {hd=0;
   tl=
    Val
     (Cons
       {hd=1;
        tl=
         Val
          (Cons
            {hd=2;
             tl=Val (Cons {hd=3; tl=Val (Cons {hd=4; tl=Freeze <fun>})})})})}

Problems with streams ?

Problems with printing

Toplevel output is truncated ?

The toplevel pretty-printer may have to print cyclic values, so it uses a truncation mechanism to ensure termination. This truncation involves two counters: print_depth measures the nesting depth of the values to print (for instance an element within a list within a list has depth 2); print_length measures the number of nodes printed so far (for instance each element printed in a list increments print_length). The default value of print_depth is 100, and the default value of print_length is 300. You can change these values using the functions set_print_depth and set_print_length.
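For instance, to raise both limits in a toplevel session (assuming, as their names suggest, that each function takes the new limit as an integer argument):

set_print_depth 500;;
set_print_length 1000;;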

The output from the toplevel is all mixed up ?

If you use the directive install_printer to create a pretty-printer for your values, then you must use the formatting primitives and printing functions from the format module to write your pretty-printer. Otherwise, output from your pretty-printer and from the Caml system will not be synchronized. In effect, the system uses the ``format'' module and its pretty-printing engine, which delays the actual output of the material, since the pretty-printing engine records the printing commands, then decides where lines have to be cut, and then really prints the enqueued items.
If you use low-level primitives, your output may appear before the output of the pretty-printing engine, even though your printing commands were executed after the printing commands sent to the engine. Suppose for instance that I declare io__print_string (the low-level print_string primitive from the io module) as the default printer for character strings. Then strings are output too early, before the rest of the toplevel output:

#"coucou";;
- : string = "coucou"
#install_printer "io__print_string";;
- : unit = ()
#"coucou";;
coucou- : string = 

When tracing, the output is mixed up ?

If you use the install_printer directive to create a pretty-printer for your values, then you must write it using the format library. This way, you properly synchronize your output and the output from the trace system, as in the case of interactive use of the Caml system (see a detailed explanation or an example using the pretty-printer when tracing).

Mutable values

References versus mutable fields

Since Caml offers references, and more generally mutable values such as vectors, one may wonder what the point of mutable record fields is, since instead of a mutable field one can always define an immutable field that contains a mutable value (say, a reference). This is almost true, but as a rule of thumb you should generally prefer a mutable field, in particular because it saves an extra indirection (an extra memory access) each time you access the value. The semantic difference is subtle and involves sharing: the reference stored in the immutable field of a record may be physically shared by several records, so every mutation of this reference cell is transparently seen through all these records. To obtain the same behavior with a mutable record field, the modification would have to be performed on each record.
In practice, this shared behavior is rarely the one you want; it is more likely to be a bug. As an example, consider the account type defined with references as:

#type account = {number : int; balance : float ref};;
Type account defined.
#let balance1 = ref 0.0;;
balance1 : float ref = ref 0.0
#let account1 = {number = 1; balance = balance1}
and account2 = {number = 2; balance = balance1};;
account1 : account = {number = 1; balance = ref 0.0}
account2 : account = {number = 2; balance = ref 0.0}
Accounts account1 and account2 now share the same balance: every assignment to the balance of account1 also changes the balance of account2. Although this can be deduced from the definition of the two accounts, this behavior looks like a real bug.
#account1.balance := 1.0;;
- : unit = ()
#account2;;
- : account = {number = 2; balance = ref 1.0}
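For comparison, here is a sketch of the same account type written with a mutable field; with this definition each account owns its balance, and no accidental sharing can occur:

type account = {number : int; mutable balance : float};;

let account1 = {number = 1; balance = 0.0}
and account2 = {number = 2; balance = 0.0};;

account1.balance <- 1.0;;
(* account2.balance is still 0.0 *)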
Defining mutable record fields that contain mutable values is not forbidden, but examples that need this kind of hairy definition are not common (see this example).

A system error occurs during my program execution (bus error, segmentation fault, general protection fault) ?

Normally this should not occur; however, it may still happen in unusual circumstances, due to a programming error or to a particular OS or processor.

In most cases, you should use the Caml debugger to find precisely where in your code the error occurs, and then correct the program.

Syntax

Why is the syntax of Caml different from SML's one ?

The main reason is historical: Caml was invented before SML.

The Caml language descends from the original ML language, invented by Robin Milner in 1978 and developed at INRIA from 1981. The first Caml compiler was written in 1984 and distributed from 1985. Standard ML had not yet been invented, and Caml's syntax was similar to the syntax of ML V6.2, whose compiler was used to compile the first Caml compiler. When the syntax of Standard ML was defined, Caml kept its own syntax, for several reasons. The first reason is questionable and subjective: we thought that SML's syntax was not a major improvement over the original ML syntax (hence over Caml's). The second, more fundamental, reason is that we thought the language was not mature enough at that time to be completely defined and fixed by a standard; we are still making progress on the understanding of some features of ML, and this progress may imply important modifications of the language (for instance the type checking of imperative features, or the semantics and type checking of modules). In addition, we wanted to be free to add new constructs (and indeed we did, e.g. ``for'' loops, ``try with'' and ``match with'', character constants, access to vectors and strings, ``mutable'' annotations for records, ``when'' clauses in pattern matching). This relative freedom with respect to the syntax allows the emergence of new variants of Caml: for instance the Caml Light and Objective Caml systems have their own extensions to the core syntax of Caml.

Hence Caml has its own syntax.

Non closed comment ?

To ensure that any piece of Caml program can be commented out, Caml comments are analyzed during the lexical phase of parsing. This means that the lexical conventions of Caml must be carefully followed inside comments. The main problem is strings: a double quote inside a comment is understood as the beginning of a string, and the lexical analyser will look forward to find a closing double quote for the string. For instance:

(* Nai"ve sorting algorithm *)
is a ``never ending'' comment. In contrast, the following text is a single comment:
(* And what happens when s contains the two characters
   * followed by ) ?
let comment_string s =
  print_string "(*"; print_string s; print_string "*)";;
*)

Why having two symbols := and <- to assign mutable values ?

The reason is a semantic ambiguity that arises when one mixes references and mutable record fields:

type crazy = {mutable r : int ref};;

let x = {r = ref 0};;
Now consider the two assignments:
  1. x.r := expression
  2. x.r <- expression

(1) clearly means: change the contents of the reference (so expression must have type int).
(2) means: replace the reference cell contained in the r field of the record by a new one (so expression must have type int ref).
If the same syntax were used for both operations, the expression would be ambiguous.

Libraries

What is the equivalent of explode : string -> char list and the converse implode : char list -> string ?

There are no equivalent primitives in the library. You can use the following:

In Caml Light
let explode s =
  let rec exp i l =
   if i < 0 then l else exp (i - 1) (s.[i] :: l) in
  exp (string_length s - 1) [];;
let implode l =
 let res = create_string (list_length l) in
 let rec imp i = function
  | [] -> res
  | c :: l -> res.[i] <- c; imp (i + 1) l in
 imp 0 l;;
In Objective Caml:
let explode s =
  let rec exp i l =
    if i < 0 then l else exp (i - 1) (s.[i] :: l) in
  exp (String.length s - 1) [];;
let implode l =
  let res = String.create (List.length l) in
  let rec imp i = function
  | [] -> res
  | c :: l -> res.[i] <- c; imp (i + 1) l in
  imp 0 l;;
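Note that recent versions of Objective Caml deprecate String.create and in-place string modification; assuming you use such a version, implode can be written with the Bytes module instead (explode is unchanged):

let implode l =
  let res = Bytes.create (List.length l) in
  let rec imp i = function
    | [] -> Bytes.to_string res
    | c :: l -> Bytes.set res i c; imp (i + 1) l in
  imp 0 l;;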

printf does not print everything ? parts of the format string are forgotten ?

When I mix Printf.printf and List.iter, I get strange results:
if I change (fun i -> Printf.printf " ** %i" i) into (Printf.printf " ** %i") as the argument of List.iter, the beginning of the format string (before %i) is forgotten (except at the first iteration).
I should obtain the same result with the two following definitions:
    let no_problem l =
      List.iter (fun i -> Printf.printf " ** %i" i) l;
      print_newline ();;

    let problem l =
      List.iter (Printf.printf " ** %i") l;
      print_newline ();;
However,
    let l = [0; 1; 2; 3; 4] in
    no_problem l;
    problem l;;

     ** 0 ** 1 ** 2 ** 3 ** 4
     ** 01234

This behavior is normal, even if it can be surprising at first glance. The statement ``It should be the same'' is based on the assumption that

fun i -> Printf.printf " ** %i" i
and
Printf.printf " ** %i"
are two equivalent expressions, since this is the only difference between the two programs, problem and no_problem. At first glance you may think that these two expressions are equivalent, but in fact they are not.
Observe what happens, if we bind these two expressions to f and g:
let f = fun i -> Printf.printf " ** %i" i;;
val f : int -> unit = <fun>

let g = Printf.printf " ** %i";;
 ** val g : int -> unit = <fun>
As expected, the definition of f does not output anything; in contrast, the definition of g prints something!
The phenomenon is clear if you know the semantics of printf:

Printf.printf " ** %i" is a shorthand for: print_string " ** "; fun i -> print_int i.

Now, the definition of g is equivalent to

let g = print_string " ** "; fun i -> print_int i;;
and so it prints the string " ** " before binding g to the function fun i -> print_int i.

The same scheme, recursively applied to the format string, gives you the meaning of complex applications of printf to complex format strings.
Now, we can see that the expression printf "1: %i 2: %s 3: %f." is in fact equivalent to

print_string "1: ";
(fun i ->
   print_int i;
   print_string " 2: ";
   (fun s ->
      print_string s;
      print_string " 3: ";
      (fun f ->
         print_float f;
         print_string ".")))
This explicit meaning of printf explains the behavior of the following definitions:
let h = Printf.printf "1: %i 2: %s 3: %f." 1;;
1: 1 2: val h : string -> float -> unit = <fun>
and
let i = h "two";;
two 3: val i : float -> unit = <fun>

i 4.0;;
4.000000.- : unit = ()

In conclusion: avoid partial applications of printf (use the explicit functional form fun x -> printf "..." x when passing printf "..." as an argument to some other function). You can of course ignore this advice and use partial applications of printf, provided you are a ``printf expert'' who knows perfectly well what they are doing and bets that the reader of the program will also understand the complex machinery at work in this part of the code!

How to save and restore the graphical screen ?

You must create an image and save it on the disk using output_value. For instance:

(* Write an image to the file ``file_name'' *)
let output_image im file_name =
 let oc = open_out_bin file_name in
 let imv = dump_image im in
 output_value oc imv;
 close_out oc;;

(* Read an image from the file ``file_name'' *)
let input_image file_name =
 let ic = open_in_bin file_name in
 let im = make_image (input_value ic) in
 close_in ic;
 im;;

(* Save the actual contents of the screen. *)
let save_screen file_name =
 let im = get_image 0 0 (size_x ()) (size_y ()) in
 output_image im file_name;;

(* Restore the contents of the screen, reading it from a file. *)
let restore_screen file_name =
 let im = input_image file_name in
 draw_image im 0 0;;
Note that the size of the image depends on the size of the screen; it can be as large as several megabytes.

How to do graphics under Unix ?

As on all operating systems, you must ask for the functions of the ``graphics'' library to be available, by writing in your program:

#open "graphics";;

However, in the regular interactive system the library is not linked in, and thus any call to the library functions fails during the linking phase: the compiler complains that no ``graphics'' module has been linked.

So, the problem is to call an interactive system with the library already linked in. Fortunately, such a toplevel is built when the graphics library is installed. You just have to call this special interactive system, using the command:

$ camllight camlgraph

How to compile a stand-alone program with graphics ?

As in the interactive system case (see above), you have to link the graphics library with your program. This means using the ``custom'' mode of the compiler, to get the C primitives of the library. Use:

        camlc -custom <other options> \
              unix.zo graphics.zo <other .zo and .ml files> \
              -lgraph -lunix -lX11

Note: the order of presentation of options is relevant, casual users should not modify it.

Impossible to read back a value using input_value, the value is ``truncated'' ?

When you manipulate Caml values on disk, the files you write and read are not text files but binary files. That means that the contents of these files are arbitrary characters, that carriage return characters must not be converted, and that there is no character to indicate the end of the file. So, you must open these files with the special primitives open_in_bin and open_out_bin, which do not interpret the characters they manipulate (in particular the ^Z character under DOS, which means end of text file). Otherwise, the first occurrence of a ^Z character tells the system that the end of file has been reached, before the real end of the file; hence the ``truncated object'' error message produced by input_value. See How to save and restore the graphical screen ? for a complete example.

How to recompile a bunch of Caml files ?

Caml modules allow separate compilation. If available on your system, you should use the make utility to recompile the modules of your programs. You may use our generic makefile for Caml Light or for Objective Caml.

We also describe here a simple makefile for a small Caml Light program (only two modules, module1 and module2, and we generate a command named prog):

CAMLC=camlc -W -g -I .

OBJS= module1.zo module2.zo
EXEC= prog

all: $(OBJS)
	$(CAMLC) -o $(EXEC) $(OBJS)

clean:
	rm -f *.z[io] *.zix *~ #*#

depend:
	mv Makefile Makefile.bak
	(sed -n -e '1,/^### DO NOT DELETE THIS LINE/p' Makefile.bak; \
	 camldep *.mli *.ml) > Makefile
	rm Makefile.bak

.SUFFIXES:
.SUFFIXES: .ml .mli .zo .zi

.mli.zi:
	$(CAMLC) -c $<
.ml.zo:
	$(CAMLC) -c $<

### EVERYTHING THAT GOES BEYOND THIS COMMENT IS GENERATED
### DO NOT DELETE THIS LINE

Dependencies between modules are automatically recomputed by running make depend. The camldep command can be the following Perl script:

#!/usr/local/bin/perl

# To scan a Caml Light source file, find all references to external modules
# (#open "foo";; or foo__bar), and output the dependencies on standard output.
#
# Usage:    camldep [-I path] <file> ...

while ($#ARGV >= 0) {
  $_ = shift(@ARGV);
  if (/^-I(.*)$/) {
    $dir = $1 ? $1 : shift(@ARGV);
    $dir =~ s|/$||;
    unshift(@path, $dir);
  }
  elsif (/(.*)\.mli$/ || /(.*)\.zi$/) {
    do scan_source ($_, "$1.zi");
  }
  elsif (/(.*)\.ml$/ || /(.*)\.zo$/) {
    do scan_source ($_, "$1.zo");
  }
  else {
    die "Don't know what to do with $_";
  }
}

sub scan_source {
  local ($source_name, $target_name) = @_;
  $modname = $target_name;
  $modname =~ s|^.*/||;
  $modname =~ s|\.z[io]$||;
  undef(%imports);
  open(SRC, $source_name) || return;
  while(<SRC>) {
    if (m/#\s*open\s*"([^"]*)"/) {
      $imports{$1} = 1;
    }
    while (m/([a-zA-Z0-9_]+)__/g) {
      $imports{$1} = 1;
    }
  }
  close(SRC);
  undef(@deps);
  if ($target_name =~ m/\.zo$/ && -r ($source_name . "i")) {
    push(@deps, "$1.zi");
  }
  foreach $modl (keys(%imports)) {
    next if ($modl eq $modname);
    if ($dep = do find_path ("$modl.mli")) {
      $dep =~ s/\.mli$/.zi/;
      push(@deps, $dep);
    }
    elsif ($dep = do find_path ("$modl.ml")) {
      $dep =~ s/\.ml$/.zo/;
      push(@deps, $dep);
    }
  }
  if ($#deps >= 0) {
    print "$target_name: ";
    $col = length($target_name) + 2;
    foreach $dep (@deps) {
      $col += length($dep) + 1;
      if ($col >= 77) {
        print "\\\n    ";
        $col = length($dep) + 5;
      }
      print $dep, " ";
    }
    print "\n";
  }
}

sub find_path {
  local ($filename) = @_;
  if (-r $filename) {
    return $filename;
  }
  foreach $dir (@path) {
    if (-r "$dir/$filename") {
      return "$dir/$filename";
    }
  }
  return 0;
}

Is there an interface between Caml and LaTeX ?

An interface between Caml and LaTeX is distributed in the contrib/caml-tex directory, as a command caml-tex. This is a filter that extracts Caml phrases embedded in its LaTeX file argument, evaluates them, and inserts the outcome of the evaluation into its LaTeX output file. (The filter is written in Perl, so you'll need Perl version 4 installed on your machine.)

Is there an interface between Caml and Lex ?

Yes. See the corresponding section of the reference manual.

Is there an interface between Caml and Yacc ?

Yes. See the corresponding section of the reference manual.

Efficiency

Cost of basic operations ?

It is difficult to get a reliable idea of the cost of basic operations, since this cost depends not only on the machine, but also on the Caml compiler: an optimizing native-code compiler does not behave like a byte-code compiler for a virtual machine. Not only are the absolute costs of basic operations different, but their relative costs may vary by several orders of magnitude. Even with the same Caml system and the same compiler, the relative costs vary greatly with the underlying hardware.

The situation is even worse with modern architectures, since processors are now capable of running several operations at the same time (in parallel), so that some instructions turn out to be ``free'', in the sense that their cost is zero when they are executed during the execution of another instruction, either because the processor had the opportunity to launch several operations in parallel, or because the processor was waiting for the completion of a complex operation independent of the result of the ``free'' instruction. So the time needed to execute a given instruction varies with the moment when the instruction is ready to be executed. It is then clear that it is almost hopeless to assign a fixed cost to an assembly instruction, let alone to the basic operations of a high-level language such as Caml.

However, some basic principles are invariant under these hardware variations and timing hazards in the scheduling of a program's instruction flow. For instance, integer addition is almost always the fastest operation available on the computer (modern computers execute many millions of additions per second).

Thus, I give a rough indication of the cost of basic operations in terms of conventional units, those units being the cost of elementary operations (for instance integer addition). For some basic operations the execution time depends on the size of the arguments, so the corresponding complexity is mentioned. In any case, the proposed costs are just plausible orders of magnitude, given as mere approximations: don't use these figures to compute the expected run times of your programs!

Rough ideas

Using a byte-code compiler: all operations have almost the same constant cost, that of an integer addition, since a constant time is spent decoding the bytes of the program.

The complexity indications given for operations that do not have a constant cost are independent of the compiler.

Using an optimizing compiler: the operation costs are more closely related to the hardware capabilities. That is the hypothesis behind the relative execution times given below.

In short:
Note that a function call is really fast in Caml: it often costs half as much as an integer multiplication.
In contrast, string and vector concatenation (^, concat_vect or Array.concat) and list concatenation (@) do not have a constant cost: it is proportional to the length of the result.

Equality tests on integers and floating-point numbers have a constant and very low cost, and the same holds for the physical equality ==. In contrast, the generic polymorphic equality has an a priori unbounded cost, for instance linear in the length of the arguments when comparing strings.

Units

We use the following conventional and formal units:

Relative costs of measurement units

The relative value of those units depends on the hardware and on the Caml compiler.

With a byte-code compiler: you can consider that all units are equivalent.

With a native-code compiler: the relative costs of the units, carefully measured on several machines accessible from my workstation, are approximately as follows:

Costs for Caml operations

Loops or recursive functions ?

A recursive function may easily encode a loop. For instance the loop for i = beginning to the_end do body done is equivalent to

let rec for_loop i =
 if i <= the_end then begin body; for_loop (i + 1) end in
for_loop beginning
The choice between those two forms is a matter of style: the second (recursive) form can be useful when the loop has to return a result. Otherwise the for loop is clearer: it is more concise and the bounds of the loop are clearly apparent. However,
let rec for_loop i =
 if i <= the_end then
   if v.(i) = 0 then i else for_loop (i + 1)
 else -1 in
for_loop beginning
may be considered clearer than an implementation with a loop that raises an exception:
exception Exit of int;;
try
 for i = beginning to the_end do
  if v.(i) = 0 then raise (Exit i)
 done;
 -1
with Exit i -> i

On the other hand, compilers may have difficulty recognizing a loop in the encoding via a recursive function; as a consequence, this version may be slightly less efficient. Choose the simpler way to write your programs!

How to write functions with more than one argument ?

There is no syntactic way to define functions with multiple arguments in Caml. Thus, the user has to encode them either via currying or via tupled arguments:

let f (x, y) = x + y;;
or
let f x y = x + y;;

From the expressiveness point of view, the curried encoding leads to functions that can be partially applied, whereas a tupled argument must be provided all at once.
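For instance, with the curried form you can fix the first argument and obtain a new function:

let add x y = x + y;;
let add1 = add 1;;   (* partial application: add1 : int -> int *)
add1 41;;            (* evaluates to 42 *)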

For efficiency, Caml compilers (generally) favor currying over tupling.

Let me explain why the naive compilation of these two encodings is not efficient. Suppose that f is defined as let f x y = body, or equivalently as

   let f = function x -> function y -> body.

As is explicit in this expanded form, f can be partially applied: if e1 evaluates to v1, then the expression f e1 evaluates to a closure with code part (function y -> body) and environment part (x, v1). Thus the naive compilation of f e1 e2 (or equivalently (f e1) e2) starts by calling f with argument v1; f builds and returns the intermediate closure, and this closure is then applied to (the value of) e2. This is inefficient, since we perform two function calls and waste memory building an intermediate closure which is immediately applied and becomes garbage.

Now suppose f is defined as let f (x, y) = body; the naive compilation of f (e1, e2) evaluates e1 and e2, allocates the pair of their values, and then calls f with this pair as its single argument.

Here we perform only one function call, but once again we build an intermediate data structure (the pair) to call the function, and this data structure becomes garbage as soon as the function is entered.

A clever way of compiling one of these two forms is necessary, since functions with multiple arguments are heavily used. This is a bit difficult, but the idea is mainly the following:

Since curried functions are a bit more powerful, because they can be partially applied, we may suppose that users will adopt this way of encoding multiple arguments. Thus, generally speaking, Caml compilers optimize the curried form of encoding. These optimizations are not always as efficient as the scheme described above, but the compilation of currying is definitely better than that of tupling for Caml Light, Objective Caml, Bigloo and Camlot. Caml compilers generally do not optimize tupling.

let f x = let rec f_rec ... or let rec f x ?

Which of the following is more efficient ?

   let map f =
     let rec mapf l =
       match l with
       | [] -> []
       | x :: t -> let y = f x in y :: mapf t in
     mapf;;
or
   let rec map f l =
     match l with
     | [] -> []
     | x :: t -> let y = f x in y :: map f t;;
The answer is: this is compiler dependent!

The first definition is a strange view of map as a (non-recursive) function that computes a looping (recursive) function. The second definition is more natural and properly exposes that map has two arguments. If there are no efficiency reasons to prefer the first form, I definitely prefer the second, ``natural'', definition.

To discuss efficiency, you must be aware of the encoding and compilation of n-ary functions in Caml (see details).

The naive form:

   let rec map f l =
     match l with
     | [] -> []
     | x :: t -> let y = f x in y :: (map f t);;

exposes that map has two arguments. Thus a compiler that optimizes complete application of curried functions will efficiently compile the recursive call. On the other hand, a naive compiler will build the partial intermediate closure for each recursive call.

On the other hand, with the first encoding:

   let map f =
     let rec mapf l =
       match l with
       | [] -> []
       | x :: t -> let y = f x in y :: (mapf t) in
     mapf;;

the optimizing compiler may fail to recognize that map may be optimized as a curried function with two arguments: it will instead compute a closure for mapf, and generally speaking a closure is less efficiently applied.

An interesting case: the Bigloo compiler performs a so-called ``eta'' optimization that automatically rewrites this code into (an equivalent of) the natural form above!

A naive compiler will compile naively as usual, but now the code only needs to compute the intermediate closure once, when map is partially applied to f. Hence recursive calls to mapf save this extra closure construction, and with such a compiler this encoding will be a bit more efficient.

In conclusion: prefer the natural encoding, it is clearer and compilers are designed to optimize natural code.

Why isn't my constant statically allocated ?

Consider the code

let gamma x =
    let coeffs = [ 0; 1; 2 ] in
    ...

The list coeffs is a constant for any call to gamma. The compiler could thus allocate the list once and for all (``statically'').
However, the intuitive semantics of Caml leads to allocating a new list at each call to gamma (dynamically): the definition of coeffs is simply re-executed each time evaluation enters the body of gamma. To ensure that the definition of coeffs is executed only once, at the definition of gamma, you should write:

let gamma =
    let coeffs = [ 0; 1; 2 ] in
    (function x -> ...

However, if the constant data written in the program is not mutable, there is no harm in sharing it and defining it statically. Most Caml compilers will optimize this case. Anyway, the simplest way to ensure the sharing is to define the constant outside the function that uses it:

let coeffs = [ 0; 1; 2 ];;
let gamma x =
    ...

Why isn't my constant array statically allocated ?

This is a variant of the preceding problem, now in the case of mutable data structures.

let gamma x =
    let coeffs = [| 0; 1; 2 |] in
    ...

In this case, the vector coeffs cannot be statically allocated in general, since it cannot safely be shared (its different instances may be returned as result or used to build other data in the body of gamma).

However, if coeffs is not used as a mutable data structure but simply as a table with optimized access to its elements (in particular, the vector is never modified nor returned as a result), then it is possible to share it. As you may guess, this analysis is complex and is probably not implemented in your compiler. So you have to make the sharing explicit, by defining the constant outside of the function that uses it:

let coeffs = [| 0; 1; 2 |];;
let gamma x =
    ...

Obsessed by speed

> 1. If I define Array.iter + a function that uses it,
>    will ocamlopt combine these two functions into one?
>    (I looked in the assembler code, but it seemed as if
>     ocamlopt didn't combine them)

You're right, the function inlining pass in ocamlopt is rather
conservative and doesn't inline and beta-reduce function arguments to
higher-order functions.  Inlining is a delicate trade-off between too
little and too much (e.g. code size and compile time explosion), and
OCaml errs on the conservative side.

> 2. Do we need special Pentium-4 adaptions to utilize
>    its very good float performance?

I'm not familiar with the P4 micro-architecture, so I don't know.
ocamlopt uses the standard IA32 stack-based model for floating-point
code.  Apparently, the P4 can now do 64-bit float arithmetic between
true registers (the SSE2 model), and ocamlopt could generate better
(and simpler!) floating-point code for this model.  Don't know how
much of a performance difference that would make, though.  

At any rate, the ocamlopt port for AMD's x86-64 architecture will use
the SSE2 model.

> 3. Would ocamlopt benefit from a peephole optimizer of
>    the assembler code? Or is the assembler code already
>    optimal?

No assembly code is ever optimal, especially if machine-generated :-)
A peephole optimizer could remove some cosmetic inefficiencies, but I
doubt this would make a significant speed difference.  Today's
processors have enough instruction-level parallelism and dynamic
instruction scheduling that a few redundant integer operations here
and there don't really hurt.  

Other higher-level optimizations not currently performed could improve
performance more, e.g. common subexpression elimination on memory loads.

> 4. What is unboxed and what isn't?
>    I have noticed that there is a
>    continuous work to unbox more.

Very briefly:

Always unboxed:
  int, char, bool, constant constructors of datatypes.
Locally unboxed (in expressions and let...in): 
  float, int32, nativeint (and int64 on 64-bit processors)
Unboxed inside arrays:
  float
Unboxed inside big arrays:
  all numerical types
Always boxed:
  everything else (records, non-constant datatype constructors,
  tuples, arrays, etc)

> 5. How do you make sense of gprof-output? Any suggestions?

The "info" docs for gprof contain a tutorial.
The function names that appear are of the form Modulename_functionname_NNN
where NNN is a unique integer.
Be careful with the call graph reported by gprof: it is totally
inaccurate for higher-order functions.

How to inline assembly code in programs ?

You can interface your Caml program with any C program of your own. In this C program, you can insert assembly code if your C compiler supports such pragmas.
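A minimal sketch of the Caml side of such an interface (the primitive name is purely illustrative; the corresponding C stub must follow the conventions of the C-interface chapter of the manual, and be linked in, e.g. with -custom):

(* Declares a C (possibly assembly-backed) primitive to Caml code. *)
external fast_popcount : int -> int = "caml_fast_popcount";;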

On the other hand, there is no direct means to insert assembly code within Caml source code; the language is far too high-level to allow easy support of this feature at the programmer level (the memory manager's invariants are numerous and complex, and they must absolutely be enforced by any piece of code inserted into Caml code); last but not least, we want to keep the perfect portability of the language, and it seems clear that assembly code insertions would jeopardize it.


Last modified: Friday, March 26, 2004
Copyright © 1995 - 2004, INRIA all rights reserved.