Frequently Asked Questions about Caml

Contact the author Pierre.Weis@inria.fr

Created in October 1995.

Table of contents:

* General Questions

* Syntax

* Semantics

General Questions

What is Caml ?

Caml is a programming language. It is a functional language, since the basic units of programs are functions. It is a strongly typed language: every object you use belongs to a named set, called its type. In Caml, types are handled by the compiler: the user does not have to write them, since they are inferred automatically (types are synthesized).

The language is available on almost every Unix platform, on PCs (under Windows) and on the Macintosh.
A brief tour of the main features of Caml.
More details on how to obtain Caml.

What is the meaning of the name ``Caml'' ?

``Caml'' is an acronym: it stands for ``Categorical Abstract Machine Language''. The ``Categorical Abstract Machine'' is an abstract machine to define and execute functions. It arose from theoretical considerations on the relationship between category theory and lambda-calculus. The first Caml compiler produced code for this abstract machine (in 1984).

In addition, Caml descends from the ML programming language, designed by Robin Milner in 1978, and used as the programming language to write the ``proof tactics'' in the LCF proof system.

Do you write ``Caml'' or ``CAML'' ?

Hey, pretty simple to answer the question: just look at the contents of this file, Caml is written once or twice a line! On the other hand CAML is written only in the question ``Do you write Caml or CAML ?'' So, guess what ? We write Caml! Here is the detailed explanation:

According to usual rules for abbreviations (more exactly acronyms, since Caml stands for Categorical Abstract Machine Language), we should write CAML, as we write USA or INRIA. On the other hand, this upper case name seems to yell all over the place, and writing Caml is far more pretty and elegant. Now, the new name Objective Caml confirms this choice for simplicity and elegance, since, as far as I know, nobody writes Objective CAML!

So write Caml to be smart and think CAML to advertise this powerful programming tool!

Is Caml a compiled or interpreted language ?

Caml is compiled. However, each Caml system (we call ``system'' the package ``compiler + its associated libraries'') offers a top-level interactive loop that is similar to an interpreter. In the interactive system, the user types in pieces of programs (we call these pieces Caml ``phrases'') that the system handles at once: it compiles them, executes them, and prints their results.

For interactivity, is it analogous to perl?

That is to say, do I have to type the phrase I want to test as an extra argument on the command line ?

No. You launch the interactive system, then interact with it. For instance, if you use the Caml Light system under Unix, you may type

$ camllight

to launch the interactive system. Then you type the Caml phrases that you want to test. For instance:

$ camllight
>       Caml Light version 0.74

#1 + 2;;
- : int = 3
#

How to stop the Caml system ?

It is often possible to interrupt a program or the Caml system by typing some combination of keys that is operating system dependent: under Unix send an interrupt signal (generally Control-C), under Macintosh OS type Command-., under Windows use the Caml menu.

How to quit the interactive system ?

Type:

quit();;

or send an end-of-file (CTRL-D for Unix, CTRL-Z for DOS, ...)

To end a Caml phrase, is it mandatory to type 2 semi-colons ?

Yes indeed, since ``;;'' is the end-of-phrase mark. In addition, you also need a carriage return (or enter) key press.

How to compile a program ?

Write the program in a file (whose name must end with the extension ".ml"), then call the Caml batch compiler to compile this file. In this case, results are no longer printed automatically by Caml as they are in the interactive system: you need to print your results explicitly.

For instance: if the file toto.ml contains

let x = 2;;

print_int (x * x);;

exit 0;;

We compile it (under Unix) by the command

$ camlc toto.ml

that creates an executable compiled program. We can launch this program using the operating system (by default the generated executable program is named a.out under Unix):

$ a.out
4$

The phrases contained in the file are executed in the order in which they appear in the file.

What are the differences between Caml V3.1, Caml Light, and Objective Caml ?

These are different Caml systems: different compilers and different libraries. Moreover all these systems have their own extensions to the Caml core language.

These systems share many features since they all implement the core of the Caml language. So the basic syntax is the same. On the other hand, the module systems are all different: the simplest is Caml Light's, the most sophisticated (and the most powerful) is Objective Caml's. The compilers are often based on different technologies (native code or byte-code), and some systems offer several compilers. Some systems run on more hardware platforms than others.

For the time being, the Caml Light system is the most widely used. It runs on virtually every Unix platform, and on personal computers (PC and Macintosh).

The Objective Caml system is the most advanced. It offers two tightly coupled compilers: a native-code compiler (ocamlopt, which produces fast code but is a bit slow to compile) and a byte-code compiler (ocamlc, which compiles very fast but produces slower code).

Is it possible to get error message in my own language ?

You can choose the language that Caml Light uses to write its messages.

To set the language, tell it to Caml Light (language names are the ones used on the Internet).

Languages now available are:

English is the default language for messages that cannot be translated. If your language is not yet available, and if you want to translate Caml Light messages (about 50 messages), you're welcome to contact the Caml team (mail to caml-light@inria.fr).

Where can I find the documentation ?

Books: 3 introductory books in French.

Where can I report an error from the Caml system ?

First, please verify that this is really an error from the Caml system, not a documented feature that you are not aware of. In case of a real bug, report it to our team (caml-bugs@inria.fr). Don't forget to tell us the kind of machine and the Caml version you are using. If possible, try to isolate the bug by writing a small example that exhibits it. Thank you in advance.

What is the difference between Caml and Caml Light ?

The Caml V3.1 system is the ancestor of Caml Light. Caml V3.1 has many interesting (and rather complex) features that are hard to implement. That is why the Caml V3.1 system works only on some Unix machines. In contrast, the Caml Light system is far simpler and more portable, and runs on almost every Unix, Macintosh or PC platform. Moreover, many useful features of the Caml V3.1 system have now been ported to Caml Light, so that the difference between the two tends to vanish. You may find details here.

How to do graphics in Caml ?

Caml Light and Objective Caml provide a library for graphic commands, that is machine independent.

On Macintosh or PC this library is linked with the application. Under Unix, you must compile and install the camlgraph library which is in the contrib directory. You then launch an interactive Caml system with graphics by the command

$ camllight camlgraph

Graphic commands are then available if you open the graphic module (#open "graphics";;). The graphic window appears when calling open_graph: its argument is a string that describes the geometry of the window (the empty string corresponds to a good default geometry).

So, a program that uses graphics generally starts with the two lines:

#open "graphics";;
open_graph "";;

The size of the screen is implementation dependent; the origin of coordinates is at the bottom left of the screen. Abscissas increase from left to right, and ordinates from bottom to top. There is a current point, and a pen with a size and a color. The pen is moved with or without drawing using:

Other operations:

You can print on the graphic window:

You may also use events to interact with the user. The graphic library also provides primitives to manipulate images. I don't describe them here.

How to compute with big numbers in Caml ?

Caml Light provides a library that handles exact arithmetic computation for rational numbers.
This library lies in the contrib directory, and you must compile and install it before use. Then you may open it in your programs or launch an interactive system including the package by the command (under Unix)

$ camllight camlnum

Operations on big numbers get the suffix /: addition is thus +/. You build big numbers by conversion from (small) integers or character strings.
I first define a printer for the type num, then I compute 1/3 + 2/3:

#open "num";;
#open "format";;
#let print_num n = print_string (string_of_num n);;
print_num : num -> unit = <fun>
#install_printer "print_num";;
- : unit = ()
#num_of_string "2/3";;
- : num = 2/3
#let n = num_of_string "1/3" +/ num_of_string "2/3";;
n : num = 1

Now, I define the factorial function:

#let rec fact n =
 if n <= 0 then (num_of_int 1) else num_of_int n */ (fact (n - 1));;
fact : int -> num = <fun>
#fact 100;;
- : num =
 93326215443944152681699238856266700490715968264381621468592963895217599993229915608941463976156518286253697920827223758251185210916864000000000000000000000000

How to measure elapsed time in Caml ?

The easiest way is to call the sys__time : unit -> float function: it returns the elapsed time since the beginning of the execution.
Alternatively, under Unix, you may use the Unix module that provides an interface to the operating system (unix__time for Caml Light, or Unix.time for Objective Caml). It gives more precise timing information, but is more complex to use for beginners.
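
As a rough illustration of the first approach, here is a minimal sketch written in present-day OCaml rather than Caml Light (an assumption: Sys.time stands in for sys__time, and it reports processor time). The names time_it and fib are hypothetical helpers, not part of any library.

```ocaml
(* Measure the processor time used by one call of [f x]. *)
let time_it f x =
  let start = Sys.time () in
  let result = f x in
  let elapsed = Sys.time () -. start in
  (result, elapsed)

(* A deliberately slow function to have something to measure. *)
let rec fib n = if n < 2 then 1 else fib (n - 1) + fib (n - 2)

let () =
  let (value, seconds) = time_it fib 25 in
  Printf.printf "fib 25 = %d (%.3f s of processor time)\n" value seconds
```

The measured time depends on the machine, so no particular figure should be expected.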

How to install Caml on Mac OS X ?

You just need to follow the steps given in the page http://caml.inria.fr/pub/old_caml_site/caml-macosx-howto/index.html.

Syntax

What is the syntax of the language ?

It is difficult to give any answer other than: read the reference manual! However, a good piece of advice is to write your first programs by modifying existing working programs. A first source of well-written programs is the examples given with the language (see the examples/* directories in the distribution).
See also A taste of Caml to get some simple program examples.

To give you a first idea of the Caml syntax, let me comment on the smallest example distributed with the Caml Light system, the program examples/basics/fib.ml (I added line numbers to facilitate citations):

1: (* The Fibonacci function, once more. *)

2: let rec fib n =
3:   if n < 2 then 1 else fib(n - 1) + fib(n - 2)
4: ;;

5: if sys__interactive then () else
6: if vect_length sys__command_line <> 2 then begin
7:   print_string "Usage: fib <number>";
8:   print_newline()
9: end else begin
10:   try
11:     print_int(fib(int_of_string sys__command_line.(1)));
12:     print_newline()
13:   with Failure "int_of_string" ->
14:     print_string "Bad integer constant";
15:     print_newline()
16: end
17: ;;

try ... with

The try of line 10, with its associated with clause (lines 13 to 15), is the Caml construction to deal with errors: the construction ``try expression with matching'' means execute ``expression'' while catching errors that may occur (we handle the different error cases in the ``pattern matching'' part of the ``try ... with'').

In the preceding example, if the user called the program with an argument that cannot be converted to an integer (for instance if the user launched $ fib Hello), then the function int_of_string, called in line 11, fails, raising the exception (Failure "int_of_string"). This error is then caught in the ``with'' part of the ``try'', which prints an appropriate error message (lines 14 and 15).
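
The same idea can be sketched in present-day OCaml (an assumption: this is not Caml Light syntax, and int_or_default is a hypothetical helper). The pattern Failure _ is used instead of matching on the exact message string.

```ocaml
(* Convert a string to an integer, falling back to [default]
   when int_of_string raises Failure. *)
let int_or_default default s =
  try int_of_string s with
  | Failure _ -> default

let () =
  print_int (int_or_default 0 "42");
  print_newline ();
  print_int (int_or_default 0 "Hello");
  print_newline ()
```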

begin ... end

The keywords begin and end are equivalent to parentheses; they are mainly used as delimiters for sequences of instructions (that is, expressions separated by ``;'' as in lines 7 and 8) or to enclose ``match ... with'' and ``try ... with'' constructs, as in lines 9 and 16.

raise exception

The raise primitive raises exceptions (signals errors) when it is impossible to continue the computation in a reasonable way. These exceptions have to be caught by a surrounding ``try ... with'' construct to handle the error (either by aborting the whole program after having printed a report, or by continuing the computation in another way). If the error is never caught, then the failure propagates until the evaluation stops. For instance:

#print_string "Hello";
 raise (Failure "division by zero");
 print_string " world ";;
HelloUncaught exception: Failure "division by zero"
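
To show the other side, an exception that is caught so that the computation continues in another way, here is a small sketch in present-day OCaml (an assumption; the exception Empty_list and both functions are hypothetical names chosen for the illustration).

```ocaml
exception Empty_list

(* Raise an exception when no sensible result exists. *)
let first l =
  match l with
  | [] -> raise Empty_list
  | x :: _ -> x

(* Catch the exception and continue with a fallback value. *)
let first_or_zero l =
  try first l with
  | Empty_list -> 0

let () =
  print_int (first_or_zero [3; 4]);
  print_newline ();
  print_int (first_or_zero []);
  print_newline ()
```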

How to define a function ?

In Caml, the syntax to define functions is close to the mathematical usage: the definition is introduced by the keyword let, followed by the name of the function and its arguments; then the formula that computes the image of the argument is written after an = sign.

#let successor (n) = n + 1;;
successor : int -> int = <fun>

Variations and other kinds of functions:

How to define a procedure ?

Recall that procedures are commands that produce an effect (for instance printing something on the terminal or writing some memory location), but have no mathematically meaningful result.

In Caml, there is no special treatment of procedures: they are just considered as special cases of functions that return the special ``meaningless'' value (). For instance, the print_string primitive that prints a character string on the terminal, just returns () as a way of indicating that its job has been properly completed.

Procedures that do not need any meaningful argument, get () as dummy argument. For instance, the print_newline procedure, that outputs a newline on the terminal, gets no meaningful argument: it has type unit -> unit.
Procedures with argument are defined exactly as ordinary functions. For instance:

#let message s = print_string s; print_newline();;
message : string -> unit = <fun>
#message "Hello world!";;
Hello world!
- : unit = ()

Note that it is impossible to define a procedure without any argument at all: defining it would execute it on the spot, and there would be no way to call it afterwards. In the following fragment double_newline is bound to (), and evaluating it later never produces carriage returns, contrary to what the user might expect.

#let double_newline = print_newline(); print_newline();;


double_newline : unit = ()
#double_newline;;
- : unit = ()

The correct definition and usage of this procedure is:

#let double_newline () = print_newline(); print_newline();;
double_newline : unit -> unit = <fun>
#double_newline;;
- : unit -> unit = <fun>
#double_newline();;


- : unit = ()

How to define a function with more than one argument ?

Just write the list of successive arguments when defining the function. For instance:

#let sum x y = x + y;;
sum : int -> int -> int = <fun>

then give the actual arguments in the same order when applying the function:

#sum 1 2;;
- : int = 3

These functions are named ``curried'' functions, as opposed to functions with tuples as argument.
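
One practical consequence of currying is partial application. A minimal sketch in present-day OCaml (an assumption; add_ten is a hypothetical name):

```ocaml
let sum x y = x + y

(* Partial application: giving only the first argument yields a function
   that still waits for the second one. *)
let add_ten = sum 10

let () =
  print_int (add_ten 5);
  print_newline ()
```

A partially applied function can be passed to another function directly, for instance List.map (sum 1) [1; 2; 3].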

How to manipulate pairs or tuples ?

In Caml, tuples get the same syntax as in mathematics: a list of comma-separated expressions, enclosed in parentheses.

#(1, 2);;
- : int * int = 1, 2

How to access elements of pairs and tuples ?

Generally, the components of a tuple are accessed via pattern matching, by defining several identifiers at once.

#let (x, y) = let z = 1 in ((z + 1), (z + 2));;
x : int = 2
y : int = 3
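
Pattern matching on tuples also works in function parameters. The sketch below uses present-day OCaml (an assumption); swap and third are hypothetical helpers, while fst and snd are predefined and apply to pairs only.

```ocaml
(* Pattern matching in the parameter position. *)
let swap (x, y) = (y, x)

(* The wildcard _ ignores the components we do not need. *)
let third (_, _, z) = z

let () =
  let (a, b) = swap (1, "one") in
  print_string a;
  print_int b;
  print_newline ();
  print_int (fst (1, 2) + snd (1, 2) + third (0, 0, 3));
  print_newline ()
```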

How to define a function that uses tuples ?

Functions may have arguments that are tuples, and functions may return results that are tuples.

#let add (x, y) = x + y;;
add : int * int -> int = <fun>
#add (1, 2);;
- : int = 3
#let div_mod x y = (x quo y, x mod y);;
div_mod : int -> int -> int * int = <fun>
#div_mod 15 7;;
- : int * int = 2, 1

(Note: to encode functions with several arguments, the habit is to use curried functions, instead of functions with tuples as arguments: use let f x y = ..., and not let f (x, y) = ....)

How to define an anonymous function ?

You may use functions that have no names: we call them functional values or anonymous functions. A functional value is introduced by the keyword function, followed by its argument, then an arrow -> and the function body. For instance

#function x -> x + 1;;
- : int -> int = <fun>
#(function x -> x + 1) 2;;
- : int = 3
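
Anonymous functions are most useful as arguments to other functions. A short sketch in present-day OCaml (an assumption; List.map and List.filter come from the OCaml standard library, not from Caml Light):

```ocaml
(* Add one to every element of a list. *)
let incremented = List.map (fun x -> x + 1) [1; 2; 3]

(* Keep only the even elements. *)
let evens = List.filter (function x -> x mod 2 = 0) [1; 2; 3; 4]

let () =
  List.iter (fun x -> print_int x; print_string " ") incremented;
  print_newline ()
```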

How to apply a function ?

Functions are applied as in mathematics: write the function's name, followed by its argument enclosed in parens: f (x). In practice, parens are mandatory only in the case of a complex argument. They are omitted in case of constants or identifiers: we write fib 2 instead of fib (2), and fact x instead of fact (x).

Difficulty: parens cannot be omitted in case of complex arguments.

How to apply a function to a negative number ?

When the argument of a function is more complex than just an identifier, you must enclose this argument between parentheses.
In particular you need parens when the argument is a negative constant number.

To apply f to -1 you must write f (-1)

and not f -1 that is syntactically similar to x - 1 (hence it is a subtraction, not an application).

How to apply a function from within an operation ?

Parens are not necessary when a function is called from within a binary operation, since function application binds tighter than operators: fact x + 1 means (fact x) + 1. You may still add them for readability, since (fact x) + 1 is of course syntactically correct as well.

My program is looping, I don't know why ?

If the argument of a function is an operation, you must add parens:

to apply f to x + 1, you must write f (x + 1)

and not f x + 1 that stands for an addition, and means (f x) + 1. In many cases, when you forget parens, the type-checker finds an error in the program. Unfortunately, this is not always the case:

let rec fact x =
 if x = 0 then 1 else x * fact x - 1;;

Here x * fact x - 1 means x * (fact x) - 1 and not x * fact (x - 1). As a consequence this program runs into an infinite loop.
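
For comparison, a version with the argument properly parenthesized terminates. The sketch uses present-day OCaml (an assumption), and like the original it expects a non-negative argument.

```ocaml
(* The recursive call receives (x - 1) thanks to the parentheses. *)
let rec fact x =
  if x = 0 then 1 else x * fact (x - 1)

let () =
  print_int (fact 5);
  print_newline ()
```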

What is the difference between fun and function ?

Functions are usually introduced by the keyword function. Each parameter is introduced by its own function construct. For instance, the construct

function x -> function y -> ...

defines a function with two parameters x and y. Functions that use pattern-matching are also introduced by the keyword function.

The keyword fun introduces curried functions (with several successive arguments). For instance

fun x y -> ...

introduces a function with two parameters x and y, equivalent to

function x -> function y -> ...

Difficulty: when fun introduces a pattern matching, patterns with constructors that have an argument (``functional'' constructors) must be enclosed by parens. This is because if C is a constructor with one argument, then

fun C x -> ...

syntactically means a function with two arguments ``C'' and ``x''. You need to use parens to resolve the ambiguity:

#type counter = Counter of int;;
Type counter defined.
#let f = fun Counter c -> c + 1;;
Toplevel input:
>let f = fun Counter c -> c + 1;;
>            ^^^^^^^
The constructor Counter requires an argument.
#let f = fun (Counter c) -> c + 1;;
f : counter -> int = <fun>

Note that with the keyword function, this problem vanishes.

#let f = function Counter c -> c + 1;;
f : counter -> int = <fun>
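
The division of labour between the two keywords can be sketched as follows in present-day OCaml (an assumption; add and is_zero are hypothetical names): fun takes several curried arguments with a single case, function takes one argument with several pattern-matching clauses.

```ocaml
type counter = Counter of int

(* fun: two curried arguments; the constructor pattern needs parens. *)
let add = fun (Counter c) n -> Counter (c + n)

(* function: a single argument, matched against several clauses. *)
let is_zero = function
  | Counter 0 -> true
  | Counter _ -> false

let () =
  print_string (if is_zero (add (Counter (-2)) 2) then "zero" else "non-zero");
  print_newline ()
```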

Semantics

How to define a recursive function ?

You need to state explicitly that you want to define a recursive function: use ``let rec'' instead of ``let''.
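
A minimal sketch in present-day OCaml (an assumption; the function names are illustrative only). Without rec, the name length would not be visible inside its own body.

```ocaml
(* Length of a list, by recursion on its structure. *)
let rec length l =
  match l with
  | [] -> 0
  | _ :: rest -> 1 + length rest

(* Mutually recursive functions are joined with "and". *)
let rec even n = if n = 0 then true else odd (n - 1)
and odd n = if n = 0 then false else even (n - 1)

let () =
  print_int (length [1; 2; 3]);
  print_newline ()
```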

My program does not select the right pattern matching case ?

There might be no bug in the Caml compiler! Most of the time, this is due to the fact that you have not enclosed between parentheses a pattern matching which is nested inside another pattern matching.

How to do nested pattern matching ?

You must enclose in parentheses any pattern matching that is written inside another pattern matching. Otherwise, the inner pattern matching ``catches'' all the pattern matching clauses that are written after it. For instance:

let f = function
 | 0 -> match ... with | a -> ... | b -> ...
 | 1 -> ...
 | 2 -> ...;;

is parsed as

let f = function
 | 0 -> 
     match ... with
     | a -> ...
     | b -> ...
     | 1 -> ...
     | 2 -> ...;;

This error may occur with every syntactic construct that involves pattern matching: ``function'', ``match ... with'' and ``try ... with''. The usual trick is to enclose inner pattern matchings with begin and end. One writes:

let f = function
 | 0 ->
     begin match ... with
     | a -> ...
     | b -> ...
     end
 | 1 -> ...
 | 2 -> ...;;
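
A complete instance of this scheme, as a sketch in present-day OCaml (an assumption; describe is a hypothetical function). Without the begin ... end, the clauses for 1 and _ would be attached to the inner match on flag.

```ocaml
let describe n flag =
  match n with
  | 0 ->
      begin match flag with
      | true -> "zero, flagged"
      | false -> "zero"
      end
  | 1 -> "one"
  | _ -> "many"

let () =
  print_string (describe 0 true);
  print_newline ();
  print_string (describe 1 false);
  print_newline ()
```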

My function is never applied ?

This is due to a missing argument: since Caml is a functional programming language, there is no error when you evaluate a function with missing arguments: in this case, a functional value is returned, but the function is evidently not applied. Example: if you evaluate print_newline without argument, there is no error, but nothing happens:

#print_newline;;
- : unit -> unit = <fun>
#print_newline ();;

- : unit = ()

My array is modified, I don't know why ?

This is due to a physical sharing between two arrays that you overlooked. In Caml there is no implicit array copying. If you give two names to the same array, every modification made through one name is visible through the other:

(* Definition of v *)
#let v = make_vect 3 0;;
v : int vect = [|0; 0; 0|]
(* Array w is physically the same as v *)
#let w = v;;
w : int vect = [|0; 0; 0|]
#w.(0) <- 4;;
- : unit = ()
(* v is modified by the modification of w *)
#v;;
- : int vect = [|4; 0; 0|]

The physical sharing effect also applies to elements stored in vectors: if these elements are themselves vectors, the sharing of these vectors implies that modifying one of these elements modifies the others.
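
When an independent array is wanted, it has to be copied explicitly. A sketch in present-day OCaml (an assumption: Array.make and Array.copy play the roles of Caml Light's make_vect and copy_vect):

```ocaml
let v = Array.make 3 0

(* w is a fresh array with the same contents, not another name for v. *)
let w = Array.copy v

let () =
  w.(0) <- 4;
  (* v is left untouched by the modification of w. *)
  Printf.printf "v.(0) = %d, w.(0) = %d\n" v.(0) w.(0)
```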

How to define multidimensional arrays ?

The only way is to define an array whose elements are arrays themselves. (Caml arrays are unidimensional: they model mathematical vectors.)

The naive way to define multidimensional arrays is bogus: the result is not right because there is some unexpected physical sharing between the lines of the new array:

#let matrix_2_3 = make_vect 2 (make_vect 3 0);;
matrix_2_3 : int vect vect = [|[|0; 0; 0|]; [|0; 0; 0|]|]
#matrix_2_3.(0).(0) <- 1;;
- : unit = ()
#matrix_2_3;;
- : int vect vect = [|[|1; 0; 0|]; [|1; 0; 0|]|]

In fact, the allocation of a new array has two phases: first, computation of the initial value, and then this value is written in each element of the new array. (That's why the line which is allocated by (make_vect 3 0) is unique and physically shared by all the lines of the array matrix_2_3.)

Solution: use the make_matrix primitive that builds the matrix with all elements equal to the initial value provided. Alternatively, write the program that allocates a new line for each line of your matrix. For instance:

let matrix_n_m n m init =
 let result = make_vect n (make_vect m init) in
 for i = 1 to n - 1 do
  result.(i) <- make_vect m init
 done;
 result;;
matrix_n_m : int -> int -> 'a -> 'a vect vect = <fun>

In the same vein, the copy_vect primitive gives strange results, when applied to matrices: you need to write a function that explicitly copies each line of the matrix at hand:

let copy_matrix m =
 let l = vect_length m in
 if l = 0 then m else
 (* Copy line 0 too, otherwise result.(0) would be shared with m.(0) *)
 let result = make_vect l (copy_vect m.(0)) in
 for i = 1 to l - 1 do
  let coli = copy_vect m.(i) in
  result.(i) <- coli
 done;
 result;;
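
For readers using Objective Caml, the same two operations can be sketched with the standard Array module (an assumption: present-day OCaml, where Array.make_matrix allocates a distinct line for each row; copy_matrix here is a hypothetical helper, not a library function).

```ocaml
(* Each line of m is a separate array. *)
let m = Array.make_matrix 2 3 0

(* Copy each line, so that the copy shares nothing with the original. *)
let copy_matrix a = Array.map Array.copy a

let () =
  m.(0).(0) <- 1;
  let c = copy_matrix m in
  c.(1).(2) <- 9;
  Printf.printf "%d %d %d\n" m.(1).(0) m.(1).(2) c.(1).(2)
```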

What is an abstract data type ?

An abstract data type is a type whose definition is hidden: its constructors or labels are not exported by the module that defines the type. This makes it possible to change the implementation of the type without modifying the client modules.

For instance, we start the implementation of stacks by a trivial implementation using lists:

type 'a stack == 'a list ref;;
let new_stack init = ref [];;
let push x s = s := x :: !s;;

Then we may use a more complex data structure with vectors and dynamic reallocation when the stack overflows:

type 'a stack = {mutable Content : 'a vect; mutable Pointer : int};;
let new_stack init =
 {Content = make_vect 100 init; Pointer = 0};;
let push x s =
 s.Content.(s.Pointer) <- x;
 s.Pointer <- s.Pointer + 1;
 if s.Pointer = vect_length (s.Content) then
   begin
    let old = s.Content in
    s.Content <- make_vect (2 * vect_length old)
                           old.(0);
    blit_vect old 0 s.Content 0 (vect_length old)
   end;;
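
In Objective Caml, the hiding itself is expressed with a module signature. The following is only a sketch under that assumption (present-day OCaml; the module name Lifo and its functions are hypothetical): clients see the abstract type 'a Lifo.t and cannot tell that it is implemented with a list reference.

```ocaml
module Lifo : sig
  type 'a t                      (* abstract: the representation is hidden *)
  val create : unit -> 'a t
  val push : 'a -> 'a t -> unit
  val pop : 'a t -> 'a option
end = struct
  type 'a t = 'a list ref
  let create () = ref []
  let push x s = s := x :: !s
  let pop s =
    match !s with
    | [] -> None
    | x :: rest -> s := rest; Some x
end

let () =
  let s = Lifo.create () in
  Lifo.push 1 s;
  Lifo.push 2 s;
  match Lifo.pop s with
  | Some x -> print_int x; print_newline ()
  | None -> print_string "empty"; print_newline ()
```

Replacing the body of the structure by a vector-based implementation would not require any change in client code, as long as the signature stays the same.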

How to print ?

Printing functions for each basic type are named print_``name of the type''. For instance print_int, print_float, print_string.

You may also use the printf primitive from the printf module. The printf function takes a string as first argument (the so-called ``format'' string). In this string the type of arguments to print is indicated by a % symbol followed by the symbolic type of the argument. So ``%s'' means a string argument, and ``%d'' an integer argument:

#printf__printf "The number %s is %d" "one" 1;;
The number one is 1- : unit = ()
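
A few more conversions, sketched in present-day OCaml (an assumption: the module is written Printf there, and Printf.sprintf builds a string instead of printing it). ``%.2f'' prints a floating point number with two decimals.

```ocaml
(* sprintf returns the formatted text as a string. *)
let line = Printf.sprintf "The number %s is %d" "one" 1

let () =
  Printf.printf "%s\n" line;
  Printf.printf "%s has %d characters and pi is about %.2f\n" "Caml" 4 3.14159
```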

In addition, you may try the pretty-printing with boxes and line break hints with the format module (basic documentation about format).

Why some output has disappeared ?

To accelerate input/output operations, characters printed by output functions are not output at once on the terminal (or on the file). Instead, these characters are stored into the memory in a buffer. This buffer is automatically flushed when it is full, or when an explicit flush command is evaluated by the program (by calling the function flush : out_channel -> unit), or at the end of evaluation if an explicit termination statement is evaluated (exit 0).

This behavior, common to many programming languages, can lead to the loss of the last printing commands of the program, since the characters are still pending in the output buffer, and thus are neither written into the output file nor shown on the terminal.

This phenomenon does not occur in interactive use, since the interactive system automatically flushes the output buffer at the end of each evaluation phase, before prompting the user for the next phrase. But if you generate executable programs using the Caml compiler, you have to deal with this problem explicitly.

To solve the problem it suffices to call the ``end of program'' function exit:

exit 0;;

or to explicitly flush the output buffer of the channel on which the program is writing. For instance, for the terminal, simply evaluate:

flush stdout;;

Why some printing material is mixed up and does not appear in the right order ?

If you use printing functions of the format module, you must not mix printing commands from format with printing commands from the basic I/O system. Indeed, the material printed by functions from the format module is delayed (stored into the pretty-printing queue) in order to find out the proper line breaking to perform with the material at hand. By contrast, low level output is performed with no more buffering than the usual I/O buffering. So you can observe the following difference between pure low level output:

# print_string "before";
  print_string "MIDDLE";
  print_string "after";;
beforeMIDDLEafter- : unit = ()

and mixed calls to low and high level printing:

# print_string "before";
  Format.print_string "MIDDLE";
  print_string "after";;
beforeafterMIDDLE- : unit = ()

To avoid this kind of problems you should not mix printing orders from format and basic printing commands; that's the reason why when using functions from the format module, it is considered good programming habit to open format globally in order to completely mask low level printing functions by the high level printing functions provided by format.

N.B: format printings are automatically flushed after each evaluation in the interactive system. Explicit flushes are also performed by calling the print_newline function, which emits a line break and empties the pretty printer queue.

How to perform input-output ?

Use input-output channels that connect Caml to the rest of the world. You must create a channel before use, so you have to ``open'' the channel, that is, connect the channel to a hardware device of the computer. When you no longer use the channel, you must close it.

Three channels are always opened:

Other operations:

To speed up execution, the input-output operations evaluated by a program are not all performed immediately on the corresponding device: characters to output are simply stored in memory, in a so-called output buffer dedicated to each output channel. When using executable programs (as generated with the Caml compiler and linker), it is mandatory to flush these buffers at the end of evaluation, using the flush function. Otherwise, the last characters written on the corresponding channel could be lost.
The explicit program termination function exit also flushes all pending output buffers. So, just add exit 0;; as the last expression to evaluate at the end of your program, to be sure that output operations are properly completed.
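
The whole cycle of opening a channel, writing, flushing, reading back and closing can be sketched as follows in present-day OCaml (an assumption; write_then_read is a hypothetical helper, the temporary file name prefix is arbitrary, and the text is expected to be non-empty and free of newlines).

```ocaml
(* Write [text] to a temporary file, flush, then read it back. *)
let write_then_read text =
  let path = Filename.temp_file "caml_faq" ".txt" in
  let oc = open_out path in
  output_string oc text;
  (* Until the flush, the characters may still sit in the channel buffer. *)
  flush oc;
  let ic = open_in path in
  let line = input_line ic in
  close_in ic;
  close_out oc;
  Sys.remove path;
  line

let () = print_endline (write_then_read "Hello world!")
```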

How to get random numbers ?

To obtain random numbers, just use predefined functions from the random module included in the standard library. To get an integer use the int function, and to get a floating point number use the float function.

Note that these functions are in fact procedures, since they return a different number at each invocation.
Consider:

#let make_couple () = [| random__int 1000; random__int 1000 |];;
make_couple : unit -> int vect = <fun>

Each call to make_couple returns a fresh vector with different contents. Compare with

#let make_couple = [| random__int 1000; random__int 1000 |];;
make_couple : int vect = [|281; 407|]

which defines a single constant vector of two random numbers, chosen once and for all.
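
The procedure version can be sketched in present-day OCaml as follows (an assumption: the module is written Random there, and Random.self_init seeds the generator so that successive runs of the program differ).

```ocaml
(* Seed the generator from the system. *)
let () = Random.self_init ()

(* Each call draws two new integers in the range 0 .. 999. *)
let make_couple () = [| Random.int 1000; Random.int 1000 |]

let () =
  let c = make_couple () in
  Printf.printf "%d %d\n" c.(0) c.(1)
```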


Last modified: Friday, March 26, 2004
Copyright © 1995 - 2004, INRIA all rights reserved.