Caml offers the following control structures:
if ... then ... else
match ... with ...
;
for
and
while
raise ...
and corresponding error handling
try ... with ...
.
If none of these control structures can fit your needs, you must define your own structures using functions (recursive or not).
Table of contents:
if ... then ... else
if ... then ...
if ... then ... else
For tests we classically use an
if condition then expression1 else expression2
construct,
that returns expression1
or expression2
,
according to the condition boolean value.
Boolean arithmetic comparisons are
=
, <
, >
,
<=
, >=
, <>
,
and boolean operators are
&&
, ||
, not
.
For instance:
let absolute_value x = if x >= 0 then x else - x;; absolute_value : int -> int = <fun>
Type checking:
the condition must have type bool
, and both branches
must have the same type (which is the type of the whole construct).
if ... then
A if
construct without the corresponding else
part
is automatically completed with an extra else ()
clause. This kind of construct is useful inside
sequences.
Type checking: as a regular alternative, hence
the condition must have type bool
and both branches
must have the same type (unit
).
Pattern matching or shape analysis is used to perform complex tests involving the shape of the argument applied to the construct.
Syntactically, a pattern matching is a list of
clauses. Each clause is a construct of the form
|
pattern -> expression
. The
pattern matching mechanism is an analysis of the shape of a given value
to compare it with each of the patterns or shapes of the list
of clauses. The analysis succeeds as soon as the value can be
considered as an instance of the pattern of the clause.
A value is an instance of a pattern if it is a special case of this
pattern (in other words, if the pattern is a generalized form of the
value or else the pattern is more general than the value).
A pattern is made of constants (basic constants or constructors),
variables, and applications of functional constructors to argument
patterns. Consider for instance the clauses | 0 -> 1
(integer constant pattern 0
) or | n -> 1
(the pattern is just a single variable n
). The only value
that matches the pattern 0
is obviously the integer
0
, when the n
pattern agrees with every
value whatever it could be. Finally, a pattern such as C
f
where C
is a constructor and f
another pattern, matches all the values built using the constructor
C
and an argument value v
that matches
pattern f
.
Patterns of clauses are tried in turn following the order of the text of the program, until one succeeds.
If none matches the value at hand, an run time error is raised (see for more information).
The value returned by a pattern matching is the one returned by the
corresponding expression
of the clause that succeeded.
The basic syntactic construct to perform pattern matching is
match expression with clauses
: it evaluates
expression
, then matches the resulting value with each
clause of the list of clauses given in the clauses
part
of the matching.
For instance, succ 0
being 1
, the
following pattern matching returns the expression corresponding to the
clause | 1 -> "unit"
:
#match succ 0 with | 0 -> "null" | 1 -> "unit" | _ -> "several";; - : string = "unit"
Note the last clause
| _ -> "several"
, that involves an underscore in the
pattern part: the _
pattern means ``anything else''
and matches any value. It is very often used as the last pattern to
catch all the remaining cases.
Pattern matching a value against a variable has another effect:
during the computation of the expression on the right of the
corresponding clause, the variable evaluates to the value (
pattern matching performs bindings of variables to values).
For instance, succ 1
being 2
, the following
pattern matching returns the expression of the clause | n ->
string_of_int n
, thus string_of_int n
with n
evaluating to 2
(we say that
n
is bound to 2
):
#match succ 1 with | 0 -> "null" | 1 -> "unit" | n -> string_of_int n;; - : string = "2"
Note that the if ... then ... else ...
construct
is just a special case of pattern matching on booleans:
if cond then e1 else e2
corresponds to
match cond with | true -> e1 | _ -> e2
Type checking:
all clauses of a pattern matching must have the same type:
patterns must have the same type and the expressions must have a
common type as well. In the case of the match
e with matching
construct, the type that patterns have in
common must be the type of e
, and the type of all the
expressions of the clauses of the pattern matching is the type of the
whole construct.
Whenever possible, pattern matchings must be exhaustive (every possible case gets a corresponding clause) and must not contain useless clauses (presumably because there are two distinct clauses for the same case). The Caml compiler will probably emit a warning when faced with such a bad pattern matching. Whenever the pattern matching mechanism fails at runtime because the value is not an instance of one of the patterns, a failure occurs indicating where in the source code the pattern matching failed.
For syntactic reasons, a pattern matching that occurs inside another
one must be parenthesized (using parentheses or
begin
and end
keywords).
The reason is clearly evident, if you consider
the pattern matching of the expression e2
, inside the
pattern matching of e1
:
match e1 with | f1 -> match e2 with | f2 -> ... | f3 -> ... | f4 -> e2
As it is written here, one understands that f2
and f3
belongs to the pattern matching of e2
, and this is correct.
But, a human being will probably believe
that the indentation is significant, and will consider that the
pattern f4
belongs
to the surrounding pattern matching, the one of e1
.
Still f4
simply follows f3
, which we agree
belongs to the pattern matching of e2
. That's why,
although the indentation is here error prone, f4
syntactically belongs to the internal pattern matching
(the one of e2
). Presumably, the indented meaning of the
program was:
match e1 with | f1 -> begin match e2 with | f2 -> ... | f3 -> ... end | f4 -> e2
To avoid such errors, you must always use parens when writing pattern matchings inside other pattern matchings.
If you have some condition to test during pattern matching, you may
use a guard, that is a boolean condition that will monitor the
selection of the pattern matching clause.
A guard is introduced by when
, followed by the expression
to test. If this expression evaluates to true
the clause
is selected (when its pattern part matches), otherwise the next
pattern is checked. For instance:
(* power i j computes i ^ j, provided j is positive. *) let rec power i j = match j with | 0 when i != 0 -> 1 | 1 -> i | j when j mod 2 = 1 -> let p = power i (j / 2) in i * p * p | j when i = 0 -> invalid_arg "power" | _ -> let p = power i (j / 2) in p * p;; power : int -> int = <fun>
Another typical example is to test the equality of two variables of the pattern. The following predicate tests if a list has two successive elements that are identical:
let rec two_successive_equal = function | [] -> false | [_] -> false | x :: y :: l when x = y -> true | x :: l -> two_successive_equal l;; two_equal : 'a list -> bool = <fun>
Further reading about guards.
The sequence e1; e2
means successive evaluation of the
two expressions e1
then e2
. The construct
returns the value of e2
. This construct is used to
perform effects, for instance printing effects:
#print_string "Hello "; print_string "World!";; Hello World!- : unit = ()
Type checking: a sequence has the type of its last expression.
In Caml, there is no need to parenthesize a sequence appearing in a
pattern matching clause:
... -> e1; e2
is equivalent to ... -> begin
e1; e2 end
.
if ... then ... else ...
A sequence appearing in a branch of an alternative must be
parenthesized. The common habit is to use begin
and
end
. One writes
if cond then begin e1; e2 end else begin e3; e4 end
On the other hand, an alternative inside a sequence does not need any parens. So
if cond1 then e1; if cond2 then e2 else begin e3; e4 end;;
means:
if cond1 then e1 else (); if cond2 then e2 else begin e3; e4 end;;
If you forgot to parenthesized a sequence corresponding to one of the branch of an alternative, this sequence belongs to the surrounding expression. For instance, in the expression:
if cond1 then e2 else e3; e4;;
the expression e4
is always evaluated, whatever the value
of the cond1
condition could be. That's because the whole
expression is equivalent to:
(if cond1 then e2 else e3); e4;;
There are two kinds of loops, for
and
while
loops. They are used to perform effects and always
evaluate to the value ()
or ``nothing''.
One writes while condition do body done
, to mean
repetitive evaluation of body
, while
condition
evaluates to true. In particular, if
condition
always evaluates to false body
is
never evaluated.
Type checking:
the condition must have type bool
, and a loop always have
type unit
.
One writes for ident = initial to final do body
done
, to mean repetitive evaluation of
body
, with ident
successively bound to
initial
, initial + 1
, ...,
including final
. In particular, if
initial
is strictly greater than final
,
body
is never evaluated.
Note that ident
is defined by the
for
loop, there is no need to declare or define it in advance.
Notice also that you cannot modify the value of ident
when the loop is running (since
ident
is not bound to a mutable
reference).
#for i = 0 to 10 do print_int i; print_char ` ` done;; 0 1 2 3 4 5 6 7 8 9 10 - : unit = ()
Type checking:
loop index has type int
, as well as
expressions initial
and final
.
The loop gets type unit
.
To exit from a while
or for
loop, you
must use the exception mechanism. Escape from the loop by raising the
exception devoted to this case, the Exit
predefined
exception (see below).
The raise
primitive is used to thraw exceptions (signal
errors) when computation cannot go on. These
exceptions ought to be caught by a surrounding ``try ... with''
construct to treat the error (either by aborting the program after
some error message, or going on another way). If there is no handler
the failure propagates until the entire evaluation process finally
aborts. For instance:
#print_string "Hello "; #raise (Failure "division by zero"); #print_string " world!\n";; HelloUncaught exception: Failure "division by zero"
Type checking:
if exception
has type
exn
, then raise exception
has type
'a
.
The try expression with matching
construct
is used to handle errors: expression
is evaluated while
catching errors that may happen during this evaluation.
Errors that may appear during the evaluation are carried out via
exceptions that tell the kind of errors at hand. These errors
may be as numerous as necessary, so that we list them in the ``pattern
matching'' part of the try ... with
: if an exception of
that kind occurs during the evaluation of expression
, it
is matched against the clauses of this pattern matching of the
try ... with
. If the match succeeds, the exception is
stopped (that is, ``the error is recovered''): the expression
corresponding to the matching clause is then returned, and the
evaluation goes on (that is, the program does not abort).
If no clause matches the exception, this exception is propagated (that is, the exception is thrown again, and the program continue abortion).
Type checking:
the type checking rule for try expression with
matching
is analogous to the type checking rule for match
expression with matching
:
the patterns must have the type of exceptional values,
i.e. exn
, and all the expressions of the right hand side
of clauses must have the same type as the expression that the handler
protects, expression
. This is the type of the whole construct.
Using exception handling, you can easily escape from any kind of loops
(for
or while
, or even loops encoded into a
recursive function): this is equivalent to the C break
instruction, or the Ada exit
instruction.
Premature end of loop is obtained by raising the predefined exception
Exit
from within the body of the loop, while handling
this exception from outside the loop.
For instance, we define a predicate that test if a vector contains a zero: if we encounter a 0, we escape from the loop and return true; otherwise the loop goes on until its normal end, and we return false.
let has_zero v = try for i = 0 to vect_length v - 1 do if v.(i) = 0 then raise Exit; done; false with Exit -> true;;
The user defines his own exceptions using
exception name;;
for a constant exception, or
exception name of typ;;
for an exception having an
argument. These exceptions are generative, that is to say:
two successive definitions of an exception with the same name lead to
the definition of two distinct exceptions that will not be confused by
Caml programs (handlers for the first exception will not handle
the second exception).
Type checking:
a constant exception has type
exn
, an exception defined by exception name of typ;;
has type typ -> exn
.
Some exceptions are predefined in Caml to signal some frequent errors or problems:
Failure of string
: report an error that occurred
in some user's code. The reason of the failure or the name of the
function where the error occurred is mentioned in the string argument.
Not_found
: signal that a searching function has
failed (e.g. assoc
).
Exit
: used to jump out of loops or functions.
Break
: interruption (the user has send an explicit
interrupt to the program).
Invalid_argument of string
: the argument provided to
the function cannot be handled properly.
Sys_error of string
: the operating system host
has raised an error (this exception is defined in the sys
library module).
Match_failure of string * int * int
: signal a
pattern matching failure in the file mentioned, at the given location
(in characters).
End_of_file
: an end of file occurred while reading
some data (this exception is defined in the io
library module).
Contact the author Pierre.Weis@inria.fr