Previous Up Next

Chapter 7  Language extensions

This chapter describes language extensions and convenience features that are implemented in OCaml, but not described in the OCaml reference manual.

7.1  Integer literals for types int32, int64 and nativeint

(Introduced in Objective Caml 3.07)

int32-literal::= integer-literal l  
int64-literal::= integer-literal L  
nativeint-literal::= integer-literal n

An integer literal can be followed by one of the letters l, L or n to indicate that this integer has type int32, int64 or nativeint respectively, instead of the default type int for integer literals. The library modules Int32[Int32], Int64[Int64] and Nativeint[Nativeint] provide operations on these integer types.

7.2  Streams and stream parsers

(Removed in Objective Caml 3.03)

The syntax for streams and stream parsers is no longer part of the OCaml language, but available through a Camlp4 syntax extension. See the Camlp4 reference manual for more information. Support for basic operations on streams is still available through the Stream[Stream] module of the standard library. OCaml programs that use the stream parser syntax should be compiled with the -pp camlp4o option to ocamlc and ocamlopt. For interactive use, run ocaml and issue the #load "camlp4o.cma";; command.

7.3  Recursive definitions of values

(Introduced in Objective Caml 1.00)

As mentioned in section 6.7.1, the let rec binding construct, in addition to the definition of recursive functions, also supports a certain class of recursive definitions of non-functional values, such as

let rec name1 = 1 ::  name2 and  name2 = 2 ::  name1 in  expr

which binds name1 to the cyclic list 1::2::1::2::…, and name2 to the cyclic list 2::1::2::1::…Informally, the class of accepted definitions consists of those definitions where the defined names occur only inside function bodies or as argument to a data constructor.

More precisely, consider the expression:

let rec name1 =  expr1 andand  namen =  exprn in  expr

It will be accepted if each one of expr1 …  exprn is statically constructive with respect to name1 …  namen, is not immediately linked to any of name1 …  namen, and is not an array constructor whose arguments have abstract type.

An expression e is said to be statically constructive with respect to the variables name1 …  namen if at least one of the following conditions is true:

An expression e is said to be immediately linked to the variable name in the following cases:

7.4  Range patterns

(Introduced in Objective Caml 1.00)

In patterns, OCaml recognizes the form ' c ' .. ' d ' (two character literals separated by ..) as shorthand for the pattern

' c ' | ' c1 ' | ' c2 ' || ' cn ' | ' d '

where c1, c2, …, cn are the characters that occur between c and d in the ASCII character set. For instance, the pattern ’0’..’9’ matches all characters that are digits.

7.5  Assertion checking

(Introduced in Objective Caml 1.06)

OCaml supports the assert construct to check debugging assertions. The expression assert expr evaluates the expression expr and returns () if expr evaluates to true. Otherwise, the exception Assert_failure is raised with the source file name and the location of expr as arguments. Assertion checking can be turned off with the -noassert compiler option.

As a special case, assert false is reduced to raise (Assert_failure ...), which is polymorphic (and is not turned off by the -noassert option).

7.6  Lazy evaluation

(Introduced in Objective Caml 1.06 for expressions; in Objective Caml 3.11 for patterns)

The expression lazy expr returns a value v of type Lazy.t that encapsulates the computation of expr. The argument expr is not evaluated at this point in the program. Instead, its evaluation will be performed the first time Lazy.force is applied to the value v, returning the actual value of expr. Subsequent applications of Lazy.force to v do not evaluate expr again. Applications of Lazy.force may be implicit through pattern matching.

The pattern lazy pattern matches values v of type Lazy.t, provided pattern matches the result of forcing v with Lazy.force. A successful match of a pattern containing lazy sub-patterns forces the corresponding parts of the value being matched, even those that imply no test such as lazy value-name or lazy _. Matching a value with a pattern-matching where some patterns contain lazy sub-patterns may imply forcing parts of the value, even when the pattern selected in the end has no lazy sub-pattern.

For more information, see the description of module Lazy in the standard library (see Module Lazy).

7.7  Local modules

(Introduced in Objective Caml 2.00)

The expression let module module-name =  module-expr in  expr locally binds the module expression module-expr to the identifier module-name during the evaluation of the expression expr. It then returns the value of expr. For example:

        let remove_duplicates comparison_fun string_list =
          let module StringSet =
            Set.Make(struct type t = string
                            let compare = comparison_fun end) in
            (List.fold_right StringSet.add string_list StringSet.empty)

7.8  Recursive modules

(Introduced in Objective Caml 3.07)

definition::= ...  
  module rec module-name :  module-type =  module-expr   { and module-name:  module-type =  module-expr }  
specification::= ...  
  module rec module-name :  module-type  { and module-name:  module-type }

Recursive module definitions, introduced by the ’module rec’ …’and’ … construction, generalize regular module definitions module module-name =  module-expr and module specifications module module-name :  module-type by allowing the defining module-expr and the module-type to refer recursively to the module identifiers being defined. A typical example of a recursive module definition is:

    module rec A : sig
                     type t = Leaf of string | Node of ASet.t
                     val compare: t -> t -> int
                 = struct
                     type t = Leaf of string | Node of ASet.t
                     let compare t1 t2 =
                       match (t1, t2) with
                         (Leaf s1, Leaf s2) -> s1 s2
                       | (Leaf _, Node _) -> 1
                       | (Node _, Leaf _) -> -1
                       | (Node n1, Node n2) -> n1 n2
        and ASet : Set.S with type elt = A.t
                 = Set.Make(A)

It can be given the following specification:

    module rec A : sig
                     type t = Leaf of string | Node of ASet.t
                     val compare: t -> t -> int
        and ASet : Set.S with type elt = A.t

This is an experimental extension of OCaml: the class of recursive definitions accepted, as well as its dynamic semantics are not final and subject to change in future releases.

Currently, the compiler requires that all dependency cycles between the recursively-defined module identifiers go through at least one “safe” module. A module is “safe” if all value definitions that it contains have function types typexpr1 ->  typexpr2. Evaluation of a recursive module definition proceeds by building initial values for the safe modules involved, binding all (functional) values to fun _ -> raise Undefined_recursive_module. The defining module expressions are then evaluated, and the initial values for the safe modules are replaced by the values thus computed. If a function component of a safe module is applied during this computation (which corresponds to an ill-founded recursive definition), the Undefined_recursive_module exception is raised.

7.9  Private types

Private type declarations in module signatures, of the form type t = private ..., enable libraries to reveal some, but not all aspects of the implementation of a type to clients of the library. In this respect, they strike a middle ground between abstract type declarations, where no information is revealed on the type implementation, and data type definitions and type abbreviations, where all aspects of the type implementation are publicized. Private type declarations come in three flavors: for variant and record types (section 7.9.1), for type abbreviations (section 7.9.2), and for row types (section 7.9.3).

7.9.1  Private variant and record types

(Introduced in Objective Caml 3.07)

type-representation::= ...  
  = private constr-decl  { | constr-decl }  
  = private { field-decl  { ; field-decl } }

Values of a variant or record type declared private can be de-structured normally in pattern-matching or via the expr .  field notation for record accesses. However, values of these types cannot be constructed directly by constructor application or record construction. Moreover, assignment on a mutable field of a private record type is not allowed.

The typical use of private types is in the export signature of a module, to ensure that construction of values of the private type always go through the functions provided by the module, while still allowing pattern-matching outside the defining module. For example:

        module M : sig
                     type t = private A | B of int
                     val a : t
                     val b : int -> t
                 = struct
                     type t = A | B of int
                     let a = A
                     let b n = assert (n > 0); B n

Here, the private declaration ensures that in any value of type M.t, the argument to the B constructor is always a positive integer.

With respect to the variance of their parameters, private types are handled like abstract types. That is, if a private type has parameters, their variance is the one explicitly given by prefixing the parameter by a ‘+’ or a ‘-’, it is invariant otherwise.

7.9.2  Private type abbreviations

(Introduced in Objective Caml 3.11)

type-equation::= ...  
  = private typexpr

Unlike a regular type abbreviation, a private type abbreviation declares a type that is distinct from its implementation type typexpr. However, coercions from the type to typexpr are permitted. Moreover, the compiler “knows” the implementation type and can take advantage of this knowledge to perform type-directed optimizations. For ambiguity reasons, typexpr cannot be an object or polymorphic variant type, but a similar behaviour can be obtained through private row types.

The following example uses a private type abbreviation to define a module of nonnegative integers:

        module N : sig
                     type t = private int
                     val of_int: int -> t
                     val to_int: t -> int
                 = struct
                     type t = int
                     let of_int n = assert (n >= 0); n
                     let to_int n = n

The type N.t is incompatible with int, ensuring that nonnegative integers and regular integers are not confused. However, if x has type N.t, the coercion (x :> int) is legal and returns the underlying integer, just like N.to_int x. Deep coercions are also supported: if l has type N.t list, the coercion (l :> int list) returns the list of underlying integers, like N.to_int l but without copying the list l.

Note that the coercion (expr :> typexpr) is actually an abbreviated form, and will only work in presence of private abbreviations if both the type of expr and typexpr contain no type variables. If this is not the case, you must use the full form (expr : typ_e :> typexpr) where typ_e is the expected type of expr. Concretely, this would be (x : N.t :> int) and (l : N.t list :> int list) for the above examples.

7.9.3  Private row types

(Introduced in Objective Caml 3.09)

type-equation::= ...  
  = private typexpr

Private row types are type abbreviations where part of the structure of the type is left abstract. Concretely typexpr in the above should denote either an object type or a polymorphic variant type, with some possibility of refinement left. If the private declaration is used in an interface, the corresponding implementation may either provide a ground instance, or a refined private type.

   module M : sig type c = private < x : int; .. > val o : c end =
       class c = object method x = 3 method y = 2 end
       let o = new c

This declaration does more than hiding the y method, it also makes the type c incompatible with any other closed object type, meaning that only o will be of type c. In that respect it behaves similarly to private record types. But private row types are more flexible with respect to incremental refinement. This feature can be used in combination with functors.

   module F(X : sig type c = private < x : int; .. > end) =
       let get_x (o : X.c) = o#x
   module G(X : sig type c = private < x : int; y : int; .. > end) =
       include F(X)
       let get_y (o : X.c) = o#y

Polymorphic variant types can be refined in two ways, either to allow the addition of new constructors, or to allow the disparition of declared constructors. The second case corresponds to private variant types (one cannot create a value of the private type), while the first case requires default cases in pattern-matching to handle addition.

   type t = [ `A of int | `B of bool ]
   type u = private [< t > `A ]
   type v = private [> t ]

With type u, it is possible to create values of the form (‘A n), but not (‘B b). With type v, construction is not restricted but pattern-matching must have a default case.

Like for abstract and private types, the variance of type parameters is not inferred, and must be given explicitly.

7.10  Local opens

(Introduced in OCaml 3.12)

expr::= ...  
  let open module-path in  expr  
  module-path .(  expr )  

The expressions let open module-path in  expr and module-path. (expr) are strictly equivalent. They locally open the module referred to by the module path module-path in the scope of the expression expr.

Restricting opening to the scope of a single expression instead of a whole structure allows one to benefit from shorter syntax to refer to components of the opened module, without polluting the global scope. Also, this can make the code easier to read (the open statement is closer to where it is used) and to refactor (because the code fragment is more self-contained).

7.11  Record notations

(Introduced in OCaml 3.12)

pattern::= ...  
  { field  [= pattern]  { ; field  [= pattern] }  [; _ ] }  
expr::= ...  
  { field  [= expr]  { ; field  [= expr] } }  
  { expr with  field  [= expr]  { ; field  [= expr] } }  

In a record pattern or a record construction expression, a single identifier id stands for id =  id, and a qualified identifier path ,  id stands for path .  id =  id. For example, assuming the record type

          type point = { x: float; y: float }

has been declared, the following expressions are equivalent:

          let x = 1 and y = 2 in { x = x; y = y }
          let x = 1 and y = 2 in { x; y }
          let x = 1 and y = 2 in { x = x; y }

Likewise, the following functions are equivalent:

          fun {x = x; y = y} -> x + y
          fun {x; y} -> x + y

Optionally, a record pattern can be terminated by ; _ to convey the fact that not all fields of the record type are listed in the record pattern and that it is intentional. By default, the compiler ignores the ; _ annotation. If the R warning is turned on, however, the compiler will warn if a record pattern fails to list all fields of the corresponding record type and is not terminated by ; _. Continuing the point example above,

          fun {x} -> x + 1

will warn if the R warning is on, while

          fun {x; _} -> x + 1

will not warn. This warning can help spot program points where record patterns may need to be modified after new fields were added to a record type.

7.12  Explicit polymorphic type annotations

(Introduced in OCaml 3.12)

let-binding::= ...  
  value-name :  poly-typexpr =  expr

Polymorphic type annotations in let-definitions behave in a way similar to polymorphic methods: they explicitly require the defined value to be polymorphic, and allow one to use this polymorphism in recursive occurences (when using let rec). Note however that this is just an usual polymorphic type, unifiable with any instance of itself.

There two possible applications of this feature. One is polymorphic recursion:

        type 'a t = Leaf of 'a | Node of ('a * 'a) t
        let rec depth : 'a. 'a t -> 'b = function
            Leaf _ -> 1
          | Node x -> 1 + depth x

Note that ’b is not explicitly polymorphic here, and it will actually be unified with int.

The other application is to ensure that some definition is sufficiently polymorphic.

#let id : ’a. ’a -> ’a = fun x -> x+1 ;;
Error: This definition has type int -> int which is less general than
         ’a. ’a -> ’a

7.13  Explicit naming of type variables

(Introduced in OCaml 3.12)

parameter::= ...  
  ( type typeconstr-name )  

The expression fun ( type typeconstr-name ) ->  expr introduces a type constructor named typeconstr-name which is considered abstract in the scope of the sub-expression, but then replaced by a fresh type variable. Note that contrary to what the syntax could suggest, the expression fun ( type typeconstr-name ) ->  expr itself does not suspend the evaluation of expr as a regular abstraction would. The syntax has been chosen to fit nicely in the context of function declarations, where it is generally used. It is possible to freely mix regular function parameters with pseudo type parameters, as in:

        let f = fun (type t) (foo : t list) -> ...

and even use the alternative syntax for declaring functions:

        let f (type t) (foo : t list) = ...

This construction is useful because the type constructor it introduces can be used in places where a type variable is not allowed. For instance, one can use it to define an exception in a local module within a polymorphic function.

        let f (type t) () =
          let module M = struct exception E of t end in
          (fun x -> M.E x), (function M.E x -> Some x | _ -> None)

Here is another example:

        let sort_uniq (type s) (cmp : s -> s -> int) =
          let module S = Set.Make(struct type t = s let compare = cmp end) in
          fun l ->
            S.elements (List.fold_right S.add l S.empty)

This construction by itself does not make polymorphic the type variable it introduces, but it can be combined with explicit polymorphic annotations where needed. It is also extremely useful for first-class modules.

7.14  First-class modules

(Introduced in OCaml 3.12)

typexpr::= ...  
  (module package-type)  
module-expr::= ...  
  (val expr :  package-type)  
expr::= ...  
  (module module-expr :  package-type)  
package-type::= modtype-path  
  modtype-path with  package-type-constraint  { and package-type-constraint }  
package-type-constraint::= type typeconstr-name =  typexpr  

Modules are typically thought as static components. This extension makes it possible to pack a module as a first-class value, which can later be dynamically unpacked into a module.

The expression (module module-expr :  package-type) converts the module (structure or functor) denoted by module expression module-expr to a value of the core language that encapsulates this module. The type of this core language value is (module package-type).

Conversely, the module expression (val expr :  package-type) evaluates the core language expression expr to a value, which must have type module package-type, and extracts the module that was encapsulated in this value.

The package-type syntactic class appearing in the (module module-expr :  package-type) expressions, the (val expr :  package-type) module expressions and the (module package-type) type expression represents a subset of module types. This subset consists of named module types with optional constraints of a limited form: only non-parametrized types can be specified. For type-checking purposes, package types are compared by path equality on the module type name component, and normal type equality for constraints.

The module expression (val expr :  package-type) cannot be used in the body of a functor, because this can cause unsoundness in conjunction with applicative functors. It can be used anywhere in the context of a local module binding let M = (val expr1 :  package-type) in  expr2, however.

Basic example

A typical use of first-class modules is to select at run-time among several implementations of a signatures. Each implementation is a structure that we can encapsulate as a first-class module, then store in a data structure such as a hash table:

        module type DEVICE = sig ... end
        let devices : (string, module DEVICE) Hashtbl.t = Hashtbl.create 17

        module SVG = struct ... end
        let _ = Hashtbl.add devices "SVG" (module SVG : DEVICE)

        module PDF = struct ... end
        let _ = Hashtbl.add devices "PDF" (module PDF: DEVICE)

We can then select one implementation based on command-line arguments, for instance:

        module Device =
          (val (try Hashtbl.find devices (parse_cmdline())
                with Not_found -> eprintf "Unknown device %s\n"; exit 2)
           : DEVICE)

Alternatively, the selection can be performed within a function:

        let draw_using_device device_name picture =
          let module Device =
            (val (Hashtbl.find_devices device_name) : DEVICE)
            Device.draw picture
Advanced examples

With first-class modules, it is possible to parametrize some code over the implementation of a module without using a functor.

        let sort (type s) set l =
          let module Set = (val set : Set.S with type elt = s) in
          Set.elements (List.fold_right Set.add l Set.empty)

The type inferred for this function is (module Set.S with type elt = ’a) -> ’a list -> ’a list. To use this function, one can wrap the Set.Make functor:

        let make_set (type s) cmp =
          let module S = Set.Make(struct
            type t = s
            let compare = cmp
          end) in
          (module S : Set.S with type elt = s) 

This function has type (’a -> ’a -> int) -> (module Set.S with type elt = ’a).

Another advanced use of first-class module is to encode existential types. In particular, they can be used to simulate generalized algebraic data types (GADT). To demonstrate this, we first define a type of witnesses for type equalities:

        module TypEq : sig
          type ('a, 'b) t
          val apply: ('a, 'b) t -> 'a -> 'b
          val refl: ('a, 'a) t
          val sym: ('a, 'b) t -> ('b, 'a) t
        end = struct
          type ('a, 'b) t = ('a -> 'b) * ('b -> 'a)
          let refl = (fun x -> x), (fun x -> x)
          let apply (f, _) x = f x
          let sym (f, g) = (g, f)

We can then define a paramatrized algebraic data type whose constructors provide some information about the type parameter:

        module rec Typ : sig
          module type PAIR = sig
            type t and t1 and t2
            val eq: (t, t1 * t2) TypEq.t
            val t1: t1 Typ.typ
            val t2: t2 Typ.typ

          type 'a typ =
            | Int of ('a, int) TypEq.t
            | String of ('a, string) TypEq.t
            | Pair of (module PAIR with type t = 'a)
        end = Typ

Values of type ’a typ are supposed to be runtime representations for the type ’a. The constructors Int and String are easy: they directly give a witness of type equality between the parameter ’a and the ground types int (resp. string). The constructor Pair is more complex. One wants to give a witness of type equality between ’a and a type of the form t1 * t2 together with the representations for t1 and t2. However, these two types are unknown. The code above shows how to use first-class modules to simulate existentials.

Here is how to construct values of type ’a typ:

        let int = Typ.Int TypEq.refl

        let str = Typ.String TypEq.refl

        let pair (type s1) (type s2) t1 t2 =
          let module P = struct
            type t = s1 * s2
            type t1 = s1
            type t2 = s2
            let eq = TypEq.refl
            let t1 = t1
            let t2 = t2
          end in
          let pair = (module P : Typ.PAIR with type t = s1 * s2) in
          Typ.Pair pair

And finally, here is an example of a polymorphic function that takes the runtime representation of some type ’a and a value of the same type, then pretty-print the value as a string:

        open Typ
        let rec to_string: 'a. 'a Typ.typ -> 'a -> string =
          fun (type s) t x ->
            match t with
            | Int eq -> string_of_int (TypEq.apply eq x)
            | String eq -> Printf.sprintf "%S" (TypEq.apply eq x)
            | Pair p ->
                let module P = (val p : PAIR with type t = s) in
                let (x1, x2) = TypEq.apply P.eq x in
                Printf.sprintf "(%s,%s)" (to_string P.t1 x1) (to_string P.t2 x2)

Note that this function uses an explicit polymorphic annotation to obtain polymorphic recursion.

7.15  Recovering the type of a module

(Introduced in OCaml 3.12)

module-type::=  ...  
  module type of module-expr

The construction module type of module-expr expands to the module type (signature or functor type) inferred for the module expression module-expr.

A typical use, in conjunction with the signature-level include construct, is to extend the signature of an existing structure, as in the following example.

        module type MYHASH = sig
          include module type of Hashtbl
          val replace: ('a, 'b) t -> 'a -> 'b -> unit

The signature MYHASH, then, contains all the fields of the signature of module Hashtbl, plus the new field replace. An implementation of this signature can be obtained easily, using the include construct at the structure level this time:

        module MyHash : MYHASH = struct
          include Hashtbl
          let replace t k v = remove t k; add t k v

7.16  Substituting inside a signature

(Introduced in OCaml 3.12)

mod-constraint::= ...  
  type [type-parameters]  typeconstr-name :=  [type-parameters]  typeconstr  
  module module-name :=  extended-module-path  

“Destructive” substitution (with ... :=) behaves essentially like normal signature constraints (with ... =), but it additionally removes the redefined type or module from the signature. There are a number of restrictions: one can only remove types and modules at the outermost level (not inside submodules), and the definition must be either another type constructor (with identical type parameters), or a module path.

A natural application of destructive substitution is merging two signatures sharing a type name.

        module type Printable = sig
          type t
          val print : Format.formatter -> t -> unit
        module type Comparable = sig
          type t
          val compare : t -> t -> int
        module type PrintableComparable = sig
          include Printable
          include Comparable with type t := t

One can also use this to completely remove a field:

#module type S = Comparable with type t := int;;
module type S = sig val compare : int -> int -> int end

or to rename one:

#module type S = sig
   type u
   include Comparable with type t := u
module type S = sig type u val compare : u -> u -> int end

Note that you can also remove manifest types, by substituting with the same type.

#module type ComparableInt = Comparable with type t = int ;;
module type ComparableInt = sig type t = int val compare : t -> t -> int end
#module type CompareInt = ComparableInt with type t := int ;;
module type CompareInt = sig val compare : int -> int -> int end

7.17  Explicit overriding in class definitions

(Introduced in OCaml 3.12)

class-field::= ...  
   inherit! class-expr  [as value-name]  
   val! [mutableinst-var-name  [: typexpr=  expr  
   method! [privatemethod-name  {parameter}  [: typexpr=  expr  
   method! [privatemethod-name :  poly-typexpr =  expr  

The keywords inherit!, val! and method! have the same semantics as inherit, val and method, but they additionally require the definition they introduce to be an overriding. Namely, method! requires method-name to be already defined in this class, val! requires inst-var-name to be already defined in this class, and inherit! requires class-expr to override some definitions. If no such overriding occurs, an error is signaled.

As a side-effect, these 3 keywords avoid the warnings “method override” and “ instance variable override”. As of OCaml 3.12, the warning “method override” has to be enabled manually for backwards compatibility reasons.

Previous Up Next