Appendix B Variant types and labeled arguments

In this appendix we briefly present two recent features of the OCaml language and illustrate their use in combination with classes. Together, they complement objects and classes in an interesting way: first, they provide a good alternative to multiple class constructors, which OCaml does not have; second, variant types are a lighter-weight alternative to datatype definitions and are particularly well suited to simulating simple typecases in OCaml. Note that the need for typecases is sufficiently rare, thanks to the expressiveness of OCaml's object type system, that an indirect solution to typecases is quite acceptable.

B.1 Variant types

Variants are tagged unions, like ML datatypes. Thus, they allow values of different types to be mixed together in a collection by tagging them with variant labels; the values may be retrieved from the collection by inspecting their tags using pattern matching.

However, unlike datatypes, variants can be used without a preceding type declaration. Furthermore, while a datatype constructor belongs to a unique datatype, a variant constructor may belong to any (open) variant.

Quick overview

Just like sum type constructors, variant tags must be capitalized, but they must also be prefixed by the back-quote character as follows:

let one = `Int 1 and half = `Float 0.5;;

val one : [> `Int of int] = `Int 1
val half : [> `Float of float] = `Float 0.5

Here, variable one is bound to a variant that is an integer value tagged with `Int. The > sign in the type [> `Int of int] means that one can actually be assigned a supertype. That is, values of this type can actually have another tag. However, if they have tag `Int then they must carry integers. Thus, both one and half have compatible types and can be stored in the same collection:

let collection = [ one; half ];;

val collection : [> `Int of int | `Float of float] list = [`Int 1; `Float 0.5]

Now, the type of collection is a list of values that can be integers tagged with `Int, floating point values tagged with `Float, or values with another tag.
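The openness of this type can be illustrated with a small sketch. The tag `Str and the names extended and tag_name are hypothetical additions used only for illustration; they do not appear in the text above.

```ocaml
let one = `Int 1 and half = `Float 0.5
let collection = [ one; half ]

(* Because the element type of [collection] is open ([> ...]),
   a value carrying a new tag can be added in front of it. *)
let extended = `Str "two" :: collection

(* Return the name of the tag carried by a value (hypothetical helper). *)
let tag_name = function
  | `Int _ -> "Int"
  | `Float _ -> "Float"
  | `Str _ -> "Str"

let names = List.map tag_name extended
let () = List.iter print_endline names
```

Had collection been given a closed type, the definition of extended would be rejected by the typechecker.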

Values of a heterogeneous collection can be retrieved by pattern matching and then reified to their true type:

let float = function | `Int x -> float_of_int x | `Float x -> x;;

val float : [< `Int of int | `Float of float] -> float = <fun>

let total = List.fold_left (fun x y -> x +. float y) 0. collection ;;
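The function float above accepts only the two tags it matches, as the < sign in its type indicates. As a minimal sketch of the alternative, a final wildcard case makes the accepted type open, so that values with other tags are tolerated and skipped. The names float_opt and total_known and the tag `Str are assumptions made for this example.

```ocaml
(* Hypothetical variant of the conversion function: the wildcard case
   makes the argument type open ([> ...]) instead of closed ([< ...]). *)
let float_opt = function
  | `Int x -> Some (float_of_int x)
  | `Float x -> Some x
  | _ -> None

(* Sum the numeric members of a collection, ignoring any other tag. *)
let total_known l =
  List.fold_left
    (fun acc v -> match float_opt v with Some f -> acc +. f | None -> acc)
    0. l

let result = total_known [ `Int 1; `Float 0.5; `Str "ignored" ]
let () = Printf.printf "%g\n" result
```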

Implementing typecase with variant types

The language ML does not keep types at run time, hence there is no typecase construct to test the types of values at run time. The only solution available is to explicitly tag values with constructors. OCaml data types can be used for that purpose, but variant types may be more convenient and more flexible here since their constructors do not have to be declared in advance, and their tagged values all have compatible types.

For instance, we consider one and two dimensional point classes and combine their objects together in a container.

class point1 x = object method getx = x + 0 end;;
let p1 = new point1 1;;

To make objects of the two classes compatible, we always tag them. However, we also keep the original object, so as to preserve direct access to the common interface.

let pp1 = p1, `Point1 p1;;

We provide testing and coercion functions for each class (these two functions could also be merged):

exception Typecase;;
let is_point1 = function _, `Point1 q -> true | _ -> false;;
let to_point1 = function _, `Point1 q -> q | _ -> raise Typecase;;

as well as a safe (statically typed) coercion to point1.

let as_point1 = function pq -> (pq :> point1 * _);;

Similarly, we define two-dimensional points and their auxiliary functions:

class point2 x y = object inherit point1 x method gety = y + 0 end;;
let p2 = new point2 2 2;;
let pp2 = (p2 :> point1), `Point2 p2;;
let is_point2 = function _, `Point2 q -> true | _ -> false;;
let to_point2 = function _, `Point2 q -> q | _ -> raise Typecase;;
let as_point2 = function pq -> (pq :> point2 * _);;

Finally, we check that objects of both classes can be collected together in a container.

let l = let ( @:: ) x y = (as_point1 x) :: y in pp1 @:: pp2 @:: [];;

Components that are common to all members of the collection can be accessed directly (without membership testing) using the first projection.

let getx p = (fst p)#getx;;
List.map getx l;;

Conversely, other components must be accessed selectively via the second projection and using membership and conversion functions:

let gety p = if is_point2 p then (to_point2 p) # gety else 0;;
List.map gety l;;
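As noted earlier, the testing and coercion functions can be merged. One possible way to do so, sketched here under the assumption that an option result is acceptable in place of the Typecase exception, is a single function returning Some q for a tagged two-dimensional point and None otherwise. The name point2_opt is hypothetical, and the coordinates of p2 are chosen to differ so that the result is easier to read.

```ocaml
class point1 x = object method getx = x + 0 end
class point2 x y = object inherit point1 x method gety = y + 0 end

let p1 = new point1 1
let p2 = new point2 2 3

(* Each object is paired with a tagged copy of itself, as in the text. *)
let pp1 = (p1, `Point1 p1)
let pp2 = ((p2 :> point1), `Point2 p2)
let l = [ pp1; pp2 ]

(* Hypothetical merged test and coercion: no exception is needed. *)
let point2_opt = function
  | _, `Point2 q -> Some q
  | _ -> None

(* gety falls back to 0 for members that are not two-dimensional points. *)
let gety p = match point2_opt p with Some q -> q#gety | None -> 0

let xs = List.map (fun p -> (fst p)#getx) l
let ys = List.map gety l
```

With this formulation the membership test and the conversion cannot get out of sync, since pattern matching on the option performs both at once.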

B.2 Labeled arguments

In the core language, as in most languages, arguments are anonymous.

Labeled arguments are a convenient extension to the core language that allows arguments to be labeled consistently in the declaration of functions and in their application. Labeled arguments increase safety, since argument labels are checked against their definitions. Moreover, labeled arguments also increase flexibility, since they can be passed in a different order from that of their definition. Finally, labeled arguments can be used solely for documentation purposes.

For instance, the erroneous exchange of two arguments of the same type —an error the typechecker would not catch— can be avoided by labeling the arguments with distinct labels. As an example, the module StdLabels.String provides a function sub with the following type:

StdLabels.String.sub;;

- : string -> pos:int -> len:int -> string = <fun>

This function expects three arguments: the first one is anonymous, the second and third ones are labeled pos and len, respectively. A call to this function can be written

String.sub "Hello" ~pos:0 ~len:4

or equivalently,

String.sub "Hello" ~len:4 ~pos:0

since labeled arguments can be passed to the function in a different order. Labels are (lexically) enclosed between ~ and :, so as to distinguish them from variables.
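The following sketch checks that both orders give the same result, and shows how distinct labels protect against swapping two arguments of the same type. The function range and its labels lo and hi are hypothetical, introduced only for this illustration.

```ocaml
(* Both calls are accepted and denote the same application. *)
let a = StdLabels.String.sub "Hello" ~pos:0 ~len:4
let b = StdLabels.String.sub "Hello" ~len:4 ~pos:0

(* Hypothetical function taking two integers that are easy to confuse:
   it lists the integers from lo (included) to hi (excluded). *)
let range ~lo ~hi = List.init (hi - lo) (fun i -> lo + i)

(* The labels, not the positions, decide which argument is which. *)
let r = range ~hi:5 ~lo:2

let () = Printf.printf "%s %s %d\n" a b (List.length r)
```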

By default, standard library functions are not labeled. The module StdLabels redefines some modules of the standard library with labeled versions of some functions. Thus, one can include the command

open StdLabels;;

at the beginning of a file to benefit from labeled versions of the libraries. Then, String.sub could have been used as shorthand for StdLabels.String.sub in the example above.

Labeled arguments of a function are declared by labeling the arguments accordingly in the function declaration. For example, the labeled version of substring could have been defined as

let substring s ~pos:x ~length:y = String.sub s x y;;

Additionally, there is a possible short-cut that allows us to use the name of the label for the name of the variable. Then, both the ending : mark at the end of the label and the variable are omitted. Hence, the following definition of substring is equivalent to the previous one.

let substring s ~pos ~length = String.sub s pos length;;
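The same short-cut is also available at the call site: when a variable has the same name as a label, ~pos alone stands for ~pos:pos. The sketch below assumes the unlabeled String.sub of the standard library (that is, StdLabels has not been opened); the names s1 and s2 are illustrative.

```ocaml
(* Definition using the short-cut: each label names its own variable. *)
let substring s ~pos ~length = String.sub s pos length

let pos = 1 and length = 3

(* Short-cut at the call site: ~pos abbreviates ~pos:pos. *)
let s1 = substring "Hello" ~pos ~length

(* Explicit labels, given in a different order from the definition. *)
let s2 = substring "Hello" ~length:2 ~pos:0

let () = Printf.printf "%s %s\n" s1 s2
```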

B.3 Optional arguments

Labels can also be used to declare default values for some arguments.

Quick overview

Arguments with default values are called optional arguments and can be omitted in function calls, in which case the corresponding default values are used. For instance, one could have declared a function substring as follows:

let substring ?pos:(p=0) ~length:l s = String.sub s p l;;

This allows substring to be called with its length argument and an anonymous string, leaving the position at its default value 0. The anonymous string parameter has been moved to the last position, inverting the convention taken in String.sub, so as to satisfy the requirement that an optional argument must always be followed by an anonymous argument, which is used to mark the end of the optional arguments and to replace missing arguments by their default values.
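A short sketch of the possible calls follows. It also shows the form without a default value, where the function body receives the optional argument as an option; the name substring_opt is hypothetical.

```ocaml
let substring ?pos:(p = 0) ~length:l s = String.sub s p l

(* The position is omitted and takes its default value 0. *)
let a = substring ~length:3 "Hello"

(* The position is given explicitly. *)
let b = substring ~pos:1 ~length:3 "Hello"

(* Hypothetical variant without a default: pos has type int option
   inside the body, and the default is chosen by pattern matching. *)
let substring_opt ?pos ~length s =
  let p = match pos with None -> 0 | Some p -> p in
  String.sub s p length

let c = substring_opt ~length:2 "Hello"

(* An option can be passed directly with the ?label:value form. *)
let d = substring_opt ?pos:(Some 3) ~length:2 "Hello"

let () = Printf.printf "%s %s %s %s\n" a b c d
```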

Application to class constructors

In OCaml, objects are created from classes with the new construct. This amounts to having a unique constructor of the same name as the name of the class, with the same arity as that of the class.

In object-oriented languages, it is common and often quite useful to have several ways of building objects of the same class. One common example is to have default values for some of the parameters. Another situation is to have two (or more) equivalent representations for an object, and to be able to initialize the object either way. For instance, complex points can be defined by giving either cartesian or polar coordinates.

One could think of emulating several constructors by defining different variants of the class obtained by abstraction and application of the original class, each one providing a new class constructor. However, this schema breaks modularity, since classes cannot be simultaneously refined by inheritance.

Fortunately, labeled arguments and variant types can be used together to provide the required flexibility, as if there were several constructors, but with a unique class that can be inherited.

For example, two-dimensional points can be defined as follows:

class point ~x:x0 ?y:(y0=0) () = object method getx = x0 + 0 method gety = y0 + 0 end;;

(The extra unit argument is used to mark the end of optional arguments.) Then, the y coordinate may be left implicit, in which case it defaults to 0.

let p1 = new point ~x:1 ();;
let p2 = new point ~x:1 ~y:2 ();;

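Alternatively, one could define the class so that it takes a single argument, tagged with a variant that tells which representation is being supplied: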

class point arg =
  let x0, y0 =
    match arg with
    | `Cart (x,y) -> x, y
    | `Polar(r,t) -> r *. cos t, r *. sin t in
  object method getx = x0 method gety = y0 end;;

Then, points can be built by passing either cartesian or polar coordinates:

let p1 = new point (`Cart (1.414, 1.));;
let p2 = new point (`Polar (2., 0.52));;

In this case, one could also choose optional labels for convenience of notation, but at the price of some dynamic detection of ill-formed calls:

class point ?x ?y ?r ?t () =
  let x0, y0 =
    match x, y, r, t with
    | Some x, Some y, None, None -> x, y
    | None, None, Some r, Some t -> r *. cos t, r *. sin t
    | _, _, _, _ -> failwith "Cart and Polar coordinates can't be mixed" in
  object method getx = x0 method gety = y0 end;;
let p1 = new point ~x:2. ~y:0.52 ();;
let p2 = new point ~r:1.414 ~t:0.52 ();;
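Since there is still a single class, it can be refined by inheritance, and the subclass keeps both ways of building objects. The sketch below illustrates this with the variant-based formulation. The type name coords, the subclass colored_point and its color method are assumptions made for the example; the argument is annotated with a closed variant type here only to keep the class type simple.

```ocaml
(* Hypothetical named type for the two representations. *)
type coords = [ `Cart of float * float | `Polar of float * float ]

class point (arg : coords) =
  let x0, y0 =
    match arg with
    | `Cart (x, y) -> (x, y)
    | `Polar (r, t) -> (r *. cos t, r *. sin t)
  in
  object
    method getx = x0
    method gety = y0
  end

(* Hypothetical subclass: it forwards the tagged argument to point,
   so it accepts cartesian and polar coordinates alike. *)
class colored_point (arg : coords) (c : string) =
  object
    inherit point arg
    method color = c
  end

let p = new colored_point (`Cart (1., 2.)) "red"
let q = new colored_point (`Polar (2., 0.)) "blue"

let () = Printf.printf "%g %g %s\n" p#getx p#gety p#color
```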