Modules and classes play different roles. On the one hand, modules can be embedded, and parameterized by types and values. Modules also allow value and type abstraction on a large scale. On the other hand, classes provide inheritance and its late binding mechanism; they can also be parameterized by values, but on a small scale. At first, there seems to be little redundancy. Indeed, to benefit from both features simultaneously, modules and classes are often combined together.
However, both modules and classes provide means of structuring the code. Despite their resemblance at first glance, modular and object-oriented programming styles are in fact diverging: once a choice has been made to represent a structure by a module or by an object, many other choices are forced, which is sometimes bothersome.
In this Chapter, we discusses the overlapping of features and the specificities, and show how to use them in harmony.
Many abstract datatypes can be defined using either classes or modules. A representative example is the type of stacks. Stacks have been defined as a parametric class in section 3.1.2 where operations on stacks are methods embedded into stack-objects, and as a module defining an abstract type of stacks and the associated operations in section 4.1.1. The following table summarizes the close correspondence between those two implementations.
|class version||module version|
|The type of stacks|
|Create a stack|
|Push x on s||s|
|Pop from s||s|
More generally, all algebraic abstract types that are modifiable in place and such that all associated operations are unary (ie. they take only one argument of the abstract type) can be defined as classes or as modules in almost the same way. Here, the choice between those two forms is mainly a question of programming style: some programmers prefer to see the operations of the abstract type as methods attached to values of this type, while others prefer to see them as functions outside values of the abstract type.
Moreover, the two alternatives are still comparable when extending the stack
implementation with new operations. For instance, in both cases, a new
implementation of stacks can be derived from the older one with an
top to explore the top of the stack without popping
it. Both implementations, described in Figure 5.1, are
straightforward: the class approach uses inheritance while the module
approach uses the
include construct. (The effect of
Stack is to rebind all components of the
Stack structure in the substructure.)
However, the two approaches definitely differ when considering late binding,
which is only possible with the class approach. For instance, redefining the
pop would automatically benefit to the
top (ie. the method
top would call the new definition of
pop). This mechanism does
not have any counterpart in OCaml modules.
In conclusion, if inheritance is needed, the class approach seems more appropriate, and it becomes the only possible (direct) solution if, moreover, late binding is required. Conversely, for abstract datatypes used with binary operations, such as sets with a merge operation, the module approach will be preferable, as long as inheritance is not used. Furthermore, the module approach is the only one that allows to return private data to the user, but abstractly, so as to preserve its integrity. Of course, the module system is also the basis for separate compilation.
To benefit from the advantages of objects and modules simultaneously, an application can easily combine both aspect. Typically, modules will be used at the outer level, to provide separate compilation, inner structures, and privacy outside of the module boundaries, while classes will be components of modules, and offer extendibility, open recursion and late binding mechanisms.
We first present typical examples of such patterns, with increasing complexity and expressiveness. We conclude with a more complex —but real— example combining many features in an unusual but interesting manner.
The easiest example is probably the use of modules to simply group related
classes together. For instance, two classes
are related by their usage, can be paired together in a module.
Besides clarity of code, one quickly take advantage of such grouping. For
cons classes can be extended
simultaneously (but this is not mandatory) to form an implementation
In turn, the module
List can also be extended —but in another
direction— by adding a new “constructor”
append. This amounts to adding a new class with the same interface as
that of the first two.
In this example, grouping could be seen as a structuring convenience, because a flattened implementation of all classes would have worked as well. However, grouping becomes mandatory for friend classes.
State encapsulation in objects allows to abstract their representation by hiding all instance variables. Thus, reading and writing capabilities can be controlled by providing only the necessary methods. However, whether to expose some given part of the state is an all-or-nothing choice: either it is confined to the object or revealed to the whole world.
It is often the case that some, but not all, objects can access each other's state. A typical example (but not the only one) are objects with binary methods. A binary method of an object is called with another object of the same class as argument, so as to interact with it. In most cases, this interaction should be intimate, eg. depend on the details of their representations and not only on their external interfaces. For instance, only objects having the same implementation could be allowed to interact. With objects and classes, the only way to share the representation between two different objects is to expose it to the whole world.
Modules, which provide a finer-grain abstraction mechanism, can help secure this situation, making the type of the representation abstract. Then, all friends (classes or functions) defined within the same module and sharing the same abstract view know the concrete representation.
This can be illustrated on the bank example, by turning currency into a class:
Then, all object of the class
c can be combined, still hiding the
A similar situation arises when implementing sets with a union operation, tables with a merge operation, etc.
We end this Chapter with an example that interestingly combines some features of classes objects and modules. This example is taken from the algebraic-structure library of the formal computation system FOC . The organization of such a library raises important problems: on the one hand, algebraic structures are usually described by successive refinements (a group is a monoid equipped with an additional inverse operation). The code structure should reflect this hierarchy, so that at least the code of the operations common to a structure and its derived structures can be shared. On the other hand, type abstraction is crucial in order to hide the real representations of the structure elements (for instance, to prevent from mixing integers modulo p and integers modulo q when p is not equal to q). Furthermore, the library should remain extensible.
In fact, we should distinguish generic structures, which are abstract algebraic structures, from concrete structures, which are instances of algebraic structures. Generic structures can either be used to derive richer structures or be instantiated into concrete structures, but they themselves do not contain elements. On the contrary, concrete structures can be used for computing. Concrete structures can be obtained from generic ones by supplying an implementation for the basic operations. This schema is sketched in figure 5.2. The arrows represent the expected code sharing.
In general, as well as in this particular example, there are two kinds of expected clients of a library: experts and final users. Indeed, a good library should not only be usable, but also re-usable. Here for instance, final users of the library only need to instantiate some generic structures to concrete ones and use these to perform computation. In addition, a few experts should be able to extend the library, providing new generic structures by enriching existing ones, making them available to the final users and to other experts.
The first architecture considered in the FOC project relies on modules, exclusively; modules facilitates type abstraction, but fails to provide code sharing between derived structures. On the contrary, the second architecture represents algebraic structures by classes and its elements by objects; inheritance facilitates code sharing, but this solution fails to provide type abstraction because object representation must be exposed, mainly to binary operations.
The final architecture considered for the project mixes classes and modules
to combine inheritance mechanisms of the former with type abstraction of
the latter. Each algebraic structure is represented by a module
with an abstract type
t that is the representation type of
algebraic structure elements (ie. its “carrier”). The object
which collects all the operations, is
obtained by inheriting from the virtual class that is parameterized by the
carrier type and that defines the derived operations. For instance, for
groups, the virtual class
group declares the basic group operations
opposite) and defines the
derived operations (
minus) once and for all:
A class can be reused either to build richer generic structures by adding other operations or to build specialized versions of the same structure by overriding some operations with more efficient implementations. The late binding mechanism is then used in an essential way.
(In a more modular version of the group structure, all methods would be private, so that they can be later ignored if necessary. For instance, a group should be used as the implementation of a monoid. All private methods are made public, and as such become definitely visible, right before a concrete instance is taken.)
A group is a module with the following signature:
To obtain a concrete structure for the group of integers modulo p, for
example, we supply an implementation of the basic methods (and possibly some
specialized versions of derived operations) in a class
z_pz_impl. The class
z_pz inherits from the class
group that defines the derived operations and from
z_pz_impl that defines the basic operations. Last, we
include an instance of this subclass in a structure so as to hide the
representation of integers modulo p as OCaml integers.
This representation elegantly combines the strengths of modules (type abstraction) and classes (inheritance and late binding).
On the one hand, it should include more generic structures, starting with sets, and up to at least rings and polynomials.
On the other hand, it should improve on the model given above, by inventing a more sophisticated design pattern that is closer to the model sketched in figure 5.2 and that can be used in a systematic way.
For instance, the library could provide both an open view and the abstraction functor for each generic structure. The open view is useful for writing extensions of the library. Then, the functor can be used to produce an abstract concrete structure directly from an implementation.
The pattern could also be improved to allow a richer structure (eg. a ring) to be acceptable in place only a substructure is required (eg. an additive group).
The polynomials with coefficients in Z /2Z offers a simple yet interesting source of examples.
The example of the FOC system illustrates a common situation that calls for hybrid mechanisms for code structuring that would more elegantly combine the features of modules and classes. This is an active research area, where several solutions are currently explored. Let us mention in particular “mixin” modules and objects with “views”. The former enrich the ML modules with inheritance and a late binding mechanism [18, 4, 5]. The latter provide a better object-encapsulation mechanism, in particular in the presence of binary operations and “friend” functions; views also allow to forget or rename methods more freely [68, 71].
Other object-oriented languages, such as CLOS, detach methods from objects, transforming them into overloaded functions. This approach is becoming closer to traditional functional programming. Moreover, it extends rather naturally to multi-methods [13, 22, 8] that allow to recover the symmetry between the arguments of a same algebraic type. This approach is also more expressive, since method dispatch may depend on several arguments simultaneously rather than on a single one in a privileged position. However, this complicates abstraction of object representation. Indeed, overloading makes abstraction more difficult, since the precise knowledge of the type of arguments is required to decide what version of the method should be used.