Compiler feature  useful or not?
[
Home
]
[ Index:
by date

by threads
]
[ Message by date: previous  next ] [ Message in thread: previous  next ] [ Thread: previous  next ]
[ Message by date: previous  next ] [ Message in thread: previous  next ] [ Thread: previous  next ]
Date:  20071116 (00:30) 
From:  Yaron Minsky <yminsky@c...> 
Subject:  Re: [Camllist] Compiler feature  useful or not? 
Most of what you said about quotient types made sense to me, but I must admit to being deeply confused about the nature of the change to the language. Consider the following trivial module and two possible interfaces. module Int = struct type t = int let of_int x = x let to_int x = x end module type Abs_int = sig type t val of_int : int > t val to_int : t > int end module type Priv_int = sig type t = private int val of_int : int > t val to_int : t > int end So, what is the difference between (Int : Abs_int) and (Int : Priv_int)? For instance, can I write (Priv_int.of_int 3) + (Priv_int.of_int 5)? Can you point to anything concrete I can do with the private version that I can't do with the abstract version? Is there any expression that is legal for one but not for the other? y On Nov 15, 2007 8:26 AM, Pierre Weis <pierre.weis@inria.fr> wrote: > > > a value of type row is in fact a concrete integer (it is not hidden > in > > >any way), > > > > But you cannot apply integer operations to it, so it is not a normal > > concrete integer, right? > > Right: a value of type row has type row ... not type int! > > > Can you show an example where replacing all "type t = private ..." > > definitions by "type t" changes a welltyped program into an illtyped > > one? I understand that if private types create real subtypes (w.r.t. > > :>) then they are different than abstract types, but otherwise, I don't > > see the difference for the typechecker. > > May be, I must recall what are private types in the first place: private > types has been designed to implement quotient data structure. > > (*** Warning. > > Wao: after rereading this message I realize that it is really long. > > You can skip it, if you already know something about mathematical quotient > structures, private types for record and variant, relational types and the > mocac compiler! > > ***) > > What is a quotient ? >  > > Here quotient has to be understood in the mathematical sense of the term: > given a set S and an equivalence relation R, you consider the set S/R of > the > equivalence classes modulo the relation R. S/R is named the quotient > structure of S by R. Quotient structures are fundamental in mathematics > and > there properties have been largely studied, in particular to find the > relationship between operations defined on S and operations on S/R: which > operations on S can be extended to operations on S/R ? Which properties of > operations on S are still valid for there extension on S/R ? and so on. > > Simple examples of quotient structures can be found everywhere in maths, > for > instance consider the equivalence relation R on pairs of integer values > defined as > > z_1 equiv z_2 if and only if z_1 and z_2 are both odd or are both even > (in Caml terms > z_1 equiv z_2 if and only if (z_1 mod 2) = (z_2 mod 2)) > > The set Z/R is named Z/2Z and it inherits properties of operations on Z: > it > is an abelian group (and more). > > A wonderful example of such inheritance of interesting properties by > inheritance of operations thanks to a definition by a quotient structure > is > the hierarchy of sets of numbers: starting with N (the set of natural > numbers) we define Z (the set of relative integer numbers) as a quotient > of > NxN), then Q (the rational numbers) as a quotient of ZxZ*, R (the set of > real > numbers) as a quotient of Q^N (the sets of sequences of rational numbers), > C > (the set of complex numbers) as a quotient of R[X] (the set of polynomials > with one unknown and real coefficients). Note here that at each step of > the > hierarchy the quotient structure is richer (has more properties and/or > more > elements) than the original structure: first we have a monoide, then a > group > and a ring, then a field, then a complete field and so on. > > Why quotient structures ? >  > > So quotient structures are a fundamental tool to express mathematical > structures and properties. Exactly as mathematical functions, relations > and > sets. As you may have noticed, programming languages are extremely related > to > maths (due to their purely logical bases) even if, in some cases, the > zealots > of the language at hand try to hide the mathematical fundation into a > strange > anthropomorphic discurse to describe their favorite language features. > > In the end, the computer programing languages try hard to incorporate > powerful features from mathematics, because these features have been > polished > by mathematicians for centuries. As a consequence of this work, we know > facts, properties and theorems about the mathematical features and this is > an > extremely valuable benefit. > > Now, what is the next frontier ? What can we steal more to mathematics for > the > benefit of our favorite programming language ? > > If we review the most powerful tools of mathematicians, we found that > functions have been borrowed (hello functional programming, > lambdacalculus > and friends :)), relations have been borrowed (data bases, hash tables, > association lists), sets have been more or less borrowed (hello setle, > pascal, and set definition facilities from various libraries of various > languages...). More or less, we have all those basic features. > > >From the mathematical set construction tools, we have got in Caml: > >  the cartesian product of sets (the * binary type constructor, the record > type definitions), >  the disjunct union of sets (the  of polymorphic variants, the sum (or > variant) type definitions). > > Unfortunately, we have no powerful way to define a quotient data > structure. That what private type definitions have been designed to do. > > What do we need for a quotient data structure ? >  > > In the first place, we need the ``true'' thing, the real feature that is > indeed used in maths. Roughly speaking this means to assimilate the > quotient > set S/R to a subset of S. > > In the previous definition of quotient structures, there is a careful > distinction between the base set S and the quotient set S/R. In fact, > there > always exists a canonical injection from S to S/R, and if we choose a > canonical representant in each equivalence class of S/R, we get another > canonical injection from S/R to S, so that S/R can be considered as a > subset > of S (the story is a bit more complex but that's the idea). This > injection/projection trickery is intensively used in maths; for instance, > in > the hierarchy of number sets, we say and consider that N is a subset of Z > that itself is a subset of Q, a subset of R, a subset of C. Rigourously, > we > should say for instance, there is a subset of Z that is canonically > isomorphic to N; in fact, we abusively assimilate N to this subset of Z; > hence, we say that N is ``included'' in Z, when we should say ``the image > of > N by the canonical isomorphism from N to Z'' is included in Z; then, we go > one step further in the assimilation of N to a subset of Z, by denoting > the > same, the elements of N and there image in Z; for instance, ``the neutral > element of the multiplication in Z'' and the successor of 0 in N is > denoted > ``1''; and we now can say that 1 belongs to Z. Note here that, in the > first > place, ``the neutral element of the multiplication in Z'' is an > equivalence > class (as all elements in Z are). So it is a set. More exactly, the > ``neutral > element of the multiplication in Z'' is the infinite set of pairs of > natural > numbers (n, m) such that n  m = 1 (here ``'' is an operation in N and > ``1'' > is the successor of the natural number ``0'', so that n  m = 1 is a > shorthand for n = succ m). The assimilation between N and its isomorphic > image allows to use 1 as the denotation of this infinite set of pairs of > natural numbers. > > We understand why the mathematicians always write after having designed a > quotient structure: ``thanks to this isomorphism, and if no confusion may > arise, we always assimilate S to its canonical injection in S/R''. > > This assimilation is what private type definitions allow. > > How do we define a quotient data structure ? >  > > The mathematical problem: >  we have a set S and an equivalence relation R on SxS, >  we construct the quotient S/R. >  we state afterwards: > ``if no confusion may arise, we always assimilate S to its canonical > injection in S/R''. > > The programming problem: >  we have a data structure S (defined by a type s) and a relation R on SxS > (defined by a function r from s * s to bool). >  we construct a data structure that represents S/R. >  we have afterwards: > ``No confusion may arise, we always assimilate S to its canonical > injection > in S/R''. > > The private data type solution: >  the data structure S is defined as any Caml data structure and the > relation R is implemented by the canonical injection(s) from S to S/R. >  the canonical projection from S/R to S is automatic. >  S (defined by s) is assimilated to S/R (which is then also s!). > > We defined S/R as a subset of S, the set of canonical representants of > equivalence classes of S/R. > > More exactly, the canonical injection from S to S/R maps each element of S > to > its equivalent class in S/R; if we assimilate each equivalence class to > its > canonical representant (an element of S), the canonical injection maps > each > element of S to the canonical representant of its equivalence class. Hence > the canonical injection has type S > S. > > An example treated without private types: >  > > Let's take a simple example: > > S is the following data structure that implements a mathematical (free) > structure > generated by the constant 0, the unary symbol succ, and the binary symbol > +. > > type peano = >  Zero >  Succ of peano >  Plus of peano * peano > > R is the (equivalence) relation that expresses that >  0 is the neutral element for + (for all x in S, 0 + x = x and x + 0 = > x), >  + is associative (for all x, y, z in S³, (x + (y + z)) = ((x + y) + z)). > > The canonical injection is a function from peano > peano that maps each > value > in S (the type peano) to the canonical representant of its class in S/R > (an > element of S (the type peano)): > > let rec make = function >  Zero > Zero >  Plus (Succ n, p) > Succ (make (Plus (n, p))) >  Plus (Zero, p) > p >  Plus (p, Zero) > p >  Plus (Plus (x, y), z) > make (Plus (x, make (Plus (y, z)))) >  Succ p > Succ p > ;; > > (This function may be wrong but you see the idea :)) > > So, if you want to represent S/R for peano in Caml you must: >  (1) define the type peano >  (2) always use the make function to create a value in S/R >  (3) not to confuse S and S/R in your head (I mean in your programs) > > Private data types permits (1), ensures (2), and helps for (3) concerning > the > head part and ensures (3) for programs by means of compiletime type > errors. > > The same example with private types: >  > > To define a private data type you must define a module. >  in the implementation, you define the carrier S and its canonical > injection. >  in the interface, you export the carrier and the injection. > > The type checker will ensure transparent projection from the quotient to > the > carrier and mandatory use of the canonical projection to build a value in > S/R. > > interface peano.mli >  > type peano = private >  Zero >  Succ of peano >  Plus of peano * peano > ;; > > val zero : peano;; > val succ : peano > peano;; > val plus : peano * peano > peano;; > > implementation peano.ml >  > type peano = >  Zero >  Succ of peano >  Plus of peano * peano > ;; > > let rec make = function > ... > ;; > > let zero = make Zero;; > let succ p = make (Succ p);; > let plus (n, m) = make (Plus (n, m));; > > (See note (1) for a discussion on the design of this example.) > > What is given by private types: >  > >  You cannot create a value of S/R if you do not use the canonical > injection > # Zero;; > Cannot create values of the private type peano > >  As a consequence, values in S (i.e. S/R) are always canonical: > # let one = succ zero;; > val one : M.peano = Succ Zero > # let three = plus (one, succ (plus (one, zero)));; > val three : M.peano = Succ (Succ (Succ Zero)) > >  Projection is automatic > # let rec int_of_peano = function >  Zero > 0 >  Succ n > 1 + int_of_peano n >  Plus (n, p) > int_of_peano n + int_of_peano p > ;; > val int_of_peano : M.peano > int = <fun> > # int_of_peano three;; >  : int = 3 > > What about private abbreviations ? >  > > Private abbreviation definitions are a natural extension of private data > type > definitions to abbreviation type definitions. The same notions are carried > out mutatismutandis: > >  we have a data structure S (defined by a type s) and a relation R on SxS > (defined by a function r from s * s to bool). >  we construct a data structure that represents S/R. >  we have afterwards: > ``No confusion may arise, we always assimilate S to its canonical > injection > in S/R''. > > In the case of abbreviations: > >  the data structure S/R is defined by a type s (which is an abbreviation) > and > a relation defined by a function, >  the canonical injection should be defined in the implementation file of > the > private data type module, >  we always assimilate S to its canonical injection in S/R. > > In pratice, as for usual private types (for which the constructive part of > the data type is not available outside the implementation), the type > abbreviation > is not known outside the implementation (it is really private to its > implementation). This prevents the construction of values of S/R without > using the canonical injection. > > Th noticeable difference is the projection function: it cannot be fully > implicit (otherwise, as you said Alain, the assimilation will turn to > complete > confusion: we would have S identical to S/R, hence row=int and no > difference > between a regular abbreviation definition and a private abbreviation > definition). If not implicit, the injection function should granted to be > the > identity function (something that we would get for free, if we allow > projection via subtyping coercion). > > In short: abstract data types provide values that cannot be inspected nor > safely manipulated without using the functions provided by the module that > defines the abstract data type. In contrast, private data types are > explicitely concrete and you can freely write any algorithm you need. A > good > test is printing: you can always write a function to print values of a > private type, it is up to the implementor of an abstract type to give you > the > necessary primitives to access the components of the abstracted values. > > Automatic generation of the canonical injection: >  > > You may have realized that writing the canonical injection can be a real > challenge. > > The moca compiler (see http://moca.inria.fr/) helps you to write the > canonical injection by generating one for you, provided you can express > the > injection at hand via a set of predefined algebraic relations (and/or > rewrite > rules you specify) attached to the constructors of the private type. > Private > types with constructors having algebraic relations are named relational > types. Moca generated canonical injections for relation types. > > For instance, you would write the peano example as the following peano.mlm > file: > > type peano = private >  Zero >  Succ of peano >  Plus of peano * peano > begin > associative > neutral (Zero) > rule Plus (Succ n, p) > Succ (Plus (n, p)) > end;; > > The mocac compiler will generate the interface and implementation of the > peano module for you, including the necessary canonical injection > function. > > Best regards, > >  > Pierre Weis > > INRIA Rocquencourt, http://bat8.inria.fr/~weis/<http://bat8.inria.fr/%7Eweis/> > > Note (1): >  we can't just export the canonical injection since you could not build > any > value of the type without previously defined values! >  we provide specialized versions of the canonical injection function > introducing the convention that the lowercase name of a value constructor > is > the name of its associated canonical injection. >  we do not export the plasin true canonical injection since it would be > more > confusing than useful. > > _______________________________________________ > Camllist mailing list. Subscription management: > http://yquem.inria.fr/cgibin/mailman/listinfo/camllist > Archives: http://caml.inria.fr > Beginner's list: http://groups.yahoo.com/group/ocaml_beginners > Bug reports: http://caml.inria.fr/bin/camlbugs >