Date: 2002-02-05 (14:19)
From: Gerard Huet <Gerard.Huet@i...>
Subject: [Caml-list] Syntax
At 13:05 05/02/02 +0100, Daniel de Rauglaudre wrote: 
>On Tue, Feb 05, 2002 at 12:53:39PM +0100, Remi VANICAT wrote: 
>> So the fact that a library or a tool is written in revised or 
>> standard library doesn't change anything for those who use it. The 
>> only problem is for those who want to change the source, but it's 
>> not so hard to learn this syntax if you want to use it. 
It is trivial to switch to revised syntax. It will take you one week to 
get used to True/False vs true/false, to use square brackets in [x :: l], 
and to write (list int) instead of (int list). I never missed the double 
semicolon, and many problems with ill-parenthesised matches just went into 
oblivion. What I missed most is that you can no longer step through the 
toplevel with mouse cut-and-paste: the inner let's are not accepted 
at toplevel, so you have to write the goddamn "value" keyword instead.
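
For readers who know only the standard syntax, here is a minimal sketch of the differences listed above. The code is written in standard syntax so that it runs under a plain ocaml toplevel; the revised-syntax spellings in the comments are approximate and follow the forms quoted in this message. The names xs and is_empty are made up for illustration.

```ocaml
(* Standard syntax; an approximate revised-syntax spelling is given in
   the comment above each definition. *)

(* revised: value xs = [1 :: [2; 3]];   of type (list int) *)
let xs : int list = 1 :: [2; 3]

(* revised: value is_empty l = match l with [ [] -> True | _ -> False ]; *)
let is_empty l = match l with [] -> true | _ -> false
```

The toplevel keyword changes from let to value, the constructors true/false become True/False, cons is bracketed, and the type constructor precedes its argument.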
Until recently there was no point in pushing the revised syntax, because it 
was very hard to teach the language when the system's printer used another 
syntax than what you typed in. Now that there is smooth integration of 
camlp4 with OCaml as of release 3.04, there is no excuse not to use the much 
superior revised syntax, in my opinion. Which does not mean that there 
should be a big concerted effort to switch all our code from one syntax to 
the next. 
I am currently developing a computational linguistics platform in OCaml, now
around 12000 lines of code in about 50 modules, mostly in revised syntax; some 
modules, borrowed from other sources or mechanically produced, are still in 
vanilla syntax. So what? The makefile does what is needed, and I never have any 
problem reading other people's programs from libraries or the hump, etc. 

The crucial point is that we need good tutorials and reference manuals
in the revised syntax before getting serious about "standardizing" on something 
other than the usual syntax. Once this material exists, then we can talk. 

Let me tell you about an experience I had recently in using OCaml as a 
publication language. In the first version of my paper, I said "we shall 
use as algorithmic meta-language OCaml under so-called revised syntax". 
Now one referee got very interested in the code, tried it on his machine
with ocaml 3.02, missed my remark about revised syntax, and said "it is too
bad these are not real programs which people can directly use". Another
referee got completely turned off by the code and said "remove all
proselytism about this weird Ocaml stuff, and use some pidgin algorithmic
language". A typical dilemma. 

What I ended up doing was writing a short presentation of an algorithmic 
meta-language which I called "Pidgin ML", without any proselytism about 
existing programming languages; it is only in the evaluation part of the 
paper that I reveal that Pidgin ML can be compiled and executed as such by
OCaml+Camlp4. So everybody is happy!

Just as a short plug for revised syntax, here is my introduction:

We shall use as {\sl meta language} for the description of our algorithms 
a pidgin version of the functional language ML. Readers familiar with ML 
may skip this section, which gives a crash overview of its syntax and semantics.

The core language has types, values, and exceptions. 
Thus, \verb:1: is a value of predefined type \verb:int:, whereas 
\verb:"CL": is a \verb:string:. 
Pairs of values inhabit the corresponding product type. Thus: 
\verb|(1,"CL") : (int * string)|. 
Recursive type declarations create new types, 
whose values are inductively built from the associated constructors. 
Thus the Boolean type could be declared as a sum by: 
\verb:type bool = [True | False];:\\ 
Parametric types give rise to polymorphism. 
Thus if \verb:x: is of type \verb:t: and \verb:l: is of type 
\verb:(list t):, we construct the list adding \verb:x: to \verb:l: 
as \verb|[x :: l]|. The empty list is \verb:[]:, of (polymorphic) type 
\verb:(list 'a):. Although the language is strongly typed, explicit type 
specification is rarely needed from the designer, since principal types 
may be inferred mechanically.
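
As a rough companion to the paragraph above, the same notions can be sketched in standard OCaml syntax, which runs without Camlp4. The names p, boolean, to_bool, cons and empty are illustrative assumptions, not part of the quoted text; the sum type is called boolean only to avoid shadowing the predefined bool.

```ocaml
(* A pair inhabits the product type int * string. *)
let p : int * string = (1, "CL")

(* A sum type with two constructors.
   Revised syntax: type boolean = [ True | False ]; *)
type boolean = True | False

(* Converts the declared type to the predefined bool. *)
let to_bool b = match b with True -> true | False -> false

(* Adds x in front of l; revised syntax writes this as [x :: l]. *)
let cons x l = x :: l

(* The empty list is polymorphic: 'a list (revised: list 'a). *)
let empty : 'a list = []
```

No type annotation is required on cons: its principal type 'a -> 'a list -> 'a list is inferred.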

The language is functional in the sense that functions are first class 
objects. Thus the doubling integer function may be written as 
\verb:fun x -> x+x:, and it has type \verb:int -> int:. It may be associated 
to the name \verb:double: by declaring: \verb:value double = fun x -> x+x;:\\ 
Equivalently we could write: \verb:value double x = x+x;:\\ 
Its application to value \verb:n: is written as \verb:(double n): or even 
\verb:double n: when there is no ambiguity. Application associates to the 
left, and thus \verb:f x y: stands for \verb:((f x) y):. 
Recursive functional values are declared with the keyword \verb:rec:. 
Thus we may define the factorial function as:\\ 
\verb:value rec fact n = if n=0 then 1 else n*(fact (n-1));:\\ 
Functions may be defined by pattern matching. Thus the first projection of 
pairs could be defined by:\\ 
\verb:value fst = fun [ (x,y) -> x ];:\\ 
or equivalently (since there is only one pattern in this case) by:\\ 
\verb:value fst (x,y) = x;:\\ 
Pattern-matching is also usable in \verb:match: expressions which generalise 
case analysis, 
such as: \verb:match l with [ [] -> True | _ -> False ]:, which 
tests whether list \verb:l: is empty, using underscore as catch-all pattern.
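
The definitions of this paragraph can be tried directly in standard OCaml syntax; what follows is a sketch under that assumption, not the paper's own code. The names first, add and add_one are made up for illustration (first avoids shadowing the predefined fst).

```ocaml
(* revised: value double x = x+x; *)
let double x = x + x

(* Recursive definition; the test for 0 is what makes the recursion
   terminate. *)
let rec fact n = if n = 0 then 1 else n * fact (n - 1)

(* First projection of a pair, defined by pattern matching. *)
let first (x, _) = x

(* Application associates to the left: add 1 2 stands for ((add 1) 2),
   so add 1 is itself a function. *)
let add x y = x + y
let add_one = add 1

(* A match with a catch-all underscore pattern. *)
let is_empty l = match l with [] -> true | _ -> false
```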

Evaluation is strict, which means that \verb:x: is evaluated before 
\verb:f: in the evaluation of \verb:(f x):. The \verb:let: expressions 
make it possible to sequence computations and to share sub-computations. Thus 
\verb:let x = fact 10 in x+x: will compute \verb:fact 10: first, 
and only once. 
An equivalent postfix \verb:where: notation may be used as well. Thus 
the conditional expression \verb:if b then e1 else e2: is equivalent to: 
\verb:choose b where choose = fun [ True -> e1 | False -> e2]:.
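
The sharing performed by let can be observed with a call counter. This is a minimal sketch in standard syntax; calls, counted_fact, shared, abs_if and abs_choose are hypothetical names introduced for the illustration. Standard syntax has no postfix where, so the second half uses a local let to play the role of choose.

```ocaml
(* Counts how many times counted_fact is called. *)
let calls = ref 0

let rec fact n = if n = 0 then 1 else n * fact (n - 1)

let counted_fact n =
  incr calls;
  fact n

(* The sub-computation is bound once and used twice. *)
let shared = let x = counted_fact 10 in x + x

(* A conditional, and the same choice written as a function defined by
   cases on the Boolean. Only the selected branch is evaluated. *)
let abs_if n = if n < 0 then -n else n

let abs_choose n =
  let choose = function true -> -n | false -> n in
  choose (n < 0)
```

After these definitions the counter holds 1, although x occurs twice in the body of the let.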

Exceptions are declared with the type of their parameters, as in: 
\verb:exception Failure of string;:
An exceptional value may be raised, as in: 
\verb:raise (Failure "div 0"): and handled by a \verb:try: switching on 
exception patterns, such as:
\verb:try expression with [ Failure s -> ... ]:.
Other imperative constructs may be used, such as 
references, mutable arrays, while loops and I/O commands, 
but we shall seldom need them. Sequences of instructions are 
evaluated from left to right in \verb:do: expressions, such as: 
\verb:do {e1; ... en}:. 
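
A small standard-syntax sketch of these constructs follows. The exception is named Div_failure here, an illustrative name chosen so as not to redefine the predefined Failure; safe_div, div_or_message and trace are likewise assumptions made for the example.

```ocaml
(* Revised syntax: exception Div_failure of string; *)
exception Div_failure of string

(* Raises the exception instead of dividing by zero. *)
let safe_div a b =
  if b = 0 then raise (Div_failure "div 0") else a / b

(* Handles the exception by switching on its pattern.
   Revised syntax: try ... with [ Div_failure s -> ... ] *)
let div_or_message a b =
  try string_of_int (safe_div a b) with Div_failure s -> "error: " ^ s

(* Instructions in sequence, evaluated from left to right.
   Revised syntax: do { e1; ... en } *)
let trace = Buffer.create 16

let () =
  Buffer.add_string trace "a";
  Buffer.add_string trace "b"
```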

ML is a {\sl modular} language, in the sense that sequences of type, value 
and exception declarations may be packed in a structural unit called a 
\verb:module:, amenable to separate treatment. 
Modules have types themselves, called {\sl signatures}. Parametric 
modules are called {\sl functors}. The algorithms presented in this paper 
will make essential use of 
this modular structure, but the syntax ought to be self-evident.
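
For readers who have not met the module language, a minimal sketch of a signature, a module and a functor in standard syntax. All names (ORDERED, MakeMax, IntOrder, IntMax) are hypothetical and unrelated to the modules of the paper.

```ocaml
(* A signature: the type of a module. *)
module type ORDERED = sig
  type t
  val compare : t -> t -> int
end

(* A functor: a module parameterised by another module. *)
module MakeMax (O : ORDERED) = struct
  let max x y = if O.compare x y >= 0 then x else y
end

(* A module packing a type declaration and a value declaration. *)
module IntOrder = struct
  type t = int
  let compare (x : int) (y : int) = compare x y
end

(* Applying the functor yields a new module. *)
module IntMax = MakeMax (IntOrder)
```

IntMax.max then compares integers using the ordering supplied by IntOrder.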

Readers uninterested in computational details may think of ML 
definitions as recursive equations over inductively defined algebras. Most 
of them are simple primitive recursive functionals. 
At this point, note that Pidgin ML has no objects (not even records!) nor

And later on I spill the beans:
Pidgin ML definitions may actually be directly executed as Objective Caml 
programs \cite{ocaml}, under the so-called revised syntax \cite{camlp4}. 

>> By the way, is there any caml-mode for Emacs and the revised syntax ? 
>I don't think so but there is a request for that (Gérard Huet asked 
>me yesterday). I use a very old caml-light-or-what emacs mode and I 
>am too lazy to look at emacs-lisp to create my mode.

This is an important point. I myself use a slightly hacked version of Tuareg 
as an interim solution. I shall have a look at otags, which, being built 
with camlp4 support, ought to be parametrizable (?).

I suggest this syntax problem should be seriously considered, but as a 
long-term effort, encompassing development tools and documentation and 
training material. This is not a battle that can be won by one round of 
email flame.

