Installing parsers and printers

Pierre Weis (weis@pauillac.inria.fr)
Tue, 22 Mar 1994 17:20:07 +0100 (MET)

From: Pierre Weis <weis@pauillac.inria.fr>
Message-Id: <9403221620.AA12874@pauillac.inria.fr>
Subject: Installing parsers and printers
To: caml-list@pauillac.inria.fr
Date: Tue, 22 Mar 1994 17:20:07 +0100 (MET)

This is a long answer to John Harrison.
Abstract:
These kind of facilities are partly supported by Caml Light and completely
available in Caml V3.1. Moreover the development of this kind of
object languages features has been one of the great speciality of our
Caml team (and it is still an active research area today).

> o I'd like to be able to install a custom printer to display objects of a
> nominated type (in the toplevel loop). I have a logic encoded as a
> recursive type. Of course this is unreadable in its raw form, and it's
> very tedious to have to call a prettyprinter explicitly. Ideally I'd
> like to be able to do something like:
>
> #install_printer "term" "term_printer";;
>
> (where `term_printer' is an ML function I have defined), and then have all
> objects of type `term' displayed using that printer. If something this clean
> is not possible I'm willing to dive into C if somebody can advise me on how
> to interface to the system.

This possibility is available in both Caml V3.1 and Caml Light.
In Caml Light the directive has almost the same syntax as yours:

new_printer "term" term_printer;;

In Caml V3.1 you just type in:

#printer term_printer;;

the system is able to find out the type of term_printer by itself...

> #install_parser "<<" ">>" "term_parser";;

This is available in Caml V3.1 with the grammar feature, it's called
``set default grammar'' and is used like in this session:

We start by setting the default user grammar to the predefined parser
of Caml for expressions (hence the Caml:Expr annotation, meaning the entry
point named Expr, in the grammar named Caml):

CAML (decstation) (V3.1.2) by INRIA Thu Feb 4

##set default grammar Caml:Expr;;
Default grammar is now Caml:Expr
Pragma () : unit

Now we can parse Caml expressions either directly using the << >> notation:

#<<let x=1 in x + 1>>;;
<:Caml:Expr< let x = 1 in x+1 >> : ML

or by calling directly the proper entry point for the grammar:

#<:Caml:Expr< let x = 1 in x + 1>>;;
<:Caml:Expr< let x = 1 in x+1 >> : ML

As you may have notice the toplevel has already a default
pretty-printer for values of type ML (the type of Caml expressions
abstract syntax trees). If we turn off this pretty-printer and reset
the default way of printing, we can see the real abstract syntax tree of
pieces of Caml concrete syntax:

##default printer for type ML;;
() : unit

Now:

#<< let x = 1 in x + 1>>;;
(MLin
((MLlet ((MLvarpat "x"),(MLconst (mlint 1)))),
(MLapply ((MLvar "+"),(MLpair ((MLvar "x"),(MLconst (mlint 1)))),[])))) :
ML

We can define another pretty-printer for values of the type ML. For
instance ``print_caml_expr'' which is predefined too and which does'nt
print the french brackets << and >>:

#print_caml_expr;;
<fun> : ML -> unit

#let s = << let rec x = 1 in z - t>>;;
Value s is
(MLin
((MLrec (MLlet ((MLvarpat "x"),(MLconst (mlint 1))))),
(MLapply ((MLvar "-"),(MLpair ((MLvar "z"),(MLvar "t"))),[])))) :
ML

##printer print_caml_expr;;
New printer defined for type: ML
() : unit

#s;;
let rec x = 1 in z-t : ML

Just for fun, let use another predefined grammar in Caml, to get
escapes. We reuse the AST bound to s inside another AST:

#<:CAML<let z = 1 in #s>>;;
let z = 1 in let rec x = 1 in z-t : ML

And to end with, let's obtain the Caml syntax of a Caml type, using
two predefined grammars: the gtype grammar of concrete representation of types
and the Caml grammar (we now compose grammars)...

#let ast_of_int_arrow_int = <:Caml < <:gtype< int -> int>> >>;;

Remember that this is really Caml abstract syntax:
##default printer for type ML;;
() : unit

#ast_of_int_arrow_int;;
(MLapply
((MLvar "Gconsttype"),
(MLpair
((MLconst (mlsystyp "->")),
(MLlist
((MLapply
((MLvar "Gconsttype"),
(MLpair ((MLconst (mlsystyp "int")),(MLvar ""))),[])),
[(MLapply
((MLvar "Gconsttype"),
(MLpair ((MLconst (mlsystyp "int")),(MLvar ""))),[]))])))),
[])) :
ML

I admit that this is a bit complex to understand, but notice how the
concrete notation is compact, compared to what we ought to enter if
this feature were not available.

>It is interesting to note that SML/NJ originally had no such facilities, but
>they have been added by Konrad Slind -- see his paper in the Second ML
>Workshop: "Object Language Embedding in Standard ML of New Jersey".

It is even more interesting to notice that a Yacc interface has been
added to Caml in 1984 by Philippe Lechenadec, and that this interface
allows user's defined grammars inside quotations " " (this has been
abandonned when the double quote symbol has been used for strings!).
Moreover the grammar feature as presented above was implemented in Caml by
Michel Mauny during the years 1987-1988...

Now this have been abandoned in Caml Light for two main reasons: first
the implementation had to remain small, and these features are
complex; second this was mainly considered as a toplevel feature, and
not very important compared to the development of a good compiler
featuring an easy separate compilation facility.

Nowadays, Michel Mauny and Daniel de Rauglaudre are working on a new
version of Caml Light featuring concrete language features. This may
be available some day ...

Pierre Weis