Re: pretty-printing facilities

Pierre Weis (weis@pauillac.inria.fr)
Mon, 13 Nov 1995 19:19:44 +0100 (MET)

From: Pierre Weis <weis@pauillac.inria.fr>
Message-Id: <199511131819.TAA17414@pauillac.inria.fr>
Subject: Re: pretty-printing facilities
To: Jocelyn.Serot@lasmea.univ-bpclermont.fr (Jocelyn Serot)
Date: Mon, 13 Nov 1995 19:19:44 +0100 (MET)
In-Reply-To: <199511130810.JAA09346@concorde.inria.fr> from "Jocelyn Serot" at Nov 13, 95 09:11:56 am

> Is there some written and available examples for the use of
> the pretty-printing facilities provided by the format module of the
> caml 7.0 standard library ?..

Well, I wrote twice the documentation for this library (and twice the
library itself !): the first version of the documentation was more
than 20 pages long, and people took it as for too complex. Then I wrote
the documentation for Caml Light which is just a few lines long, and
many people feel it far too scarce: ``en tout l'exce`s est un
defaut''.

So, let's explain a bit more: the pretty-printing facility provided by
the ``format'' module is used to get a fancy display for printing
routines. ``format'' provides a ``pretty-printing engine'' that is
intended to break lines in a nice way (let's say ``automatically
when it is necessary'').

Boxes:
------
Breaking of lines is based on 2 concepts:

* boxes : a box is a logical pretty-printing unit, which defines a
behaviour of the pretty-printing engine.
* breaks : a break is a hint given to the pretty-printing engine, to
tell it where to break lines. Otherwise the pretty-printing engine
never break lines (except ``in case of emergency'' to avoid very bad
output). (In addition, when the pretty printing engine starts a new
line, there are rules to fix the indentation of the new line (the
leading spaces at the beginning of the line).)

There are 4 types of boxes. (The most often used is the last type
(hovbox), so skip the rest at first reading).

* horizontal box (as obtained by the open_hbox procedure): in this box
breaks do not lead to line breaks.
* vertical box (obtained by the open_vbox procedure): every break
hint lead to a new line.
* vertical/horizontal box (as obtained by the open_hvbox procedure): if
it is possible, the entire box is written on a single line; otherwise
every break hint lead to a new line.
* vertical or horizontal box (as obtained by the open_hovbox procedure):
break hints are used to cut the line when there is no more room on the line.

Let's give an example. Suppose we can write 10 chars before the right
margin (that indicates no more room). We represent any char as ``-'',
``['' and ``]'' indicates openning and closing of box and ``b'' stands
for a break hint found in the ouput by the pretty-printing engine.

The output "--b--b--" is displayed like this (the b symbol stands for
the value of the break that is explain below):
* within a hbox:
--b--b--
* within a vbox:
--b
--b
--
* within a hvbox:
--b--b-- if there is enough room to print the box on the line
but "---b---b---" that cannot fit on the line is written
---b
---b
---
* within a hovbox:
--b--b-- if there is enough room to print the box on the line
but if "---b---b---" cannot fit on the line, it is written as
---b---b
---
The first break does not lead to a new line, since there is enough
room on the line. The second one leads to a new line since there is
no more room to print the material following it.
(if the room left on the line were even shorter, the first break hint
may lead to a new line and "---b---b---" is written as:
---b
---b
---
)

Printing spaces:
----------------

Break hints are also used to output spaces (if the line is not split
when the break is encountered, otherwise the new line indicates
properly the separation between printing items).
You output a break hint using print_break(sp, indent), and this sp
integer is used to print ``sp'' spaces. Thus print_break (sp,...) may
be thought as: print sp spaces or output a new line

For instance, if b is ``break (1, 0)'' in the output "--b--b--", we get
* within a hbox:
-- -- --
* within a vbox:
--
--
--
* within a hvbox:
-- -- -- or (according to the remaining room on the line)
--
--
--

* and similarly for hov boxes.

Generally speaking, a printing routine using "format", has not to output
white spaces: the routine should use break hints instead.
(for instance ``print_space ()'' that is a convenient abbrev for
print_break (1, 0) and outputs a single space or break the line.)

Indentation of new lines:
-------------------------
The user gets 2 ways to fix the indentation of new lines:

* when defining the box where it occurs: when opening a box, you may fix the
indentation added to each new line opened within the box.
For instance: ``open_hovbox 1'' opens a hovbox with new lines
indented 1 more than the initial indentation of the box.
With output "---[--b--b--b--", we get:
---[--b--b
--b--
with ``open_hovbox 2'', we get
---[--b--b
--b--
(Note: the [ sign in the display is actually not visible on the
screen, it is just there to materialize the aperture of the
pretty-printing box. Last ``screen'' stands for:
-----b--b
--b--
)
* when defining the break that makes the new line. As said above, you
output a break hint using print_break(sp, indent). The indent integer
is used to fix the indentation of the new line. Namely it is added to
the default indentation offset of the box where the break occurs. For
instance:

For instance, if [ stands for the opening of a hov box 1 (as
obtained by open_hovbox 1), and b is print_break (1,2), then we get from
output "---[--b--b--b--":
---[-- --
--
--

Practice:
---------
When writing a pretty-printing routine, follow the following simple
rules:

1) Boxes must be opened and closed consistently (open_* and close_box must
be nested like parens).
2) Never hesitate to open a box.
3) Output many break hints, otherwise the pretty-printer is in a bad
situation where it tries to do its best, which is always ``worse than
your bad''.
4) Don't try to force line breaking, let the pretty-printer do it for
you: that's its only job.
5) End your main program by a print_newline () call, that flushes the
pretty-printer tables (hence the ouput). (Note that the toplevel loop
of the interactive system does it as well b, just before a new input.)

Example:
--------
Let's give a full example: the shortest non trivial example you could
imagine, that is the $\lambda-$calculus :)

First, I give the abstract syntax of lambda-terms, then a lexical
analyzer and a parser for this language:

type lambda =
Lambda of string * lambda
| Var of string
| Apply of lambda * lambda;;

(* The lexer using the genlex module from standard library *)
#open "genlex";;
let lexer = make_lexer ["."; "\\"; "("; ")"];;

(* The syntax analyzer, using streams *)
let rec exp0 = function
[< 'Ident s >] -> Var s
| [< 'Kwd "("; lambda lam; 'Kwd ")" >] -> lam

and app = function
[< exp0 e; (other_applications e) lam >] -> lam

and other_applications f = function
[< exp0 arg; stream >] ->
other_applications (Apply (f, arg)) stream
| [<>] -> f

and lambda = function
[< 'Kwd "\\"; 'Ident s; 'Kwd "."; lambda lam >] ->
Lambda (s, lam)
| [< app e >] -> e;;

Let's try this parser with the interactive toplevel loop:

parse_lambda "(\x.x)";;
- : lambda = Lambda ("x", Var "x")

Now, I use the format library to print the lambda-terms: I follow the
recursive shape of the preceding parser to write the pretty-printer,
inserting here and there the desired break hints and opening (and
closing) boxes:

#open "format";;

let ident = print_string;;
let kwd = print_string;;

let rec print_exp0 = function
Var s -> ident s
| lam -> open_hovbox 1; kwd "("; print_lambda lam; kwd ")"; close_box ()

and print_app = function
e -> open_hovbox 2; print_other_applications e; close_box ()

and print_other_applications f =
match f with
Apply (f, arg) -> print_app f; print_space (); print_exp0 arg
| f -> print_exp0 f

and print_lambda = function
Lambda (s, lam) ->
open_hovbox 1;
kwd "\\"; ident s; kwd "."; print_space(); print_lambda lam;
close_box()
| e -> print_app e;;

Now we get:
print_lambda (parse_lambda "(\x.x)");;
\x. x- : unit = ()

(Note that parens are handled properly by the pretty-printer that
prints the minimum parens compatible with a proper input back to the parser.)
print_lambda (parse_lambda "(x y) z");;
x y z- : unit = ()

print_lambda (parse_lambda "x y z");;
x y z- : unit = ()

If you use this pretty-printer for debugging purpose using the
toplevel, declare it with ``install_printer'', such that the Caml
toplevel loop will use it to print values of type lambda:

install_printer "print_lambda";;
- : unit = ()

parse_lambda "(\x. (\y. x y))";;
- : lambda = \x. \y. x y

parse_lambda "((\x. (\y. x y)) (\z.z))";;
- : lambda = (\x. \y. x y) (\z. z)

This works very fine in conjonction with the trace facility of the
interactve system (well in fact I consider the definition of a
pretty-printer using format as mandatory to get a readable trace output):

trace"lambda";;
La fonction lambda est dorénavant tracée.
- : unit = ()

parse_lambda
"((\ident. (\autre_ident. ident autre_ident)) \
(\Truc.Truc Truc)) (\machin. (machin machin) machin)";;
lambda <-- <abstr>
lambda <-- <abstr>
lambda <-- <abstr>
lambda <-- <abstr>
lambda <-- <abstr>
lambda <-- <abstr>
lambda --> ident autre_ident
lambda --> \autre_ident. ident autre_ident
lambda --> \autre_ident. ident autre_ident
lambda --> \ident. \autre_ident. ident autre_ident
lambda <-- <abstr>
lambda <-- <abstr>
lambda --> Truc Truc
lambda --> \Truc. Truc Truc
lambda --> (\ident. \autre_ident. ident autre_ident) (\Truc. Truc Truc)
lambda <-- <abstr>
lambda <-- <abstr>
lambda <-- <abstr>
lambda --> machin machin
lambda --> machin machin machin
lambda --> \machin. machin machin machin
lambda --> (\ident. \autre_ident. ident autre_ident) (\Truc. Truc Truc)
(\machin. machin machin machin)
- : lambda =
(\ident. \autre_ident. ident autre_ident) (\Truc. Truc Truc)
(\machin. machin machin machin)

I strongly hope this helps!

Pierre Weis
----------------------------------------------------------------------------
WWW Home Page: http://pauillac.inria.fr/~weis
Projet Cristal
INRIA, BP 105, F-78153 Le Chesnay Cedex (France)
E-mail: Pierre.Weis@inria.fr
Telephone: +33 1 39 63 55 98
Fax: +33 1 39 63 53 30
----------------------------------------------------------------------------