The pretty-printing facility provided by the format module is used to get a fancy display for printing routines. This module provides a ``pretty-printing engine'' that is intended to break lines in a nice way (let's say ``automatically when it is necessary'').
Table of contents:
printf
function
Breaking of lines is based on 2 concepts:
There are 4 types of boxes. (The most often used is the ``hov'' box type, so skip the rest at first reading).
open_box
procedure.
Let me give an example. Suppose we can write 10 chars before the right
margin (that indicates no more room). We represent any char as -
,
[
and ]
indicates opening and closing of
box and b
stands for a break hint found in the output by
the pretty-printing engine.
The output "--b--b--" is displayed like this (the b symbol stands for the value of the break that is explained below):
--b--b--
--b --b --
If there is enough room to print the box on the line:
--b--b--
But "---b---b---" that cannot fit on the line is written
---b ---b ---
If there is enough room to print the box on the line:
--b--b--
But if "---b---b---" cannot fit on the line, it is written as
---b---b ---
The first break does not lead to a new line, since there is enough room on the line. The second one leads to a new line since there is no more room to print the material following it. If the room left on the line were even shorter, the first break hint may lead to a new line and "---b---b---" is written as:
---b ---b ---
Break hints are also used to output spaces (if the line is not split
when the break is encountered, otherwise the new line indicates
properly the separation between printing items).
You output a break hint using print_break sp indent
, and
this sp
integer is used to print ``sp'' spaces. Thus
print_break sp ...
may be thought as: print
sp
spaces or output a new line.
For instance, if b is break 1 0
in the output
"--b--b--", we get
-- -- --
-- -- --
-- -- --
or (according to the remaining room on the line)
-- -- --
Generally speaking, a printing routine using "format", should not
directly output white spaces: the routine should use break hints instead.
(For instance print_space ()
that is a convenient abbrev for
print_break 1 0
and outputs a single space or break the line.)
The user gets 2 ways to fix the indentation of new lines:
open_hovbox 1
opens a ``hov'' box with new lines
indented 1 more than the initial indentation of the box. ---[--b--b --b--with
open_hovbox 2
, we get
---[--b--b --b--Note: the
[
sign in the display is actually not visible on the
screen, it is just there to materialize the aperture of the
pretty-printing box. Last ``screen'' stands for:
-----b--b --b--
print_break sp indent
. The
indent
integer is used to fix the indentation of the new
line. Namely it is added to the default indentation offset of the box
where the break occurs. [
stands for the opening of a ``hov'' box
with 1 as extra indentation (as
obtained by open_hovbox 1
), and b is print_break 1
2
, then from output "---[--b--b--b--", we get:
---[-- -- -- --
print_cut
, print_as
,
force_newline
, print_flush
,printf
.
The ``hov'' box type is refined into two categories.
The difference between a packing and a structural ``hov'' box is shown
by a routine that closes boxes and parens at the end of printing:
with packing boxes, the closure of boxes and parens do not lead to new
lines if there is enough room on the line, whereas with structural
boxes each break hint will lead to a new line. For instance, when
printing "[(---[(----[(---b)]b)]b)]", where "b" is a break hint
without extra indentation (print_cut ()
).
If "[" means opening of a packing ``hov'' box (open_hovbox),
"[(---[(----[(---b)]b)]b)]" is printed as follows:
(--- (---- (---)))If we replace the packing boxes by structural boxes (open_box), each break hint that precedes a closing paren can show the boxes structure, if it leads to a new line; hence "[(---[(----[(---b)]b)]b)]" is printed like this:
(--- (---- (--- ) ) )
When writing a pretty-printing routine, follow these simple rules:
open_*
and close_box
must be nested like parens).
print_space ()
) unless you explicitly don't want the
line to be broken here. For instance, imagine you want to pretty print
a definition in a Caml like language, for instance let rec ident
= expression
. You will probably treat the first three spaces as
``unbreakable spaces'' and write them directly in the string constants
for keywords, "let rec "
before the identifier and
" ="
after it. However, the space preceding the
expression will certainly be a break hint, since breaking the line
after the =
Don't try to force line breaking, let the pretty-printer do it
for you: that's its only job.
print_newline ()
call,
that flushes the pretty-printer tables (hence the output). (Note that
the toplevel loop of the interactive system does it as well, just
before a new input.)
printf
functionThe format
module provides a general printing facility
``à la'' printf
. In addition to the usual conversion
facility provided by printf
, you can write
pretty-printing indications directly into the string format (opening
and closing boxes, indicating breaking hints, etc).
Pretty-printing annotations are introduced by the @
symbol, directly into the string format. Almost any function of the
format
module can be called from within a
printf
format. For instance
@[
'' open a box (open_box 0
). You may
precise the type as an extra argument. For instance @[<hov
n>
is equivalent to
open_hovbox n
.
@]
'' close the last open box (close_box ()
).
@
'' output a breakable space (print_space ()
).
@,
'' output a simple break hint (print_cut
()
).
@.
'' end the pretty-printing, closing all the boxes
still opened (print_newline ()
).
@;<n m>
'' emit a ``full'' break hint
(print_break n m
). If the ``<n m>
''
part is omitted the full break defaults to a simple break hint (@,
).
@?
'' output pending material in the pretty-printer queue (print_flush ()
).
printf "@[<1>%s@ =@ %d@ %s@]@." "Prix TTC" 100 "Euros";; Prix TTC = 100 Euros - : unit = ()
A more realistic example is given below.
Let me give a full example: the shortest non trivial example you could imagine, that is the $\lambda-$calculus :)
Thus the problem is to pretty-print the values of a concrete data type that implements a model of a language of expressions that defines functions and their applications to arguments.
First, I give the abstract syntax of lambda-terms, then a lexical analyzer and a parser for this language:
type lambda = | Lambda of string * lambda | Var of string | Apply of lambda * lambda;; (* The lexer using the genlex module from standard library *) #open "genlex";; let lexer = make_lexer ["."; "\\"; "("; ")"];; (* The syntax analyzer, using streams *) let rec exp0 = function | [< 'Ident s >] -> Var s | [< 'Kwd "("; lambda lam; 'Kwd ")" >] -> lam and app = function | [< exp0 e; (other_applications e) lam >] -> lam and other_applications f = function | [< exp0 arg; stream >] -> other_applications (Apply (f, arg)) stream | [<>] -> f and lambda = function | [< 'Kwd "\\"; 'Ident s; 'Kwd "."; lambda lam >] -> Lambda (s, lam) | [< app e >] -> e;;
Let's try this parser with the interactive toplevel loop:
#let parse_lambda s = lambda (lexer (stream_of_string s));; parse_lambda : string -> lambda = <fun> #parse_lambda "(\x.x)";; - : lambda = Lambda ("x", Var "x")
Now, I use the format library to print the lambda-terms: I follow the recursive shape of the preceding parser to write the pretty-printer, inserting here and there the desired break hints and opening (and closing) boxes:
#open "format";; let ident = print_string;; let kwd = print_string;; let rec print_exp0 = function | Var s -> ident s | lam -> open_hovbox 1; kwd "("; print_lambda lam; kwd ")"; close_box () and print_app = function | e -> open_hovbox 2; print_other_applications e; close_box () and print_other_applications f = match f with | Apply (f, arg) -> print_app f; print_space (); print_exp0 arg | f -> print_exp0 f and print_lambda = function | Lambda (s, lam) -> open_hovbox 1; kwd "\\"; ident s; kwd "."; print_space(); print_lambda lam; close_box() | e -> print_app e;;
Now we get:
print_lambda (parse_lambda "(\x.x)");; \x. x- : unit = ()
(Note that parens are handled properly by the pretty-printer that prints the minimum number of parens compatible with a proper input back to the parser.)
print_lambda (parse_lambda "(x y) z");; x y z- : unit = () print_lambda (parse_lambda "x y z");; x y z- : unit = ()
If you use this pretty-printer for debugging purpose with the
toplevel, declare it with install_printer
, such that the Caml
toplevel loop will use it to print values of type lambda:
install_printer "print_lambda";; - : unit = () parse_lambda "(\x. (\y. x y))";; - : lambda = \x. \y. x y parse_lambda "((\x. (\y. x y)) (\z.z))";; - : lambda = (\x. \y. x y) (\z. z)
This works very fine in conjunction with the trace facility of the interactive system (well in fact, as soon as values manipulated by the programs are a bit complex, I consider the definition of a pretty-printer using format as mandatory to get a readable trace output):
trace"lambda";; La fonction lambda est dorénavant tracée. - : unit = () parse_lambda "((\ident. (\other_ident. ident other_ident)) \ (\Bar.Bar Bar)) (\foobar. (foobar foobar) foobar)";; lambda <-- <abstr> lambda <-- <abstr> lambda <-- <abstr> lambda <-- <abstr> lambda <-- <abstr> lambda <-- <abstr> lambda --> ident other_ident lambda --> \other_ident. ident other_ident lambda --> \other_ident. ident other_ident lambda --> \ident. \other_ident. ident other_ident lambda <-- <abstr> lambda <-- <abstr> lambda --> Bar Bar lambda --> \Bar. Bar Bar lambda --> (\ident. \other_ident. ident other_ident) (\Bar. Bar Bar) lambda <-- <abstr> lambda <-- <abstr> lambda <-- <abstr> lambda --> foobar foobar lambda --> foobar foobar foobar lambda --> \foobar. foobar foobar foobar lambda --> (\ident. \other_ident. ident other_ident) (\Bar. Bar Bar) (\foobar. foobar foobar foobar) - : lambda = (\ident. \other_ident. ident other_ident) (\Bar. Bar Bar) (\foobar. foobar foobar foobar)
printf
functionWe use the fprintf
function and the pretty-printing
functions get an extra argument, namely a pretty-printing formatter
(the ppf
argument) where printing will occur. This way
the printing routines are a bit more general, since they may print on
any formatter defined in the program, and furthermore they may be used
in conjunction with the special %a
format, that prints a
printf
argument with a supplied user's defined function
(these user's functions must have a formatter as first argument).
For instance
fprintf ppf "(%a)" pr_lambda lamprints the
lam
argument, using the
pr_lambda
function (and we must have pr_lambda :
formatter -> lambda -> unit
).
Using printf
formats, the
lambda-terms printing routines can be written as follows:
#open "format";; let ident ppf s = fprintf ppf "%s" s;; let kwd ppf s = fprintf ppf "%s" s;; let rec pr_exp0 ppf = function | Var s -> ident ppf s | lam -> fprintf ppf "@[<1>(%a)@]" pr_lambda lam and pr_app ppf = function | e -> fprintf ppf "@[<2>%a@]" pr_other_applications e and pr_other_applications ppf f = match f with | Apply (f, arg) -> fprintf ppf "%a@ %a" pr_app f pr_exp0 arg | f -> pr_exp0 ppf f and pr_lambda ppf = function | Lambda (s, lam) -> fprintf ppf "@[<1>%a%a%a@ %a@]" kwd "\\" ident s kwd "." pr_lambda lam | e -> pr_app ppf e;; let print_lambda = pr_lambda std_formatter;;We get:
print_lambda (parse_lambda "(\x.x)");; \x. x- : unit = ()
Contact the author Pierre.Weis@inria.fr