ANN: Generic print function
Date: 2006-04-17 (02:09)
From: oleg@p...
Subject: ANN: Generic print function

The facility that prints results and types of expressions evaluated at
the top-level is now available anywhere in the program -- in bytecode-
or natively compiled programs. Generic printing is a (perhaps
unintentional) side effect of MetaOCaml: a code value is not merely
an AST; it also captures the type and the type environment of
variables and other values. Generic printing is a
library that works with the unmodified MetaOCaml (which is _fully_
compatible with the regular OCaml).


First of all, there is the (arguably small) matter of print_int,
print_char, etc.  Most of the time the typechecker knows exactly what
the type of the data to print is. Why should we have to spell it out in
the suffix of 'print' (or as a format specifier in Printf)?

This small annoyance gets bigger if we deal with a complex data
structure, such as a list of records whose elements are variants and
arrays. There is no built-in print_xxx function for it: we _have to_
write our own. What is annoying is that OCaml knows darn well how to
print the structure, if the structure is the result of an expression
evaluated at top level. Alas, such printing is _not_ available in
standalone programs, or if we want to use the print function somewhere
in the code, where the structure is produced as an intermediate
result. Such printing is a useful debugging aid. We may also want to
use the printing facility to write out the data structure, in a
human-readable form, into various files. The top-level output is quite
pretty and is useful beyond the top level.
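
To make the annoyance concrete, here is a minimal sketch, in plain
OCaml (no MetaOCaml), of what one has to write by hand to print a single
value of the type (int * bool) array option in the top-level style. The
names pp_pair, pp_array, pp_option and show_value are made up for this
illustration; they are not part of any library.

```ocaml
(* Hand-written printers for one specific type:
   (int * bool) array option.  Illustrative names only. *)

(* Print a pair such as (10, true). *)
let pp_pair ppf (n, b) = Format.fprintf ppf "(%d, %B)" n b

(* Print an array as [|e1; e2; ...|], given a printer for elements. *)
let pp_array pp_elt ppf arr =
  Format.fprintf ppf "[|";
  Array.iteri
    (fun i x ->
      if i > 0 then Format.fprintf ppf "; ";
      pp_elt ppf x)
    arr;
  Format.fprintf ppf "|]"

(* Print an option, given a printer for the payload. *)
let pp_option pp_elt ppf = function
  | None -> Format.fprintf ppf "None"
  | Some x -> Format.fprintf ppf "Some %a" pp_elt x

(* Compose the printers by hand, following the shape of the type. *)
let show_value v =
  Format.asprintf "%a" (pp_option (pp_array pp_pair)) v

let () = print_endline (show_value (Some [| (10, true); (11, false) |]))
(* prints: Some [|(10, true); (11, false)|] *)
```

Each new record or variant type needs another such function, and the
composition in show_value merely restates a type that the typechecker
already knows.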


The core function is
    val fprint : Format.formatter -> ('a,'b) code -> string 
which takes a code value of any type, and pretty-prints it on the
given formatter. The printed result is exactly the same as that by
the top-level value printing. The function [fprint] returns the
representation of the expression's type, as a string. The latter
is arguably a frill, but it was easy to do, so just as well.

For example (here [print] is presumably [fprint] applied to the
standard formatter),

    let pr_type et = Format.printf "\n%s@." et

    let () = 
      let x = Some ([|(10,true);(11,false)|]) in 
      pr_type (print .<x>.)

prints the following two lines:

       Some [|(10, true); (11, false)|]
       (int * bool) array option

The first line is the value, and the second (printed by pr_type)
is the type. There was no need to define any custom printer for the
value or its components. A more involved example appears later in
this message.
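
Because [fprint] takes an arbitrary Format.formatter, its output need
not go to standard output. The sketch below shows the idea in plain
OCaml with a hypothetical stand-in, fprint_int_list, which has the same
shape as [fprint] (print on a formatter, return the type as a string)
but works for int lists only. The stand-in and the helper to_string are
assumptions made for illustration; only the Format and Buffer calls are
standard.

```ocaml
(* Hypothetical stand-in with the same shape as fprint, for one fixed
   type: prints an int list on ppf and returns the type as a string. *)
let fprint_int_list ppf (xs : int list) : string =
  Format.fprintf ppf "[%s]" (String.concat "; " (List.map string_of_int xs));
  "int list"

(* Direct the output into a string buffer instead of standard output. *)
let to_string xs =
  let buf = Buffer.create 64 in
  let ppf = Format.formatter_of_buffer buf in
  let ty = fprint_int_list ppf xs in
  Format.pp_print_flush ppf ();   (* make sure pending output reaches buf *)
  (Buffer.contents buf, ty)

let () =
  let (s, ty) = to_string [1; 2; 3] in
  print_endline s;    (* prints: [1; 2; 3] *)
  print_endline ty    (* prints: int list *)
```

A formatter made with Format.formatter_of_out_channel would send the
same output to a file.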


The included Makefile builds bytecode and native gprint libraries,
and runs the validation test -- at the top-level
(no need to compile any library), in a byte-code executable,
and a native code executable.

The implementation depends on the unmodified MetaOCaml.  To
compile the library, the MetaOCaml distribution is required. The
implementation is surprisingly simple and can be easily integrated
with MetaOCaml.


MetaOCaml lets us manipulate pieces of code as values. Whereas 1 is an
int value, .<1>. is a code value, of the type ('a,int) code. MetaOCaml
can print those code values:

     # let x = 1 in Trx.npc .<x>.;;
     # let x = 'a' in Trx.npc .<x>.;;
So far, so good. However,

     # let x = "a" in Trx.npc .<x>.;;
    .<(* cross-stage persistent value (as id: x) *)>.
And here we hit the snag. If we use a different printing function,
     # let x = "a" in Trx.printcode .<x>.;;
     expression ([0,0+-1]..[0,0+-1]) ghost
     Pexp_cspval <compiled_code> (as id: "x")

we see that the code value is internally an AST, Parsetree.expression.
We also see that aside from a few simple cases, MetaOCaml does not
inline the values from the captured variables; rather, MetaOCaml
incorporates references to such variables (so-called cross-stage
persistent (CSP) variables).

We need a second observation: the code value is intended to be
evaluated (i.e., `run'). The compilation of a code value generally
requires its type. For example, to compile [match x with Foo -> ...]
we need to know the type of [x]. In particular, we need to know if
[Foo] is the only variant. If so, the above match is exhaustive and we
do not need to compile the default case: [_ -> raise Match_failure].
Therefore, when MetaOCaml captures the reference to a CSP variable, it
has to, in general, capture the type as well. And it does, in a
special AST node, which contains the corresponding
Typedtree.expression. The latter includes the type and the
type environment with declarations, etc. These are
exactly the data that the top-level's generic print function needs.

The common, and correct, reply to the frequently asked question of
why OCaml does not have generic print is as follows: printing a value
requires the knowledge of its type. Indeed, a machine integer '1' may
represent, inter alia, both the integer 0 and the boolean 'false'. The
type information is not preserved in the compiled code. Fortunately,
MetaOCaml's code values do preserve the necessary type information.
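
The claim about representations can be checked in plain OCaml with the
unsafe Obj module. This is a minimal sketch for illustration only: the
helper immediate_as_int is a made-up name, and Obj exposes
implementation details that ordinary programs should not rely on. The
helper reinterprets an unboxed value as an OCaml int; the OCaml int 0 is
the tagged machine word 1 mentioned above.

```ocaml
(* Reinterpret an immediate (unboxed) run-time value as an OCaml int.
   Obj is unsafe and implementation-specific; illustration only. *)
let immediate_as_int (v : 'a) : int =
  let r = Obj.repr v in
  if Obj.is_int r then (Obj.obj r : int)
  else invalid_arg "immediate_as_int: boxed value"

let () =
  (* Values of four different types share one representation. *)
  Printf.printf "%d %d %d %d\n"
    (immediate_as_int 0)
    (immediate_as_int false)
    (immediate_as_int ())
    (immediate_as_int None)
  (* prints: 0 0 0 0 *)
```

Given only the run-time word, a printer cannot tell whether to write 0,
false, () or None; it needs the type.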


Let us first define the following complex data type:

module C = struct
  type 'a color = Blue | Green | Rgb of 'a
end;;

type 'a image = {title : string; pixels : 'a C.color array};;
type big = int image list;;

let v = [
  {title = "im1";
   pixels = [| C.Blue; C.Rgb 10 |]};
  {title = "im2";
   pixels = [| C.Green |]};
];;

The following expression
    let () = pr_type (print .<v>.)
prints exactly what we expect.

Before continuing the example, we should note a drawback of the current
lack of integration of the generic print facility with MetaOCaml.  When
doing [print .<x>.] where x is of a variant type and its current value
is a constant constructor (e.g., None), we see the output 
'(* cross-stage persistent value (as id: x) *)'. This is a drawback of
some optimizations in MetaOCaml, and will be fixed if this code is
integrated into MetaOCaml. Fortunately, there is an easy workaround:
replace [print .<x>.] with [print (let z = [x] in .<z>.)].

We now continue the example:

open C
let some_processing ims =
  (* brighten a single pixel, printing it before and after *)
  let brighten px =
    let new_px = match px with
      | Blue  -> Green
      | Green -> Rgb 10
      | Rgb x -> Rgb (x+1) in
    (* the pixels are wrapped in a list: the workaround noted above *)
    let () = Format.printf "@.pixel: %a -> %a @."
        (fun ppf v -> ignore (fprint ppf v))
        (let x = [px] in .<x>.)
        (fun ppf v -> ignore (fprint ppf v))
        (let x = [new_px] in .<x>.) in
    new_px in
  let process im =
    let () = Format.printf "Processing: " in
    let _  = print .<im>. in
    {im with pixels = Array.map brighten im.pixels} in
  let res = List.map process ims in
  let _ = print .<res>. in
  Format.printf "@."

let () = some_processing v;;

The list of images, an image itself, and a single pixel were all
printed generically. We did not have to define any custom
printers. Here's the output:

Processing: {title = "im1"; pixels = [|Blue; Rgb 10|]}
pixel: [Blue] -> [Green] 

pixel: [Rgb 10] -> [Rgb 11] 
Processing: {title = "im2"; pixels = [|Green|]}
pixel: [Green] -> [Rgb 10] 
[{title = "im1"; pixels = [|Green; Rgb 11|]};
 {title = "im2"; pixels = [|Rgb 10|]}]


The function print is generic but not polymorphic. For example, if we define
     let pr x = print .<x>.; x

and invoke the function as "pr [10]", we see the printed output
"<poly>". The function 'pr' has the type 'a->'a -- that is, it
promises to take a value of any type, regardless of its
structure. The function does not even need to know the exact
type of its argument, because it is irrelevant. Informally, an OCaml
function of the type ['a-> ...]  corresponds to the Haskell function
[a -> ...]. On the other hand, an OCaml function of the type
[('a,'b) code -> ...] corresponds to Haskell's [Typeable b => b -> ...].
The latter enables generic programming.
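
The difference between the two kinds of function can be sketched in
plain OCaml by passing an explicit description of the type alongside
the value, which is roughly the role Typeable plays in Haskell. This is
an analogy only, not how the gprint library works: the type [ty] and
the functions [show] and [opaque] below are made up for illustration,
and the GADT syntax needs a much more recent OCaml than the one
MetaOCaml was based on at the time.

```ocaml
(* A run-time description of a type, indexed by the type it describes. *)
type _ ty =
  | Int : int ty
  | Bool : bool ty
  | List : 'a ty -> 'a list ty
  | Pair : 'a ty * 'b ty -> ('a * 'b) ty

(* Generic printer: the type description drives the traversal. *)
let rec show : type a. a ty -> a -> string = fun ty v ->
  match ty with
  | Int -> string_of_int v
  | Bool -> string_of_bool v
  | List elt -> "[" ^ String.concat "; " (List.map (show elt) v) ^ "]"
  | Pair (ta, tb) ->
      let (a, b) = v in
      "(" ^ show ta a ^ ", " ^ show tb b ^ ")"

(* A fully polymorphic function has no description to consult, so it
   cannot look inside its argument. *)
let opaque (_ : 'a) : string = "<poly>"

let () =
  print_endline (show (List (Pair (Int, Bool))) [(10, true); (11, false)]);
  (* prints: [(10, true); (11, false)] *)
  print_endline (opaque [10])
  (* prints: <poly> *)
```

Here the caller must build the description by hand; with
[print .<x>.] the code value already carries the equivalent
information.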