Mantis Bug Tracker

View Issue Details Jump to Notes ] Issue History ] Print ]
IDProjectCategoryView StatusDate SubmittedLast Update
0006452OCamlOCaml standard librarypublic2014-06-05 16:032015-10-07 17:23
Assigned To 
PrioritynormalSeverityfeatureReproducibilityhave not tried
PlatformOSOS Version
Product Version 
Target Version4.03.0+devFixed in Version 
Summary0006452: custom format
DescriptionIt would be good if we could put custom functions into a format. There doesn't need to be a syntax for it, this could be the job of a ppx.

I'm thinking of something like this:

  | Custom :
      ('b -> 'x -> 'c) * ('a, 'b, 'c, 'd, 'e, 'f) fmt ->
        ('x -> 'a, 'b, 'c, 'd, 'e, 'f) fmt

This would allow to write things like in a much cleaner way. At the moment we modify the call site to stuff the conversion functions in the list of arguments: it's not pretty and becoming more and more complex. With the suggested modification we could simply 'map' the format and stuff the conversion functions in it.

I don't have a use case but it might also be interesting to allow custom functions to take 0 or more than 1 argument, here is a more generic proposal:

  | Custom :
     ('x, 'y, 'a, 'c) custom_arity * ('b -> 'y) * ('a, 'b, 'c, 'd, 'e, 'f) fmt ->
       ('x, 'b, 'c, 'd, 'e, 'f) fmt

  type (_, _, _, _) custom_arity =
    | Zero : ('a, 'b, 'a, 'b) custom
    | Succ : ('a, 'b, 'c, 'd) custom -> ('x -> 'a, 'x -> 'b, 'c, 'd) custom
TagsNo tags attached.
Attached Filesdiff file icon custom-format-string-only.diff [^] (4,043 bytes) 2015-03-10 13:20 [Show Content]
diff file icon patch-ocaml-trunk-improve-Custom-printing-scanning.diff [^] (128,479 bytes) 2015-05-22 02:38 [Show Content]

- Relationships

-  Notes
gasche (developer)
2015-03-08 10:43

The patch proposed in [^]
was just merged in trunk. Thanks.
dim (developer)
2015-03-10 13:23

Added a patch to make custom functions always return a string. To avoid pinning the format to one specific class of printf functions.
dim (developer)
2015-05-18 11:11

Merged in 4.02 (revisions 16124 and 16125).
gasche (developer)
2015-05-18 11:12

Thanks for taking care of the 4.02 merge.
bvaugon (developer)
2015-05-22 02:39

I submitted a new version that complete the custom format
implementation of Jeremie by adding support for formatting and
scanning usages.

In this version, instead of computing a string, a custom node takes a
printer that writes data in a Format.formatter. It is usefull for
streaming data rather than constructing a string in-memory, and for
generating formatting annotations when targeting functions.

In addition, I suggest that a custom node now takes a fourth argument,
a scanner, that is used by functions to scan arbitrary data
and give them to the callback function.

The printers and scanners may be arbitrary functions that manually
manipulate the in/out data-flows. However, for simple use-cases, a
custom node may be simply constructed as follows, by forwarding
printing to Format.fprintf and scanning to Scanf.bscanf:

  let arity = Custom_succ (Custom_succ Custom_zero)

  let printer mode fmt n b = match mode with
    | Called_from_Printf -> Format.fprintf fmt "[%X/%B]" n b
    | Called_from_Format -> Format.fprintf fmt "@[[%X/@,%B]@]" n b

  let scanner ib f = Scanf.bscanf ib "[%X/%B]" f

  let fmt = Custom (arity, printer, scanner, End_of_format)

The first argument, "arity", is an integer in a à la Peano
representation from range [ 0; +inf [ (or from range [ 0; +inf ] using
-rectypes (ok, forget it ^^)). It defines the number of values
expected by the printer and the number of values scanned by the
scanner and given to the callback function.

The second argument is the printer, it is called with a mode, a
Format.formatter and [arity] arguments; only when all arguments have
been passed to the printing function. It is then robust to partial
calls of printing functions. The [mode] is usefull to provide printers
that change their behaviour depending on the calling context. Such a
printer should provide formatting annotations when it is called by a
Format.[x]printf function, and not when it is called by a
Printf.[x]printf function.

The scanning function takes as arguments a Scanf.Scanning.in_channel
and a callback function [f]. After scanning (or computing) the [arity]
values from the in_channel, the scanner have to call [f] with the
[arity] values as argument and to return the value returned by [f].

The last argument is the rest of the format.

As the Jeremie one, this implementation is of course type safe (no

This patch however has some drawbacks :
 - it is much bigger, in particular because I needed to move
   some type declarations and printing primitives from Format and Scanf
   modules to CamlinternalFormat and CamlinternalFormatBasics modules.
 - for basic custom printers, writting data through a formatter is a bit
   slower than constructing a string and directly pushing it
   in the Buffer.t or the output channel as Jeremie does.
gasche (developer)
2015-05-22 21:47
edited on: 2015-05-22 21:47

(As we already discussed together,) I am unconvinced by the idea of adding a Format dependency to the format-string logic and implementation. I think that Printf and Scanf are reasonably general in their interface, but there are many other pretty-printing libraries than Format in the OCaml world that make different assumptions.

(Of course the dependency is already there to some degree, in the format of the dedicated formatting converters in format strings (eg. "[<h 3>...]"), but I think a hard dependency on the internal type of Format buffers is qualitatively different.)

(One possibility would be to remove the Format support of this patch, but keep a mode argument with only one constructor calling_from_Printf, keeping the door open for other modes in the future.)

Jérémie, would you be interested in either the Format or the Scanf support?

I'm reluctant to consider a change as invasive as the proposed patch just weeks before a minor release. Is there a strong consensus among the people that care about format strings?

bvaugon (developer)
2015-05-22 22:40

Yes, but remarks that having the support of Format features in Custom
nodes does not impose to use them and "leave the door opened" to use
any extern custom formatting functions. It is just a functionnality,
that allow programmers to use uniformly the different printing and
scanning functions offered by the OCaml stdlib.

Moreover, there is no new dependencies to the Format module nor
between stdlib modules. Furthermore, before this patch, there were
already dependencies from CamlinternalFormatBasics to format-specific
constructions before this patch (I think about the types "block_type",
"formatting_lit", and "formatting_gen"). Instead, this patch gathers
type declarations that were previously spread in multiple modules.

But you are right, this patch is huge (not really because it consits
in deep modifications, but because it moves large units of code and
type declarations), and it might be normal to take more time to
deliberate about it if the OCaml release process impose it.
gasche (developer)
2015-05-24 10:44

I think that there is something fundamentally problematic with importing a complete pretty-printing library logic into a module that currently only handle formatting strings.

There are some weird things with the patch as proposed, for example:
- the From_file constructor for Scanf scanning buffers apparently changed his signature (why?)
- some code that used a Buffer before now uses a Bytes with home-duplicated regrowing logic
- a lot of code moves from buffer_foo to; this makes the patch harder to review for no good reason, I think (but it would be fine as a separate patch, along with some justification of why this part of the code may or may not depend on Buffer)
bvaugon (developer)
2015-05-24 11:12

The reason is that the Scanning.in_channel has been moved in
CamlinternalFormatBasics, which is loaded before Pervasives, it can
now no longer use the OCaml stdlib.

The old "in_channel_name" (declaring the constructor From_file) made
multiple references to the type Pervasives.in_channel (that was also
not very logic) which were only used to close the channel. So, I
transformed it in a proper generic function "close" stored in a field
of the Scanning.in_channel record. This modification removed the same
time the dependance to Pervasives.in_channel.

A Buffer.t was also mentionned in the type Scanning.in_channel and for
the same reason, I suppressed the reference to Buffer.t and changed it
to a two fields (bytes, int). Independently, when I changed this
reference to Buffer.t, I saw a partial implementation of buffers in
the CamlinternalFormat which had nothing to do there because the
module CamlinternalFormat, contrary to CamlinternalFormatBasics,
already depends on the module Buffer. That's why I cleaned this code
by removing the partial implementation of buffers from
CamlinternalFormat and replace it with the standard Buffer module.

Since these modifications only concern internals of the stdlib (I
meen, is hidden behind abstract types or masked inside camlinternalXXX
modules), it does'nt break compatibility.
dim (developer)
2015-05-25 10:49

> Jérémie, would you be interested in either the Format or the Scanf support?

If Format was supported, for the !"%{sexp:type}" syntax custom_printf would use [Sexp.pp] when called from Format. Same for Scanf, !"{sexp:type}" could allow to parse a S-expression.

But I don't think there is a high demand for it, at Jane Street for instance we use custom_printf a lot more than Format or Scanf.
gasche (developer)
2015-05-25 10:55

I'm thinking of not trying to rush this patch into 4.02, but eventually merge the feature in 4.03. That would mean that the printf-relevant interface would be different between the two major versions. Is that ok for you?
dim (developer)
2015-05-25 11:08

That's OK for me.
gasche (developer)
2015-06-30 12:18

Just a note to whoever may wonder what happen{s,ed} to this patch. I am still interesting in merging improved Custom_format in trunk, but I am still a bit worried by the bulkiness of Benoît's patch and in particular the import of a large quantity of pretty-printing logic deeper in the OCaml distribution's entrails. I would be interested in knowing what the Scanf-relevant part of this patch looks like, but I personally lack the time to play with the current patch in this way.

If someone wished to move this forward, I think a second opinion on "can this proposal be made less invasive?", or maybe even "is there a better compromise in term of expressiveness / invasiveness ratio?" would be helpful.
drup (reporter)
2015-10-06 20:19

I don't have much opinion on the presumed "bulkiness" of the patch (except that moving code around with barely changing it, in a backward compatible manner, is not my definition of "invasive"), however I do have an opinion on the feature.

As I mentioned in ppx_custom_printf's bug tracker[1], composing with `to_string` function is wrong, and not in line with the general spirit of composable pretty printers that is presented in the `Format` module.

Maybe janestreet doesn't care about proper indentation of pretty printed output, but I do, and as much as I'm interested by ppx_custom_printf, It's unusable for me at the moment due to this limitation.

[1]: [^]
dim (developer)
2015-10-07 17:23

Note: since the patch moves a lot of code around, it'd be nice to provide diffing commands, such as:

  patdiff <(git show <some-hash>:stdlib/ | sed -n 12,42p) stdlib/camlinternalFormat.mli

It will make it easier for people to review the changes (once the patch has been rebased to the current trunk and submitted as a pull request on github).

- Issue History
Date Modified Username Field Change
2014-06-05 16:03 dim New Issue
2014-06-19 17:53 gasche Status new => acknowledged
2014-06-19 17:54 gasche Tag Attached: junior_job
2014-09-04 00:25 doligez Target Version 4.02.1+dev => undecided
2015-03-08 10:43 gasche Note Added: 0013408
2015-03-08 10:43 gasche Status acknowledged => resolved
2015-03-08 10:43 gasche Fixed in Version => 4.03.0+dev
2015-03-08 10:43 gasche Resolution open => fixed
2015-03-08 10:43 gasche Assigned To => dim
2015-03-10 13:20 dim File Added: custom-format-string-only.diff
2015-03-10 13:23 dim Note Added: 0013424
2015-05-06 21:13 gasche Assigned To dim => gasche
2015-05-06 21:13 gasche Status resolved => confirmed
2015-05-06 21:58 gasche Target Version undecided => 4.02.2+dev / +rc1
2015-05-07 18:07 doligez Product Version later =>
2015-05-08 23:21 doligez Target Version 4.02.2+dev / +rc1 => 4.03.0+dev
2015-05-11 18:01 doligez Target Version 4.03.0+dev => 4.02.2+dev / +rc1
2015-05-12 19:34 doligez Status confirmed => assigned
2015-05-18 11:11 dim Note Added: 0013945
2015-05-18 11:12 dim Status assigned => resolved
2015-05-18 11:12 dim Fixed in Version 4.03.0+dev => 4.02.2+dev / +rc1
2015-05-18 11:12 gasche Note Added: 0013946
2015-05-22 02:38 bvaugon File Added: patch-ocaml-trunk-improve-Custom-printing-scanning.diff
2015-05-22 02:39 bvaugon Note Added: 0013955
2015-05-22 21:47 gasche Note Added: 0013956
2015-05-22 21:47 gasche Note Edited: 0013956 View Revisions
2015-05-22 22:40 bvaugon Note Added: 0013957
2015-05-24 10:25 gasche Assigned To gasche =>
2015-05-24 10:25 gasche Status resolved => feedback
2015-05-24 10:25 gasche Resolution fixed => reopened
2015-05-24 10:44 gasche Note Added: 0013960
2015-05-24 11:12 bvaugon Note Added: 0013961
2015-05-25 10:49 dim Note Added: 0013964
2015-05-25 10:49 dim Status feedback => new
2015-05-25 10:55 gasche Note Added: 0013965
2015-05-25 11:08 dim Note Added: 0013967
2015-05-25 11:13 gasche Status new => confirmed
2015-05-25 11:13 gasche Fixed in Version 4.02.2+dev / +rc1 =>
2015-05-25 11:13 gasche Target Version 4.02.2+dev / +rc1 => 4.03.0+dev
2015-05-30 10:57 gasche Tag Detached: junior_job
2015-06-30 12:18 gasche Note Added: 0014178
2015-10-06 20:19 drup Note Added: 0014493
2015-10-07 17:23 dim Note Added: 0014494

Copyright © 2000 - 2011 MantisBT Group
Powered by Mantis Bugtracker