Format: invert breakable and non-breakable spaces #6988

Closed
vicuna opened this issue Sep 11, 2015 · 5 comments


vicuna commented Sep 11, 2015

Original bug ID: 6988
Reporter: dhekir
Assigned to: @alainfrisch
Status: resolved (set by @alainfrisch on 2016-12-08T11:48:35Z)
Resolution: not a bug
Priority: normal
Severity: feature
Version: 4.02.3
Category: standard library
Monitored by: @hcarty

Bug description

In OCaml's Format module, if I understood it correctly, every space is by default non-breakable, and every breakable space needs to be written as "@ ".

This behavior is contrary to every other formatting system I know of, be it LaTeX, HTML (&nbsp;), or WYSIWYG text editors (e.g. in Word, non-breakable spaces need to be inserted using Space plus other keys). The usual logic seems to be one of two things: either you never break lines (as in C), so that grep works reliably and formatting is left to other tools, or you follow the standard typographical rule that every space is breakable unless indicated otherwise (i.e. the most common case prevails).

One practical consequence of the current behavior is that, whenever someone tries to write a formatting string that will break "naturally", it ends up being@ extremely@ hard@ to@ read@ and@ impossible@ to@ grep.
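To make the difference concrete, here is a minimal sketch that renders the same sentence twice at a narrow margin, once with plain spaces and once with "@ " break hints. The helper name render and the margin of 20 are assumptions chosen for illustration; only standard Format functions are used.

```ocaml
(* Hypothetical helper: run a printer against a buffer-backed formatter
   with a narrow margin and return what was printed. *)
let render ?(margin = 20) print =
  let buf = Buffer.create 64 in
  let ppf = Format.formatter_of_buffer buf in
  Format.pp_set_margin ppf margin;
  print ppf;
  Format.pp_print_flush ppf ();
  Buffer.contents buf

(* Plain spaces are ordinary characters: Format never breaks at them. *)
let plain =
  render (fun ppf ->
    Format.fprintf ppf "@[the quick brown fox jumps over the lazy dog@]")

(* "@ " is a break hint: Format may turn it into a newline. *)
let hinted =
  render (fun ppf ->
    Format.fprintf ppf
      "@[the@ quick@ brown@ fox@ jumps@ over@ the@ lazy@ dog@]")

let () =
  print_endline "--- plain spaces ---";
  print_endline plain;
  print_endline "--- break hints ---";
  print_endline hinted
```

The first rendering stays on one line and overflows the margin, while the second wraps; the price is the hard-to-grep format string described above.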

Therefore, people end up not using this fantastic formatting feature, using it incorrectly, or wrapping each string in a function that does the conversion manually.

I would propose adding a formatter flag that would invert the behavior (using "@ " for non-breakable spaces), and then sometime in the future invert the default behavior, deprecating the current one.

Is there a technical reason that prevents this from working? Or is there a design reason why breakable spaces should not be the default?


vicuna commented Sep 11, 2015

Comment author: @gasche

Note that Daniel Bünzli contributed the

Format.pp_print_text : formatter -> string -> unit

function, which is available since 4.02 and interprets any space in the given string as breakable space, and any line break as a forced newline.

This may be the function you want to use.
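As a rough illustration of how pp_print_text can be used, the sketch below wraps a plain string at a narrow margin. The helper name wrap and the margin value are assumptions for the example, not part of the standard library.

```ocaml
(* Hypothetical helper: wrap [text] at [margin] columns using
   Format.pp_print_text, which treats each space in the string as a
   breakable space and each '\n' as a forced newline. *)
let wrap ?(margin = 20) text =
  let buf = Buffer.create 64 in
  let ppf = Format.formatter_of_buffer buf in
  Format.pp_set_margin ppf margin;
  (* The enclosing box lets the break hints wrap the text. *)
  Format.fprintf ppf "@[%a@]" Format.pp_print_text text;
  Format.pp_print_flush ppf ();
  Buffer.contents buf

let sentence = "the quick brown fox jumps over the lazy dog"
let wrapped = wrap sentence

let () = print_endline wrapped
```

The string passed to wrap stays readable and greppable, since it contains no "@ " markers.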


vicuna commented Sep 11, 2015

Comment author: @lpw25

> Or is there a design reason why breakable spaces should not be the default?

I suspect the reason is that Format was probably designed for formatting source code rather than human text.


vicuna commented Sep 14, 2015

Comment author: dhekir

Excellent, thank you for the explanation as well. I had never considered source code printing as a major motivator.

May I suggest adding some remarks to https://ocaml.org/learn/tutorials/format.html and http://caml.inria.fr/pub/docs/manual-ocaml/libref/Format.html concerning those points, or specifically mentioning pp_print_text more prominently? It might be useful for future newcomers. Something like: "if you wish to simply print text with automatic line breaking, consider using pp_print_text".

As a minor aside, I do not see if pp_print_text allows the introduction of non-breakable spaces as in my original suggestion (if so, maybe it could be suggested in the function description), but it is indeed a very useful function!


vicuna commented Sep 14, 2015

Comment author: @gasche

If you were willing to write such a remark yourself, this can be done by editing the following pages:

https://github.com/ocaml/ocaml.org/blob/master/site/learn/tutorials/format.md
https://github.com/ocaml/ocaml/blob/trunk/stdlib/format.mli

pp_print_text does not have specific support for non-breakable spaces (but you can use one of the Unicode non-breakable spaces, such as U+00A0). I think simplicity is an important feature of the function, and I'm not satisfied with the idea of using "@ " as non-breakable-space syntax in some contexts; it would be very confusing.
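A small sketch of the U+00A0 approach, assuming UTF-8 encoded strings (where U+00A0 is the two bytes "\xC2\xA0"). The helpers wrap and contains_sub are hypothetical names introduced for this example.

```ocaml
(* Hypothetical helper: wrap [text] at [margin] columns with pp_print_text. *)
let wrap ?(margin = 12) text =
  let buf = Buffer.create 64 in
  let ppf = Format.formatter_of_buffer buf in
  Format.pp_set_margin ppf margin;
  Format.fprintf ppf "@[%a@]" Format.pp_print_text text;
  Format.pp_print_flush ppf ();
  Buffer.contents buf

(* Hypothetical helper: naive substring search, enough for a demo. *)
let contains_sub s sub =
  let n = String.length s and m = String.length sub in
  let rec go i = i + m <= n && (String.sub s i m = sub || go (i + 1)) in
  go 0

(* An ASCII space between "Figure" and "12" is a legal break point. *)
let regular = wrap "see Figure 12 for details"

(* U+00A0 is not an ASCII space, so pp_print_text never breaks there.
   Format measures width in bytes, so it counts as two columns. *)
let nbsp = wrap "see Figure\xC2\xA012 for details"

let () =
  print_endline regular;
  print_endline "---";
  print_endline nbsp
```

Because pp_print_text only looks for ' ' and '\n', any other character, including U+00A0, is printed as ordinary text and keeps its neighbours on the same line.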


vicuna commented Sep 14, 2015

Comment author: dhekir

I hadn't realized that the entire website was tracked on git. I'll try to make some recommendations and submit a merge request.

For non-breaking spaces, U+00A0 (Compose + Space on my Linux system) indeed works nicely. It is even displayed differently in Emacs, which makes it easier to see.

Ideally there would also be a tprintf function (for text-printf) or something similar, so if I ever have the need and the time, I'll try submitting it as a merge request, now that I better understand the philosophy behind Format.
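For what it's worth, one possible shape for such a tprintf is sketched below. This is only an assumption about what the function could look like, not an existing Format function: it builds the string with Printf.ksprintf and hands the result to pp_print_text. The demo function and its message are made up for illustration.

```ocaml
(* Hypothetical tprintf: format the arguments into a string first, then
   print that string with every space treated as a breakable space. *)
let tprintf ppf fmt = Printf.ksprintf (Format.pp_print_text ppf) fmt

(* Illustrative use: print a message into a buffer at a given margin. *)
let demo ~margin =
  let buf = Buffer.create 64 in
  let ppf = Format.formatter_of_buffer buf in
  Format.pp_set_margin ppf margin;
  Format.pp_open_box ppf 0;
  tprintf ppf "found %d warnings while compiling %s" 3 "main.ml";
  Format.pp_close_box ppf ();
  Format.pp_print_flush ppf ();
  Buffer.contents buf

let () = print_endline (demo ~margin:20)
```

One caveat of this design: because the whole result goes through pp_print_text, spaces inside %s arguments become breakable too, and Format-specific directives such as %a or "@[" are not available in the format string.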
