Format: invert breakable and non-breakable spaces #6988
Comments
Comment author: @gasche
Note that Daniel Bünzli contributed the function Format.pp_print_text : formatter -> string -> unit, available since 4.02, which interprets any space in the given string as a breakable space and any line break as a forced newline. This may be the function you want to use.
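A minimal sketch of what using pp_print_text could look like. The helper name wrap, the margin of 20 and the sample sentence are illustrative assumptions, not part of the original comment; the Format calls are the standard library's.

```ocaml
(* Sketch: wrap plain text with Format.pp_print_text.
   [wrap] is a hypothetical helper; the margin value is arbitrary. *)
let wrap ~margin text =
  let buf = Buffer.create 64 in
  let ppf = Format.formatter_of_buffer buf in
  Format.pp_set_margin ppf margin;
  (* Open a box so that the break hints have a well-defined scope. *)
  Format.pp_open_box ppf 0;
  (* Every ' ' becomes a break hint, every '\n' a forced newline. *)
  Format.pp_print_text ppf text;
  Format.pp_close_box ppf ();
  Format.pp_print_flush ppf ();
  Buffer.contents buf

let () =
  print_endline (wrap ~margin:20 "every space in this string is a break hint")
```

The source string stays readable and greppable, since it contains ordinary spaces only.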
Comment author: @lpw25
I suspect the reason is that Format was probably designed for formatting source code rather than human text.
Comment author: dhekir
Excellent, thank you for the explanation as well; I had never considered source code printing as a major motivator. May I suggest adding some remarks to https://ocaml.org/learn/tutorials/format.html and http://caml.inria.fr/pub/docs/manual-ocaml/libref/Format.html concerning those points, or mentioning pp_print_text more prominently? It might be useful for future newcomers. Something like: "if you wish to simply print text with automatic line breaking, consider using pp_print_text". As a minor aside, I do not see whether pp_print_text allows the introduction of non-breakable spaces as in my original suggestion (if so, maybe it could be mentioned in the function description), but it is indeed a very useful function!
Comment author: @gasche
If you are willing to write such a remark yourself, this can be done by editing the following pages: https://github.com/ocaml/ocaml.org/blob/master/site/learn/tutorials/format.md
pp_print_text does not have specific support for non-breakable spaces (but you can use one of the Unicode non-breakable spaces, such as U+00A0). I think simplicity is an important feature of the function, and I'm not satisfied with the idea of using "@ " as non-breakable-space syntax in some contexts; it would be very confusing.
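A small sketch of the U+00A0 approach, assuming UTF-8 output. The helper wrap_text, the margin of 12 and the sample sentence are made up for illustration. Note that Format measures strings in bytes, so the two-byte encoding of U+00A0 counts as two columns.

```ocaml
(* U+00A0 NO-BREAK SPACE, encoded in UTF-8 as the bytes C2 A0. *)
let nbsp = "\xC2\xA0"

(* Hypothetical helper: render [text] through pp_print_text at [margin]. *)
let wrap_text ~margin text =
  let buf = Buffer.create 64 in
  let ppf = Format.formatter_of_buffer buf in
  Format.pp_set_margin ppf margin;
  Format.pp_open_box ppf 0;
  Format.pp_print_text ppf text;
  Format.pp_close_box ppf ();
  Format.pp_print_flush ppf ();
  Buffer.contents buf

(* pp_print_text only treats ' ' and '\n' specially, so "figure" and "12"
   are emitted as one unbreakable piece of text. *)
let sample = "see figure" ^ nbsp ^ "12 for details"

let () = print_endline (wrap_text ~margin:12 sample)
```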
Comment author: dhekir
I hadn't realized that the entire website was tracked in git; I'll try to make some recommendations and submit a merge request. For non-breaking spaces, U+00A0 (Compose + Space on my Linux setup) indeed works nicely, and it is even displayed differently in Emacs, which makes it easier to spot. Ideally there would also be a tprintf function (for text-printf) or something similar, so if I ever have the need and the time, I'll try submitting it as a merge request, now that I better understand the philosophy behind Format.
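One possible shape for such a function, as a rough sketch only: tprintf does not exist in the standard library, and the definition below is an assumption built on Format.kasprintf (available from OCaml 4.03, so later than the 4.02.3 this report was filed against) and pp_print_text.

```ocaml
(* Hypothetical tprintf: format the arguments into a string first, then
   print that string with every space treated as a break hint. *)
let tprintf ppf fmt =
  Format.kasprintf (fun s -> Format.pp_print_text ppf s) fmt

(* Illustrative use with a narrow margin; the values are arbitrary. *)
let demo () =
  let buf = Buffer.create 64 in
  let ppf = Format.formatter_of_buffer buf in
  Format.pp_set_margin ppf 15;
  Format.pp_open_box ppf 0;
  tprintf ppf "%d items were found in %s" 3 "the list";
  Format.pp_close_box ppf ();
  Format.pp_print_flush ppf ();
  Buffer.contents buf

let () = print_endline (demo ())
```

One limitation of this design: spaces inside formatted arguments (such as "the list" above) also become break hints, and "@ " sequences in the format string keep their usual meaning in the intermediate formatting step.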
Original bug ID: 6988
Reporter: dhekir
Assigned to: @alainfrisch
Status: resolved (set by @alainfrisch on 2016-12-08T11:48:35Z)
Resolution: not a bug
Priority: normal
Severity: feature
Version: 4.02.3
Category: standard library
Monitored by: @hcarty
Bug description
In OCaml's Format module, if I understood it correctly, every space is by default non-breakable, and every breakable space needs to be written as "@ ".
This behavior is contrary to every other formatting system I know of, be it LaTeX, HTML (&nbsp;), or WYSIWYG text editors (e.g. in Word, non-breakable spaces must be inserted using Space plus other keys). The usual logic seems to be one of two options: either you don't break lines at all (as in C), so that grep works reliably and formatting is left to other tools, or you follow the standard typographical rule that every space is breakable unless indicated otherwise (i.e. the most common case prevails).
One practical consequence of the current behavior is that, whenever someone tries to write a formatting string that will break "naturally", it ends up being@ extremely@ hard@ to@ read@ and@ impossible@ to@ grep.
Therefore, people end up not using this fantastic formatting feature, using it incorrectly, or wrapping each string in a function that does the conversion manually.
I would propose adding a formatter flag that would invert the behavior (using "@ " for non-breakable spaces), and then sometime in the future invert the default behavior, deprecating the current one.
Is there a technical reason that prevents this from working? Or is there a design reason why breakable spaces should not be the default?
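The behavior described in this report can be reproduced with a short sketch. The helper render, the margin of 10 and the sample words are illustrative assumptions; the format strings use only the standard "@[", "@]" and "@ " directives.

```ocaml
(* Hypothetical helper: run a printer against a buffer-backed formatter. *)
let render ~margin print =
  let buf = Buffer.create 64 in
  let ppf = Format.formatter_of_buffer buf in
  Format.pp_set_margin ppf margin;
  print ppf;
  Format.pp_print_flush ppf ();
  Buffer.contents buf

(* Plain spaces are ordinary text: the line overflows the margin unbroken. *)
let plain =
  render ~margin:10 (fun ppf -> Format.fprintf ppf "@[one two three four@]")

(* "@ " is a break hint: the formatter may start a new line there. *)
let hinted =
  render ~margin:10 (fun ppf -> Format.fprintf ppf "@[one@ two@ three@ four@]")

let () =
  print_endline plain;
  print_endline "--";
  print_endline hinted
```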