Mantis Bug Tracker

View Issue Details Jump to Notes ] Issue History ] Print ]
IDProjectCategoryView StatusDate SubmittedLast Update
0006732OCamlstandard librarypublic2014-12-22 22:572017-03-15 14:16
Reporterdoligez 
Assigned To 
PrioritylowSeverityminorReproducibilityalways
StatusacknowledgedResolutionopen 
PlatformOSOS Version
Product Version4.02.1 
Target VersionFixed in Version 
Summary0006732: Buffer.add_substitute is not surjective and its documentation is incomplete.
DescriptionThe documentation of Buffer.add_substitute says:
   An escaped [$] character is a [$] that immediately follows a backslash
   character; it then stands for a plain [$].
This is not true if the [$] follows an even number of backslashes.

It also says:
   Raise [Not_found] if the closing character of a parenthesized variable
   cannot be found.
But [Not_found] is also be raised if there is an unescaped [$] at the end of the string.

The documentation also fails to state that, when [Not_found] is raised, some characters are still added to the buffer. It also omits the fact that the adding stops silently when it sees an unescaped [$] followed by anything else than alphanum, '_', '{' or '('.

Finally, the only way to get a single backslash followed by a dollar is by playing with variables, there is no way to get it from a constant string because the quoting conventions are incomplete: backslashes should also be escaped by backslashes.

Some of these problems should be fixed, and the others documented.
Steps To Reproduce        OCaml version 4.02.1

# let b = Buffer.create 0;;
val b : Buffer.t = <abstr>
# Buffer.add_substitute b (fun _ -> "") "\n1. \\$";;
- : unit = ()
# Buffer.add_substitute b (fun _ -> "") "\n2. \\\\$";;
Exception: Not_found.
# Buffer.add_substitute b (fun _ -> "") "\n3. \\\\$ ";;
- : unit = ()
# Buffer.add_substitute b (fun _ -> "") "\n3. \\\\$!";;
- : unit = ()
# Buffer.add_substitute b (fun _ -> "") "\n4. \\\\\\$";;
- : unit = ()
# Buffer.add_substitute b (fun _ -> "") "\n5. $}";;
- : unit = ()
# print_endline (Buffer.contents b);;

1. $
2. \\
3. \\
3. \\
4. \\$
5.
- : unit = ()
#
TagsNo tags attached.
Attached Files

- Relationships

-  Notes
(0012960)
frisch (developer)
2014-12-23 12:13

My hope was we could instead get rid of this ad hoc function, which tastes very non-stdlib-y and accounts for more than half of buffer.ml lines. But to my great surprise people actually seem to use Buffer.add_substitute (e..g in Cmdliner or in Bisect) :-( Btw, Cmdliner's use is rather amusing, with a double call to expand $(i,$(mname)). If we are to fix the function, shouldn't we treat this directly? And why should this be tied to Buffer? People are likely to create a fresh buffer just for calling add_substitute and immediately extract its content.
(0012961)
dbuenzli (reporter)
2014-12-23 13:49

I was indeed particulary lazy on that one. OTOH I'm glad that this code is not in the already too long Cmdliner module...
(0012976)
doligez (administrator)
2014-12-23 18:00

My secret hope was also to get rid of it, but...

AFAICT, it's in Buffer because it's going to use a buffer internally, so it must depend on Buffer and cannot be in String.

@dbuenzli, how bad do you think it would it be if we changed the quoting conventions to something sane but not backward-compatible? I'm thinking of getting rid of the broken backslash-escape thing and just replace "$$" with "$", so there would be no special character besides '$'.
(0012979)
dbuenzli (reporter)
2014-12-23 18:17

Cmdliner wouldn't be affected I think. The deranged use I make of this feature is to use variables of the form $(i,message) and $(b,message) to mark text as being italicized or bold. This comes from the fact that variables can be anything if they are enclosed by parens; from the docs:

an arbitrary sequence of characters enclosed by a pair of matching parentheses or curly brackets.

Another twist is that if the above message strings may contain variables themselves, which are then resolved a second time. Having $ in this is allowed per the above def. So if you make sure you are not touching the definition of variables I don't think Cmdliner would be affected (and would welcome the saner quoting conventions).
(0013043)
doligez (administrator)
2015-01-09 18:22

I'm also thinking of getting rid of the exception case altogether: if $ is not followed by a variable or another $, just call the substitution function with an empty string, and let it choose whether to treat it as an error or not.

That would make the code simpler and more powerful.
(0013061)
dbuenzli (reporter)
2015-01-10 13:45

I don't see any problem with that in my code.
(0017673)
dbuenzli (reporter)
2017-03-15 14:16

Note that `Cmdliner` no longer uses `Buffer.add_substitute` at all.

- Issue History
Date Modified Username Field Change
2014-12-22 22:57 doligez New Issue
2014-12-23 12:13 frisch Note Added: 0012960
2014-12-23 13:49 dbuenzli Note Added: 0012961
2014-12-23 18:00 doligez Note Added: 0012976
2014-12-23 18:17 dbuenzli Note Added: 0012979
2015-01-09 18:22 doligez Note Added: 0013043
2015-01-09 18:22 doligez Target Version => 4.02.2+dev / +rc1
2015-01-09 18:22 doligez Status new => acknowledged
2015-01-10 13:45 dbuenzli Note Added: 0013061
2015-03-18 17:54 frisch Target Version 4.02.2+dev / +rc1 => 4.02.3+dev
2015-07-10 17:41 doligez Target Version 4.02.3+dev => 4.03.0+dev / +beta1
2016-04-18 14:59 doligez Target Version 4.03.0+dev / +beta1 => 4.03.1+dev
2017-02-16 14:01 doligez Target Version 4.03.1+dev => undecided
2017-02-23 16:43 doligez Category OCaml standard library => standard library
2017-03-15 13:59 doligez Target Version undecided =>
2017-03-15 14:16 dbuenzli Note Added: 0017673


Copyright © 2000 - 2011 MantisBT Group
Powered by Mantis Bugtracker