Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Buffer.add_substitute is not surjective and its documentation is incomplete. #6732

Closed
vicuna opened this issue Dec 22, 2014 · 9 comments · Fixed by #12423
Closed

Buffer.add_substitute is not surjective and its documentation is incomplete. #6732

vicuna opened this issue Dec 22, 2014 · 9 comments · Fixed by #12423

Comments

@vicuna
Copy link

vicuna commented Dec 22, 2014

Original bug ID: 6732
Reporter: @damiendoligez
Status: acknowledged (set by @damiendoligez on 2015-01-09T17:22:48Z)
Resolution: open
Priority: low
Severity: minor
Version: 4.02.1
Category: standard library
Monitored by: @hcarty @dbuenzli

Bug description

The documentation of Buffer.add_substitute says:

An escaped [$] character is a [$] that immediately follows a backslash
character; it then stands for a plain [$].

This is not true if the $ follows an even number of backslashes.

It also says:

Raise [Not_found] if the closing character of a parenthesized variable
cannot be found.

But Not_found is also be raised if there is an unescaped $ at the end of the string.

The documentation also fails to state that, when Not_found is raised, some characters are still added to the buffer. It also omits the fact that the adding stops silently when it sees an unescaped $ followed by anything else than alphanum, _, { or (.

Finally, the only way to get a single backslash followed by a dollar is by playing with variables, there is no way to get it from a constant string because the quoting conventions are incomplete: backslashes should also be escaped by backslashes.

Some of these problems should be fixed, and the others documented.

Steps to reproduce

OCaml version 4.02.1

# let b = Buffer.create 0;;
val b : Buffer.t = <abstr>
# Buffer.add_substitute b (fun _ -> "") "\n1. \\$";;
- : unit = ()
# Buffer.add_substitute b (fun _ -> "") "\n2. \\\\$";;
Exception: Not_found.
# Buffer.add_substitute b (fun _ -> "") "\n3. \\\\$ ";;
- : unit = ()
# Buffer.add_substitute b (fun _ -> "") "\n3. \\\\$!";;
- : unit = ()
# Buffer.add_substitute b (fun _ -> "") "\n4. \\\\\\$";;
- : unit = ()
# Buffer.add_substitute b (fun _ -> "") "\n5. $}";;
- : unit = ()
# print_endline (Buffer.contents b);;
1. $
2. \\
3. \\
3. \\
4. \\$
5. 
- : unit = ()
#
@vicuna
Copy link
Author

vicuna commented Dec 23, 2014

Comment author: @alainfrisch

My hope was we could instead get rid of this ad hoc function, which tastes very non-stdlib-y and accounts for more than half of buffer.ml lines. But to my great surprise people actually seem to use Buffer.add_substitute (e..g in Cmdliner or in Bisect) :-( Btw, Cmdliner's use is rather amusing, with a double call to expand $(i,$(mname)). If we are to fix the function, shouldn't we treat this directly? And why should this be tied to Buffer? People are likely to create a fresh buffer just for calling add_substitute and immediately extract its content.

@vicuna
Copy link
Author

vicuna commented Dec 23, 2014

Comment author: @dbuenzli

I was indeed particulary lazy on that one. OTOH I'm glad that this code is not in the already too long Cmdliner module...

@vicuna
Copy link
Author

vicuna commented Dec 23, 2014

Comment author: @damiendoligez

My secret hope was also to get rid of it, but...

AFAICT, it's in Buffer because it's going to use a buffer internally, so it must depend on Buffer and cannot be in String.

@dbuenzli, how bad do you think it would it be if we changed the quoting conventions to something sane but not backward-compatible? I'm thinking of getting rid of the broken backslash-escape thing and just replace "$$" with "$", so there would be no special character besides '$'.

@vicuna
Copy link
Author

vicuna commented Dec 23, 2014

Comment author: @dbuenzli

Cmdliner wouldn't be affected I think. The deranged use I make of this feature is to use variables of the form $(i,message) and $(b,message) to mark text as being italicized or bold. This comes from the fact that variables can be anything if they are enclosed by parens; from the docs:

an arbitrary sequence of characters enclosed by a pair of matching parentheses or curly brackets.

Another twist is that if the above message strings may contain variables themselves, which are then resolved a second time. Having $ in this is allowed per the above def. So if you make sure you are not touching the definition of variables I don't think Cmdliner would be affected (and would welcome the saner quoting conventions).

@vicuna
Copy link
Author

vicuna commented Jan 9, 2015

Comment author: @damiendoligez

I'm also thinking of getting rid of the exception case altogether: if $ is not followed by a variable or another $, just call the substitution function with an empty string, and let it choose whether to treat it as an error or not.

That would make the code simpler and more powerful.

@vicuna
Copy link
Author

vicuna commented Jan 10, 2015

Comment author: @dbuenzli

I don't see any problem with that in my code.

@vicuna
Copy link
Author

vicuna commented Mar 15, 2017

Comment author: @dbuenzli

Note that Cmdliner no longer uses Buffer.add_substitute at all.

@github-actions
Copy link

This issue has been open one year with no activity. Consequently, it is being marked with the "stale" label. What this means is that the issue will be automatically closed in 30 days unless more comments are added or the "stale" label is removed. Comments that provide new information on the issue are especially welcome: is it still reproducible? did it appear in other contexts? how critical is it? etc.

@github-actions github-actions bot added the Stale label May 11, 2020
@xavierleroy
Copy link
Contributor

A quick grep through OPAM sources suggest that the following 392 packages are using Buffer.add_substitute. As much as I would like to get rid of this function, I'm afraid this is not going to happen.

abt alberto aliases archimedes argot assertions async_core aws aws-autoscaling aws-cloudformation aws-cloudtrail aws-ec2 aws-elasticloadbalancing aws-sdb aws-ssm aws-sts bamboo bap bap-abi bap-api bap-arm bap-beagle bap-bil bap-build bap-bundle bap-byteweight bap-byteweight-frontend bap-c bap-cache bap-callsites bap-constant-tracker bap-core-theory bap-cxxfilt bap-dead-code-elimination bap-demangle bap-disassemble bap-dump-symbols bap-dwarf bap-elementary bap-elf bap-frames bap-frontc bap-frontend bap-fsi-benchmark bap-future bap-ida bap-ida-plugin bap-knowledge bap-llvm bap-main bap-mc bap-microx bap-mips bap-objdump bap-optimization bap-phoenix bap-piqi bap-plugins bap-powerpc bap-primus bap-primus-dictionary bap-primus-lisp bap-primus-powerpc bap-primus-region bap-primus-support bap-primus-test bap-primus-x86 bap-print bap-raw bap-recipe bap-recipe-command bap-relocatable bap-report bap-run bap-saluki bap-ssa bap-std bap-strings bap-symbol-reader bap-taint bap-taint-propagator bap-term-mapper bap-trace bap-traces bap-trivial-condition-form bap-warn-unused bap-x86 bare batteries bear bench bes bisect bisect_ppx bisect_ppx-ocamlbuild bitcoin bitvec bitvec-binprot bitvec-order bitvec-sexp bolt bookaml boomerang bos bracetax bsbnative bson bt Camldiets camldm camlhighlight caravan ccss clangml coccinelle commonjs_of_ocaml comparelib core_kernel crdt-ml cryptohash cryptokit custom_printf d3 deriving-yojson dnscurve electrumAnalyzer elf2json enumerate erm_xml erm_xmpp estring expect extlib extlib-compat extunix facebook-sdk faillib forkwork frag frama-c frama-c-base fstreams geoip getopt glicko2 graphlib grib gsasl hamt hardcaml-reedsolomon hdf hdfs herelib http_router hydro i2c ibx imaplet-lwt indexmap inotify io iso-filesystem itv-tree js-build-tools js-lz4 json-pointer json-predicate json-rpc jsoo_router KaSim krb5 lablqt lacc lambdoc libra-tk libres3 libvhd libzipperposition lilis links links-postgresql links-sqlite3 litiom llvmgraph lmdb logtk lutils lwt-binio lwt_camlp4 lwt_log lwt_named_threads lwt-zmq lymp lz4 lzo macaque_lwt maildir maki mariadb mbr-format merge-queues merge-ropes mesh-display meta_conv mirage-block-ccm mirage-btrees mirage-entropy-xen mirage-http-unix mirage-http-xen mirage-seal mirage-tc mirror mld modelica_ml monadlib monads mongo mosquitto mparser mqtt msgpack namespaces netlink netml nlopt-ocaml oasis oasis2debian oasis-mirage objsize obrowser ocaml-base-compiler ocaml-data-notation ocamlify ocaml-indent ocamlmod ocamlnet ocaml-sat-solvers ocaml-secondary-compiler ocaml-src ocaml-systemd ocamltter ocaml-variants ocaml-xdg-basedir ocaml-zmq ocapic oclaunch ocp-build ocp-manager ocp-pp odate ogre ojquery ojwidgets olmi omd oolc opa-base opam-build-revdeps opamfind opam-query openflow optcomp orakuda orsetto osm_xml ospec otto ounit ounit2 ounit2-lwt ounit-lwt ox pa_bench pa_bin_prot packet pa_comprehension pa_do pa_fields_conv pa_monad_custom pa_ounit pa_ovisitor pa_qualified pareto pa_sexp_conv pa_solution pa_structural_sexp pa_test patoline pa_typerep_conv pa_variants_conv pa_where pcf-format pci pci-db pgsolver pipebang planck podge posix-clock posix-getopt posix-math posix-mqueue posix-semaphore posix-time ppx_curried_constr ppx_debugger ppx_dotbracket ppx_implicits ppx_measure ppx_pattern_guard prob-cache proj4 promela pxp qbf qfs quickcheck rashell raygun4ocaml rdr reason receive-mail regular release relit-reason riak riak-pb river rlp rtime rtop semver sequoia sessions should sill simple-bmc skkserv-lite slack-backup slap sociaml-facebook-api sociaml-oauth-client sociaml-tumblr-api sociaml-vcard socketcan sparrow spatial_index spotify-cli spotinstall spotlib_js spreadsheet sqlgg sqlite3EZ stdcompat stemming swdogen TCSLib tdk testsimple text text-tags textwrap tftp themoviedb tiny_json_conv topkg topkg-care topology tptp transmission-rpc type_conv typehashlib typerex-build ucorelib udunits uwt vector3 vhdlib webdav wiringpi wyrd xapi-idl xapi-rrdd xapi-rrd-transport xenbigarray xen-block-driver xenctrl xen-disk xe-unikernel-upload zbar zipperposition zipperposition-tools

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants