Mantis Bug Tracker

ID: 0007799
Project: OCaml
Category: standard library
View Status: public
Date Submitted: 2018-05-31 17:00
Last Update: 2018-06-07 09:29
Reporter: gmelquiond
Assigned To: nojebar
Priority: normal
Severity: major
Reproducibility: always
Status: resolved
Resolution: fixed
Platform:
OS:
OS Version:
Product Version: 4.06.1
Target Version:
Fixed in Version:
Summary: 0007799: format_from_string and string_of_format
Description: I naively expect the functions format_from_string and string_of_format to be inverses of each other. But they are not, which leads to some obscure bugs when trying to use them. As far as I can tell, this is because format_from_string escapes double quotes but not backslashes before parsing the string. This is reproducible with all the versions of OCaml I have tested, from 4.02.3 to 4.07-beta.
Steps To Reproduce:
let test s = ignore (Scanf.format_from_string (string_of_format s) s);;
test "%s/%a";; (* OK *)
test "\\ ";; (* Exception: Scanf.Scan_failure "illegal escape character ' '". *)
test "\\x";; (* Exception: Scanf.Scan_failure "illegal escape character '\"'". *)
test "\\x25s";; (* Exception: Scanf.Scan_failure "bad input: format type mismatch between \"%s\" and \"\\\\x25s\"". *)
test "\\\"%s";; (* Exception: Scanf.Scan_failure "bad input: format type mismatch between \"\\\\\" and \"\\\\\\\"%s\"". *)
test "\\";; (* Exception: Scanf.Scan_failure "scanning of a String failed: premature end of file occurred before end of token". *)

- Notes
(0019148)
gasche (developer)
2018-05-31 17:31

Indeed, it appears that the Scanf.format_from_string implementation, which has its own implementation of string escaping, is buggy. The following would work correctly:

  let format_from_string s fmt =
    Scanf.sscanf_format (Printf.sprintf "%S" s) fmt (fun s -> s)

I can fix this in the stdlib, but you can also use the code above as a workaround for yourself.
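
For illustration, here is a minimal sketch of how the workaround above could be checked against the inputs from the steps to reproduce. The `roundtrip` helper is a hypothetical name introduced here, not part of the standard library; the check relies on string_of_format returning the text the format was built from.

```ocaml
(* Workaround from the note above: %S renders s as a fully escaped
   OCaml string literal, which sscanf_format then parses back. *)
let format_from_string s fmt =
  Scanf.sscanf_format (Printf.sprintf "%S" s) fmt (fun f -> f)

(* Hypothetical helper: send a format through its string form and back. *)
let roundtrip fmt = format_from_string (string_of_format fmt) fmt

let () =
  (* Inputs taken from the steps to reproduce. *)
  assert (string_of_format (roundtrip "%s/%a") = "%s/%a");
  assert (string_of_format (roundtrip "\\x25s") = "\\x25s");
  assert (string_of_format (roundtrip "\\\"%s") = "\\\"%s");
  assert (string_of_format (roundtrip "\\") = "\\")
```

Because %S escapes backslashes as well as double quotes, none of these inputs should raise Scan_failure with this definition.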
(0019149)
gasche (developer)
2018-05-31 17:38

Note: since the Big Format Change of 4.02, there is also a function CamlinternalFormat.format_of_string_format, with the same interface as Scanf.format_from_string, which I believe behaves correctly -- its implementation is more direct than Scanf's. CamlinternalFormat functions are internal and not meant for external usage, but this is also a workaround.

I'm curious, what is your use-case for using these functions?
(0019150)
gmelquiond (reporter)
2018-05-31 18:00

Sometimes, operator '^^' is not sufficient to manipulate format strings, so you have to go back to plain strings to benefit from a lot more functions. In fact, unless I missed it, there is not even a function that takes a nonliteral string free of '%' and turns it into a plain format string (except for format_of_string, obviously).

Anyway, thanks for the suggestion of using Scanf.sscanf_format. It is much nicer than the one I had come up with for Why3, which was to escape all the backslashes before feeding it to format_from_string.
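
The backslash-escaping workaround mentioned above might look roughly like the following. This is only a sketch of the idea, not the actual Why3 code; `escape_backslashes` and `format_from_string_compat` are hypothetical names.

```ocaml
(* Double every backslash so that an escaper that only handles double
   quotes still produces a valid OCaml string literal. *)
let escape_backslashes s =
  let b = Buffer.create (String.length s) in
  String.iter
    (fun c ->
      if c = '\\' then Buffer.add_char b '\\';
      Buffer.add_char b c)
    s;
  Buffer.contents b

(* Hypothetical wrapper, only meaningful on the affected versions
   (4.02.3 to 4.07-beta according to the report). *)
let format_from_string_compat s fmt =
  Scanf.format_from_string (escape_backslashes s) fmt

let () =
  assert (escape_backslashes "\\x25s" = "\\\\x25s");
  assert (escape_backslashes "%s/%a" = "%s/%a")
```

On a standard library that already escapes backslashes itself, this wrapper would presumably double them in the resulting format, which is one reason the Scanf.sscanf_format approach is preferable.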
(0019163)
nojebar (developer)
2018-06-06 14:12

Just as a follow-up, Scanf.format_from_string actually uses CamlinternalFormat.format_of_string_format. The problem is the implementation of escaping in Scanf.string_to_String, which only escapes double quotes.
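
To make the effect of quote-only escaping concrete, here is a small sketch. `quote_only` is a hypothetical reconstruction of that behaviour, not a copy of the stdlib source, and `quote_full` and `unquote` are helper names introduced for the comparison. It shows why the input `\x25s` from the report ends up being compared as `%s`.

```ocaml
(* Hypothetical reconstruction: wrap s in double quotes, escaping only '"'. *)
let quote_only s =
  let b = Buffer.create (String.length s + 2) in
  Buffer.add_char b '"';
  String.iter
    (fun c ->
      if c = '"' then Buffer.add_char b '\\';
      Buffer.add_char b c)
    s;
  Buffer.add_char b '"';
  Buffer.contents b

(* Full escaping, as produced by the %S conversion of Printf. *)
let quote_full s = Printf.sprintf "%S" s

(* Read one OCaml string literal back, decoding its escapes. *)
let unquote lit = Scanf.sscanf lit "%S" (fun s -> s)

let () =
  let s = "\\x25s" in
  (* The backslash survives unescaped, so \x25 is decoded as '%'. *)
  assert (unquote (quote_only s) = "%s");
  (* With full escaping the original string comes back unchanged. *)
  assert (unquote (quote_full s) = s)
```

This matches the "format type mismatch between \"%s\" and \"\\\\x25s\"" failure in the steps to reproduce.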

I submitted a PR with a variant of the fix suggested by Gabriel:

  https://github.com/ocaml/ocaml/pull/1820
(0019164)
nojebar (developer)
2018-06-07 09:29

PR merged

- Issue History
Date Modified     Username    Field                Change
2018-05-31 17:00  gmelquiond  New Issue
2018-05-31 17:31  gasche      Note Added: 0019148
2018-05-31 17:38  gasche      Note Added: 0019149
2018-05-31 18:00  gmelquiond  Note Added: 0019150
2018-06-06 14:12  nojebar     Note Added: 0019163
2018-06-06 14:12  nojebar     Assigned To          => nojebar
2018-06-06 14:12  nojebar     Status               new => assigned
2018-06-07 09:29  nojebar     Note Added: 0019164
2018-06-07 09:29  nojebar     Status               assigned => resolved
2018-06-07 09:29  nojebar     Resolution           open => fixed

