| View Issue Details |
| ID | Project | Category | View Status | Date Submitted | Last Update | |||
| 0004562 | OCaml | OCaml general | public | 2008-06-05 12:16 | 2009-04-29 20:44 | |||
| Reporter | premchai21 | |||||||
| Assigned To | weis | |||||||
| Priority | normal | Severity | minor | Reproducibility | always | |||
| Status | closed | Resolution | fixed | |||||
| Platform | OS | OS Version | ||||||
| Product Version | 3.10.2 | |||||||
| Target Version | | Fixed in Version | 3.11.0+beta |
| Summary | 0004562: scanf produces wrong %n output value after integer conversion | |||||||
| Description | Comments added for clarity. |

    $ ocaml
            Objective Caml version 3.10.2

    # let g s = Scanf.sscanf s "%d%n" (fun i n -> (i, n));;
    val g : string -> int * int = <fun>
    # g "99";;
    - : int * int = (99, 2)
    (* Correct. *)
    # g "99 syntaxes all in a row";;
    - : int * int = (99, 3)
    (* Wrong. *)
    # g "-20 degrees Celsius";;
    - : int * int = (-20, 4)
    (* Also wrong. *)
    # for i = 32 to 126 do
        if ((i < 48) || (i >= 58)) && (i != 95) then
          let (i, n) = g ("42" ^ (String.make 1 (char_of_int i))) in
          if n != 3 then Printf.printf "Hmm: %d\n%!" n
      done;;
    - : unit = ()
    (* Happens with all printable ASCII chars in [^0-9_]. *)
    #
| Additional Information | This is on Debian unstable AMD64, version 3.10.2-3 of the "ocaml" package. A cursory glance at stdlib/scanf.ml makes me think that not only is it peeking a char and then erroneously counting that as part of the character count, but the Scanning stuff doesn't have a way to do otherwise, making this probably require a larger change to fix than I would have otherwise expected. :-( | |||||||
| Tags | No tags attached. | |||||||
| Attached Files | ||||||||
Notes |
|
|
(0004514) weis (developer) 2008-06-06 10:12 |
This is clearly a semantic issue, not a bug.
|
(0004515) weis (developer) 2008-06-06 11:41 |
I think you overlooked the definition of the %n conversion; the documentation for Scanf states it as:

    - [n]: returns the number of characters read so far.

If we accept this definition, %n is not supposed to give the number of characters of tokens, or even to be related to the length of tokens: it just returns the number of characters that have been ``read so far'' to return those tokens. Hence, there are no errors in the examples you gave: the numbers of characters ``read so far'' to return the tokens you asked for are precisely those reported by the call to scanf.

This behaviour is also briefly explained in a note in the documentation:

    Note: a scan may often require to examine one character in advance; when this ``lookahead'' character does not belong to the token read, it is stored back in the scanning buffer and becomes the next character read.

A seminal example of a scan that requires a lookahead character is the very useful %0c conversion, which means ``test the current input character without reading it''. To let you examine the NEXT character to be read, the %0c conversion must read this character and store it so that it becomes the next character read.

This behaviour is not at all uncommon: in fact, almost all conversions need such a lookahead (%s, %d, %f, and so on). This is clear when reading an integer from the string "0123abc": scanf must read the character 'a' before concluding that the number indeed ends at character '3' of the input. Hence, after reading 123, the %n conversion returns the exact count of characters read so far, which is 5.

    # Scanf.sscanf "0123abc" "%i%n" (fun n count_for_n -> n, count_for_n);;
    - : int * int = (123, 5)

Note also that reading a single character after the integer does not change the ``number of characters read so far'', since no further character needs to be read to find 'a':

    # Scanf.sscanf "0123abc" "%i%n%c%n" (fun n count_for_n c count_for_c -> n, count_for_n, c, count_for_c);;
    - : int * int * char * int = (123, 5, 'a', 5)
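
The two ways of counting under discussion can be illustrated with a toy scanner. This is only a sketch and not the code of stdlib/scanf.ml: the function `scan_int` and the names `logical` and `physical` are hypothetical, introduced here for illustration. It assumes the scanner examines exactly one extra character whenever input remains after the token.

```ocaml
(* Toy model, NOT the stdlib implementation. [scan_int] is a hypothetical
   helper that reads an optionally signed decimal integer from the start
   of [s] and returns (value, logical, physical), where
   - [logical] counts only the characters that belong to the token;
   - [physical] also counts the one lookahead character examined to
     decide that the token has ended (assumption of this sketch). *)
let scan_int (s : string) : int * int * int =
  let len = String.length s in
  let is_digit c = c >= '0' && c <= '9' in
  (* Consume digits starting at position [i], accumulating the value. *)
  let rec digits i acc =
    if i < len && is_digit s.[i]
    then digits (i + 1) (acc * 10 + (Char.code s.[i] - Char.code '0'))
    else (i, acc)
  in
  let start, sign =
    if len > 0 && s.[0] = '-' then (1, -1) else (0, 1) in
  let stop, value = digits start 0 in
  let logical = stop in
  (* A lookahead character exists only if input remains after the token. *)
  let physical = if stop < len then stop + 1 else stop in
  (sign * value, logical, physical)
```

Under these assumptions, `scan_int "0123abc"` yields a physical count of 5 and a logical count of 4, while `scan_int "99"` yields 2 for both because the input ends with the token.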
|
(0004516) premchai21 (reporter) 2008-06-06 12:04 |
I do not see that note paragraph anywhere in http://caml.inria.fr/pub/docs/manual-ocaml/libref/Scanf.html or in my local copy of the documentation. Where is that note located?

Every C scanf implementation that I have seen defines %n to mean "number of characters read so far" with the semantics of "number of characters consumed that were used to match tokens or other parts of the format string, not including any lookahead characters read from the input stream". In the absence of a formal definition, the Caml documentation can reasonably be interpreted this way as well.

It is also a much more common and useful case to require the number of characters matched without including any lookahead characters. Making the interpretation of lookahead (which is more an internal detail of the Scanf module) a necessary part of constructing the conversion strings and functions feels rather unclean.

Even the note paragraph that you quote doesn't seem to contradict that idea; it states that when a lookahead character is stored back into the scanning buffer, it becomes the next character read. This to me implies that it has been _unread_ and is therefore no longer considered read as regards the _logical_ state of the scanner, even if one more character had to be physically read from the input in order to produce this state.
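
A typical use of %n that relies on the "characters consumed" interpretation is splitting a string into a leading integer and the remaining text. The following is a minimal sketch; `split_leading_int` is a hypothetical helper name, and the result described assumes an OCaml version in which %n does not count the lookahead character (3.11 or later, per the resolution of this issue).

```ocaml
(* Hypothetical helper: parse a leading integer and return it together
   with the unparsed remainder of the string. This is only correct if %n
   reports the number of characters consumed by the token, without the
   lookahead character (assumed: OCaml 3.11 or later). *)
let split_leading_int (s : string) : int * string =
  Scanf.sscanf s "%d%n" (fun i n ->
    (i, String.sub s n (String.length s - n)))
```

With that semantics, `split_leading_int "-20 degrees Celsius"` returns the integer -20 and a remainder that still starts with the space. If the lookahead were counted, the first character after the number would be silently dropped from the remainder.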
|
(0004602) weis (developer) 2008-09-08 14:49 |
This is fixed in the current development version:

            Objective Caml version 3.11+dev15

    # let g s = Scanf.sscanf s "%d%n" (fun i n -> (i, n));;
    val g : string -> int * int = <fun>
    # g "99";;
    - : int * int = (99, 2)
    # g "99 syntaxes all in a row";;
    - : int * int = (99, 2)
    # g "-20 degrees Celsius";;
    - : int * int = (-20, 3)

So now the lookahead character is no longer counted as read, even though it really has been. I agree with you that this semantics is clearer and more sound.
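
The reporter's loop from the description can be turned into a small regression check for the fixed behaviour. This is a sketch, not part of the OCaml test suite: `check_no_lookahead_counted` is a hypothetical name, and the check is expected to pass only on versions that include the fix (3.11 or later).

```ocaml
(* Same helper as in the issue description. *)
let g s = Scanf.sscanf s "%d%n" (fun i n -> (i, n))

(* Hypothetical regression check: for every printable ASCII character
   outside [0-9_], scanning "42" followed by that character should yield
   the value 42 and a count of 2, i.e. the lookahead is not counted. *)
let check_no_lookahead_counted () : bool =
  let ok = ref true in
  for code = 32 to 126 do
    let is_digit = code >= 48 && code < 58 in
    if not is_digit && code <> 95 then begin
      let (i, n) = g ("42" ^ String.make 1 (Char.chr code)) in
      if i <> 42 || n <> 2 then ok := false
    end
  done;
  !ok
```

The underscore (code 95) is excluded for the same reason as in the original loop: the %d conversion accepts it inside a number.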
Issue History |
| Date Modified | Username | Field | Change |
| 2008-06-05 12:16 | premchai21 | New Issue | |
| 2008-06-06 10:12 | weis | Note Added: 0004514 | |
| 2008-06-06 10:12 | weis | Assigned To | => weis |
| 2008-06-06 10:12 | weis | Status | new => assigned |
| 2008-06-06 11:41 | weis | Note Added: 0004515 | |
| 2008-06-06 12:04 | premchai21 | Note Added: 0004516 | |
| 2008-09-08 14:49 | weis | Note Added: 0004602 | |
| 2008-09-08 14:50 | weis | Status | assigned => resolved |
| 2008-09-08 14:50 | weis | Resolution | open => fixed |
| 2009-04-29 20:44 | weis | Status | resolved => closed |
| 2009-04-29 20:44 | weis | Fixed in Version | => 3.11.0+beta |