Mantis Bug Tracker

View Issue Details Jump to Notes ] Issue History ] Print ]
IDProjectCategoryView StatusDate SubmittedLast Update
0005564OCaml-for Camlp4 use 03:292016-12-07 18:27
Assigned Todim 
PlatformOSOS Version
Product Version 
Target VersionlaterFixed in Version 
Summary0005564: Unclear code in Camlp4/Struct/Lexer.mll
DescriptionIn camlp4/Camlp4/Struct/Lexer.mll,
line 458 ("succ n") reads only one character
from the stream instead of recursing.

450 let lexing_store s buff max =
451 let rec self n s =
452 if n >= max then n
453 else
454 match Stream.peek s with
455 | Some x ->
456 Stream.junk s;
457 buff.[n] <- x;
458 succ n
459 | _ -> n
460 in
461 self 0 s

Steps To ReproduceMaybe process a large source file and
check for performance problems.
Additional InformationProposed changes:
< let rec self n s =
> let rec self n =
< succ n
> self (succ n)
< self 0 s
> self 0
Attached Files

- Relationships

-  Notes
dim (developer)
2012-03-28 07:59

Actually we need to read only one token at a time for the toplevel since lexing a token may call Toploop.read_interactive_input. But maybe it is safe to read multiple tokens at at time only for files. I will look closer at this.

BTW, does the proposed change fix your performance issues ?
Drakken (reporter)
2012-03-28 08:18

This issue doesn't really affect me.
It just seems like a very obfuscated
way to read a single character.
gasche (developer)
2012-03-28 09:32

Drakken, I appreciate that people participate on the bug tracker, and I do think that there is something slightly odd with this code (the best way to know would probably to ask the original author) that *might* be the sign of something amiss, or would at least merit a cleanup or an explanatory comment.

However, this is really not a bug.

When reporting, please select the appropriate severity: "minor/major/block" is reserved to issues that are acutally bugs. "Feature" is for feature requests (that should be used with care) and "text", "tweak" and "trivial" are more appropriate for the rest ("text" for documentation issues, and tweak/trivial for the rest).
Drakken (reporter)
2012-03-28 10:07

Okay, I'll use "tweak" if I ever see something similar. Maybe it would help to have a category called something like "factoring" or "code quality". I knew the code was working, but it _looks_ like a mistake where the dev intended to read more chars. I would say the degree of obfuscation is significant, and "tweak" and "trivial" don't really suggest that.
Here's the code to read one character:

  let lexing_store s buff max =
    assert (max > 0);
    match Stream.peek s with
    | Some x ->
        Stream.junk s;
        buff.[0] <- x;
    | None -> 0
gasche (developer)
2012-03-28 11:05

I'm not convinced the "code quality" reports on the bug tracker are such a good idea. As I said, OCaml developpers tend to be extremely busy (nobody works full-time on OCaml development), and want to concentrate their development time on important matter (such as fixing real bugs and getting a release out).

Discussing small code readability issues on the bug tracker is simply too costly in the current model, because it will force those devs to be involved when they really should work on other things.

So I think that if you want to go on reading code and commenting on awkward places (which is, in itself, a good think), there should be another process that would not involve the bug tracker directly. For example, you could aggregate and publish your findings on a web page somewhere and contact a relevant developer¹ directly, so that they can be processed in batch modes and without sending tons of emails to most OCaml developpers.

¹: Who is "a relevant developer"? I'm not sure whether Jérémie, the current Camlp4 maintainer, is interested in such issues, but I suspect he might be. Otherwise you could contact me directly (gabriel dot scherer at gmail), but this would be rather low priority.

Besides, Camlp4 is a delicate and complex piece of software. Changes in this area have a tendency to break existing user extensions, and that is really not a good thing. If you want to suggest changes, you should test them extremely thoroughly (at least by going through Camlp4 bootstrap process; warning, this is not a small task). And ultimately most proposed changes would probably fall under the pragmatic rule of "if it ain't broke, don't fix it" -- though explanatory comments are not subject to this rule as they don't change the semantics at all.

Finally, if you are doing this with the laudable goal to familiarize yourself with the code, I must say there are other things that would be more immediately useful and on which you could help. Some of the bugs marked "acknowledged", "confirmed" or "resolved > suspended" could benefit from a patch proposal -- but some areas are more sensible than other. See for example [^]

On a strictly personal note, I believe that there are things to improve with Ocamlbuild that could benefit from external contributions. See the list of ocamlbuild bugs, most things could benefit from a proposed patch (including, I believe, those assigned to xclerc): [^]

If you're interested in Camlp4 specifically, I'm not overly confident in code changes, but there are certainly things to improve on the documentation side -- on the wiki: [^]
doligez (administrator)
2012-09-21 14:04

I have one remark to add to Gabriel's: if you want a change like that accepted, you have to demonstrate its utility: take your "Maybe process a large source file and check for performance problems.", remove the Maybe, and give us some performance numbers.
Otherwise, it's definitely "if it ain't broke, don't fix it", or at best "let's simplify the code and just read one character".
dim (developer)
2016-12-07 18:27

Camlp4 is now a separate project. Please re-open a ticket on github [1] if you are still interested in seeing the original issue fixed.

  [1] [^]

- Issue History
Date Modified Username Field Change
2012-03-28 03:29 Drakken New Issue
2012-03-28 07:59 dim Note Added: 0007206
2012-03-28 08:00 dim Assigned To => dim
2012-03-28 08:00 dim Status new => assigned
2012-03-28 08:18 Drakken Note Added: 0007207
2012-03-28 09:22 gasche Severity minor => tweak
2012-03-28 09:22 gasche Summary Lexer reads only one character at a time? => Unclear code in Camlp4/Struct/Lexer.mll
2012-03-28 09:32 gasche Note Added: 0007208
2012-03-28 10:07 Drakken Note Added: 0007209
2012-03-28 11:05 gasche Note Added: 0007210
2012-09-06 16:43 doligez Target Version => 4.00.1+dev
2012-09-21 14:04 doligez Note Added: 0008137
2012-09-21 14:04 doligez Priority normal => low
2012-09-21 14:04 doligez Target Version 4.00.1+dev => 4.01.0+dev
2013-06-17 16:03 gasche Resolution open => suspended
2013-06-17 16:03 gasche Target Version 4.01.0+dev => later
2013-12-05 15:40 doligez Tag Attached: patch
2016-02-08 12:49 doligez Category Camlp4 => for Camlp4 use [^]
2016-02-08 12:50 doligez Category for Camlp4 use [^] => for Camlp4 use [^]
2016-12-07 18:27 dim Note Added: 0016771
2016-12-07 18:27 dim Status assigned => closed
2017-02-23 16:43 doligez Category for Camlp4 use [^] => -for Camlp4 use [^]

Copyright © 2000 - 2011 MantisBT Group
Powered by Mantis Bugtracker