Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

attachment of ocamldoc comments in the source tree is often wrong #7197

Closed
vicuna opened this issue Mar 27, 2016 · 18 comments
Closed

attachment of ocamldoc comments in the source tree is often wrong #7197

vicuna opened this issue Mar 27, 2016 · 18 comments
Assignees
Labels
Milestone

Comments

@vicuna
Copy link

vicuna commented Mar 27, 2016

Original bug ID: 7197
Reporter: craff
Assigned to: @lpw25
Status: resolved (set by @lpw25 on 2016-09-26T15:34:29Z)
Resolution: fixed
Priority: normal
Severity: minor
Platform: linux
OS: debian
OS Version: stable
Version: 4.02.3
Target version: 4.03.1+dev
Category: ~DO NOT USE (was: OCaml general)
Monitored by: @hcarty

Bug description

  • duplication when there are no newlines:

    in set.mli
    type t[@@ocaml.doc " The type of the set elements. "]
    val compare : t -> t -> int[@@ocaml.doc " The type of the set elements."]...

  • duplication in .ml even with newlines;

    in list.ml
    let fast_sort = stable_sort
    [@@@ocaml.text " sorting + removing duplicates "]
    let sort_uniq cmp l =
    ... a lot of code ...
    let len = length l in if len < 2 then l else sort len l[@@ocaml.text
    " sorting + removing duplicates "]

Steps to reproduce

ocamlc -c -dsource file.ml

Additional information

This part of ocaml code seems no worth it. Why not use [@@@ and [@@ only
and provide a script for the transition, while considering (** as
deprecated.

@vicuna
Copy link
Author

vicuna commented Mar 27, 2016

Comment author: @gasche

We have fixed issues that look similar in the development branch, so I suspect that these problems may be fixed in the 4.03 beta release (maybe you could check?).

I'm not sure how to reproduce the issue you point out or even whether your examples are actually reproduction cases. On my 4.02.3 system, running "ocamlc -c -dsource foo" on source files populated as you describe (that is, using @ocaml.doc or @ocaml.text) seems to show the same file when pretty-printed. Did you mean to use (**-style comments in your examples?

@vicuna
Copy link
Author

vicuna commented Mar 27, 2016

Comment author: craff

On more case: ";;" prevent attachment of comments:

example from lexing.mli:


val dummy_pos : position;;
(** A value of type [position], guaranteed to be different from any
valid position.
*)

(** {6 Lexer buffers} *)

result in:

val dummy_pos : position
[@@@ocaml.text " {6 Lexer buffers} "]

The comment is not attached as it should

@vicuna
Copy link
Author

vicuna commented Mar 27, 2016

Comment author: @gasche

Note that you can enable warning 50 (Unexpected documentation comment) to warn on ill-positioned comments.

I don't have input source to reproduce the first two reports. This last report is not a bug; since 4.02.2, documentation comments are parsed by the compiler itself, which uses stricter parsing rule than ocamldoc. The rules have been discussed in the Pull Request implementing the change,
#149
and in particular it was mentioned that ";;" prevents attachment of documentation comments at the end of the following message:
#149 (comment)

That said, it is problematic that there is no maintained documentation of these parsing rules besides this PR discussion. This should really be part of the manual, with a precise specification.

@vicuna
Copy link
Author

vicuna commented Mar 27, 2016

Comment author: craff

The three reported cases above also appends in 4.03.0+dev16-2016-02-29
I imagine that this impact merlin also ...

Personnaly, I think all this is ugly ... (** ... *) should be parsed
and not go through lexical analysis ... The current way, I imagine that is
will never work perfectly ?

@vicuna
Copy link
Author

vicuna commented Mar 27, 2016

Comment author: craff

This version: 4.04.0+dev0-2016-02-18
is also affected

@vicuna
Copy link
Author

vicuna commented Mar 27, 2016

Comment author: craff

4.03.0+beta1 too

@vicuna
Copy link
Author

vicuna commented Mar 27, 2016

Comment author: craff

Then, these are bugs in the stdlibs, I ran

ocamlc -w +50 on the .mli of the stdlibs ... This gives a lot of warnings...
I imagine merlin would benefit from fixing those.

And still, the behavior of ocaml on list.ml (with duplicated ocamldoc comments and no warning), is a bug.

@vicuna
Copy link
Author

vicuna commented Mar 27, 2016

Comment author: craff

I read the patch comment you cited (#149)

The following:
"Comments that are attached to more than one item are considered ambiguous
and are ignored"
is not respected. Currently, in this case the comment is attached to both.

(this appends on pervasives.mli for LOC and alike variables for instance)

attaching to first item seems more compatible to existing .ml/.mli files

@vicuna
Copy link
Author

vicuna commented Mar 27, 2016

Comment author: @lpw25

The "sorting + removing duplicates" one looks like a bug, I'll have a look at it this week. The others are behaving as expected (although, as Gabriel says the behaviour is not yet documented, now that merlin is using these attributes we should fix this -- I'll also try to look at that this week as well).

The main issue is that warning 50 was not turned on properly for the stdlib. I had thought that I turned it on and fixed things, but that work seems to have gotten lost somewhere.

Personnaly, I think all this is ugly ... (** ... *) should be parsed
and not go through lexical analysis ... The current way, I imagine that is
will never work perfectly ?

The issues described are unrelated to the mechanism used to collect the comments: they were deliberate choices to remove some of the inconsistency and ambiguity of the existing ocamldoc rules.

Using the parser to handle comments is not an option -- they rely on lexical features such as new lines and blank lines. If you bring blank lines and new lines into the token stream given to the parser you dramatically increase the complexity of the parser. It also means that incorrectly placed comments must be treated as a hard error rather than a warning. In addition it makes it very difficult to ensure that the parsing of OCaml code is not being affected by the placement of documentation comments -- a highly undesirable possibility. All in all leaving all comment handling to the lexer is much simpler -- quite similar in complexity to how locations are handled.

@vicuna
Copy link
Author

vicuna commented Mar 27, 2016

Comment author: craff

Argh, you are right ... only decap, and other scanner less technology can handle that this way ... It is so nice to put a "A - B" to mean to ignore space between A and B ... and parse newlines manually.

Thanks for the work.

To reach and check OCaml compatibility with decap, I would have two requests:

  • an up to date description of the intended behavior somewhere
  • an option to deactivate attachment of comments to the parse tree .

@vicuna
Copy link
Author

vicuna commented Mar 27, 2016

Comment author: @lpw25

The "sorting + removing duplicates" is addressed by:

#526

I'll leave this issue open until I've turned warning 50 on for the stdlib and updated the documentation.

an option to deactivate attachment of comments to the parse tree

That would be easy to add, but it's not clear to me why you would want to. Could you expand on why you would want this?

@vicuna
Copy link
Author

vicuna commented Mar 27, 2016

Comment author: craff

I want this, for people like me who write an alternative parser for ocaml (menhir and camlp4 might also be concerned). If we want to achieve identical parsing, up to ocamldoc comments, then it is easier to remove those comments before comparing the parse tree.

I agree, this is not a big issue ...

@vicuna
Copy link
Author

vicuna commented Mar 28, 2016

Comment author: @lpw25

I think for that for such a specific use case it would probably be better if you handled it yourself since the command-line is already pretty cluttered with rarely used arguments.

I can see two possible solutions:

  • If you run OCaml's parser yourself then you can set Lexer.handle_docstrings (added by PR#7075: fix repeated docstrings #348) to false to turn off the special handling of docstrings.

  • You could ignore "ocaml.doc" attributes when doing the AST comparison. (In my experience you often have to do this for locations anyway).

@vicuna
Copy link
Author

vicuna commented Mar 28, 2016

Comment author: craff

Yes, but I compare ast using -dparsetree where location are easy to ignore by
a regexp ... It is not so easy to find a regexpr for ocamldoc comments ...

@vicuna
Copy link
Author

vicuna commented Mar 28, 2016

Comment author: @lpw25

#528 and #529 fix a lot of the incorrectly attached comments in the standard library. The first removes all the unnecessary ;;s. The second turns on warning 50.

You might be interested in a little tool I wrote for comparing the output of two ocaml compilers on opam packages:

https://github.com/lpw25/compiler_eq

It has options to ignore all attributes and ignore all locations.

It basically just makes two opam switches for the compilers, installs packages on request, gets the ASTs out of the .cmt files of the packages and then compares them.

You might need to update the untypeast.ml file in it but it's just copied from the compiler's source.

@vicuna
Copy link
Author

vicuna commented Apr 14, 2016

Comment author: @damiendoligez

Is there anything more to do for this PR?

@vicuna
Copy link
Author

vicuna commented Apr 14, 2016

Comment author: @lpw25

I still need to update the documentation. I'll have a look at it tomorrow, or Monday at the latest.

@vicuna
Copy link
Author

vicuna commented Apr 15, 2016

Comment author: @lpw25

Pull request to update the manual:

#547

@vicuna vicuna closed this as completed Sep 26, 2016
@vicuna vicuna added this to the 4.03.1 milestone Mar 14, 2019
@vicuna vicuna added the bug label Mar 20, 2019
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

No branches or pull requests

2 participants