Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

ocamldoc: lexing: empty token #6069

Closed
vicuna opened this issue Jul 9, 2013 · 7 comments
Closed

ocamldoc: lexing: empty token #6069

vicuna opened this issue Jul 9, 2013 · 7 comments

Comments

@vicuna
Copy link

vicuna commented Jul 9, 2013

Original bug ID: 6069
Reporter: @ygrek
Assigned to: @zoggy
Status: closed (set by @zoggy on 2013-08-05T07:46:24Z)
Resolution: fixed
Priority: normal
Severity: minor
Version: 4.00.1
Target version: 4.01.0+dev
Category: ocamldoc
Related to: #4518
Monitored by: @gasche @hcarty

Bug description

Just stumbled upon the case when ocamldoc aborts with unhelpful error message. Consider :

$ cat q.ml
(** test )
(
* ok @@ )
(
* @deprecated @@ )
(
* ok [@@] *)
$ ocamldoc q.ml
/home/ygrek/work/arena/test/q.ml : lexing: empty token

1 error(s) encountered

In the real code it is hard to figure out what line causes an error. It appears all the code to track line numbers is in place, but not used when reporting errors. Simple patch below.

Additional information

diff --git a/ocamldoc/odoc_comments.ml b/ocamldoc/odoc_comments.ml
index ce96f2a..61b8f9a 100644
--- a/ocamldoc/odoc_comments.ml
+++ b/ocamldoc/odoc_comments.ml
@@ -90,7 +90,7 @@ module Info_retriever =
with
Failure s ->
incr Odoc_global.errors ;

  •               prerr_endline (file^" : "^s^"\n");
    
  •               Printf.eprintf "%s : line %d : %s\n%!" file (!Odoc_lexer.line_number + 1) s;
                  (0, None)
              | Odoc_text.Text_syntax (l, c, s) ->
                  incr Odoc_global.errors ;
    

File attachments

@vicuna
Copy link
Author

vicuna commented Jul 12, 2013

Comment author: @damiendoligez

It's not only the error message, but also this "empty token" situation that needs to be fixed.

@vicuna
Copy link
Author

vicuna commented Jul 23, 2013

Comment author: hnrgrgr

The attached patch introduces an explicit error message, saying: in arguments @-tags, others @-sign should be escaped.

@vicuna
Copy link
Author

vicuna commented Jul 23, 2013

Comment author: @gasche

There are several different things in this patch:
(1) a way to report line errors for lexing errors in ocamldoc
(2) a new lexing rule to accept single "@" in inputs instead of having a lexer failure
(3) an error message for this particular error

I think (1) could be done better, but that would require more effort. The patch format is an improvement over the previous format, but it does not respect the usual OCaml convention (so no emacs location parsing for example), and in particular it does not report character positions. The nice way to fix that would be to call "Location.curr lexbuf" at error points inside the lexer (as done in the usual OCaml parser), or at least when catching the Failure exception in odoc_comments.ml (but then you need to change this code so that you know which "lexbuf" value provoked the error).

I'd say (2) can be easily improved: instead of turning @foo+ into @foo* and then raise an error on the empty foo* case, why not just handle @ alone specifically? That would be more readable. It seems to me that the longest-match precedence rule of lexers should make that work nicely.

I'm not sure I understand (3). Why does it say "in @-tags arguments"? I didn't know about the problem before reading your patch, but I'd say that there is an error for any isolated @ in the comment, not only after, say, a @raise keyword (which expects arguments). Besides, it would be helpful if the error message mentioned explicitly mentioned how to escape (rather than just the need for an escape). I would suggest something like "The character @ has a special meaning in ocamldoc comments, for commands such as @raise or @SInCE. If you want to write a single @, you must escape it as @.".

(Do we really need to fail when we read @ followed by something else than a lowercase ? I think it is prudent to do so, as it leaves the door open to extending the lexical structure of @-commands in the future.)

@vicuna
Copy link
Author

vicuna commented Jul 24, 2013

Comment author: hnrgrgr

Well, my intention was just to track down the error. Hence, fixing point (2) was my sole intention. To improve readability, I've updated this part of the patch as suggested.

For point (3), I tried to be coherent with the ocamldoc manual:

http://caml.inria.fr/pub/docs/manual-ocaml/manual029.html#s:ocamldoc-tags

As said in the manual, there is not an error for every isolated @-sign, it appears only after the first @-tags. Still, one may probably improve the error message.

(Do we really need to fail when we read @ followed by something else than a lowercase ? I think it is prudent to do so, as it leaves the door open to extending the lexical structure of @-commands in the future.)

I agree.

I remove point (1) from the modified patch.

@vicuna
Copy link
Author

vicuna commented Jul 24, 2013

Comment author: @gasche

As said in the manual, there is not an error for every isolated @-sign

I'm not sure I understand the same thing from the documentation. If
you are referring to this part:

The inside of documentation comments (**…*) consists of free-form text
with optional formatting annotations, followed by optional tags giving
more specific information about parameters, version, authors, … The
tags are distinguished by a leading @ character. Thus, a documentation
comment has the following shape:

(** The comment begins with a description, which is text formatted
according to the rules described in the next section.
The description continues until the first non-escaped '@' character.
@author Mr Smith
@param x description for parameter x
*)

Then my understanding is that the first non-escaped '@', followed by
a command or otherwise, delimits the end of the description
section. There is no mention that non-escaped '@' would be allowed,
when put alone, in the description section (and in fact the
documentation comment example below would be wrong according to this
rule, because of the explicitly mentioned '@'. It should read "until
the first non-escaped @ character." to be consistent with the
described rule.

@vicuna
Copy link
Author

vicuna commented Aug 5, 2013

Comment author: @zoggy

Improved the error message, displaying also the unexpected character.
Rev 13974 in branch 4.O1.
Rev 13975 in trunk.

@vicuna vicuna closed this as completed Aug 5, 2013
@vicuna
Copy link
Author

vicuna commented Aug 5, 2013

Comment author: @zoggy

Oups, did not see the whole discussion before. Added the specific error message for not escaped '@'.
Rev 13976 in 4.01.
Rev 13977 in trunk.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

1 participant