OCaml parser errors are not meaningful most of the time #5270

vicuna · 2011-05-19T15:40:02Z

Original bug ID: 5270
Reporter: nicolas_boulay
Status: acknowledged (set by @damiendoligez on 2012-06-27T13:08:03Z)
Resolution: open
Priority: normal
Severity: feature
Version: 3.12.0
Category: lexing and parsing
Child of: #5068
Monitored by: @gasche

Bug description

The OCaml parser does not give precise enough information about typos. If you forget a ';' or an 'in', compilation ends with a "syntax error" pointing to the end of the file. The code shown below is an example of code that can waste a lot of debugging time, compared to other typical languages such as Java under Eclipse.
Don't forget that humans make stupid mistakes :) Most of the time, this kind of code is debugged by commenting out code until it compiles.

There is also a problem with the type checker, which gives the position of the first inconsistency and not the "definition" place, where the bug could be. Both pieces of information are needed most of the time. To debug this, each "definition" place has to be guessed and verified.

This unfriendly behavior of the tools is a real pain for beginners.

Additional information

type df_t = {
mutable h: (val_t * string) list
mutable is_reverted: bool
}
(*
/!\ This list is built backward so that insertion is O(1). Everything then
has to be reversed before printing.
*)

let add_last (ctx:df_t) (comment:string) (x:val_t) =
assert (not ctx.is_reverted);
ctx.is_reverted <- false;
ctx.h <- (x,comment)::ctx.h

let revert_before_printing (ctx:df_t) =
assert (not ctx.is_reverted);
ctx.is_reverted <- true;
ctx.h <- List.rev ctx.h

let add_df_footer ctx =
add_last ctx Int8(0x0E) "FOOTER";
add_last ctx Int24(0x000000) "UnusedPad"

vicuna · 2011-05-20T08:05:25Z

Comment author: @ygrek

For the first problem there is a (standard?) solution: camlp4o source.ml

vicuna · 2011-05-20T12:06:28Z

Comment author: @damiendoligez

I don't get it. ocamlc reports a syntax error at the second "mutable". It's obvious that a semicolon is missing right before that keyword.
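For reference, a minimal sketch of the reporter's record with the missing separator restored. The definition of val_t is not shown in the report, so the one below is an assumption inferred from the Int8 and Int24 constructors used in add_df_footer. The constructor arguments are also parenthesized and passed in the order add_last expects.

```ocaml
(* Assumed definition: the report uses Int8 and Int24 but never defines val_t. *)
type val_t =
  | Int8 of int
  | Int24 of int

type df_t = {
  mutable h : (val_t * string) list;  (* fields are separated by ';' *)
  mutable is_reverted : bool;
}

(* The list is built backward so that insertion is O(1). *)
let add_last (ctx : df_t) (comment : string) (x : val_t) =
  assert (not ctx.is_reverted);
  ctx.h <- (x, comment) :: ctx.h

(* Constructor applications need parentheses when used as arguments. *)
let add_df_footer ctx =
  add_last ctx "FOOTER" (Int8 0x0E);
  add_last ctx "UnusedPad" (Int24 0x000000)
```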

Can you give an example where it reports a syntax error at the end of the file?

Also, for beginners my advice is to use ;; everywhere, it helps the parser figure out things.
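A small sketch of what the terminator buys you. Without ";;", a top-level expression that follows a definition is parsed as a continuation of that definition. The names below are illustrative only.

```ocaml
(* Without a terminator, the two lines
     let greeting = "hello"
     print_endline greeting
   are parsed as one definition in which the string "hello" is applied
   to print_endline and greeting as if it were a function, so the
   compiler rejects the program. *)

(* With ";;" each phrase ends where the programmer intended. *)
let greeting = "hello";;
print_endline greeting;;

(* An equivalent style that needs no ";;": bind the expression with let (). *)
let () = print_endline (greeting ^ " again")
```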

For the typing errors, it might also be an inconsistency between two uses. It's a research problem to find the best way to display type errors. But most of the time, the hard-to-understand errors are in recursive definitions, and in that case you can annotate the definitions with the expected types and get good error reports.
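As a sketch of that advice (the function is a made-up example, not taken from the report): annotating a recursive definition states the intended type up front, so a mistake in the body is checked against the annotation rather than against whatever was inferred from an earlier line.

```ocaml
(* Hypothetical example: the annotation fixes the type of sum
   before its body is checked. *)
let rec sum (l : int list) : int =
  match l with
  | [] -> 0
  | x :: rest -> x + sum rest

let () = Printf.printf "%d\n" (sum [ 1; 2; 3 ])
```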

Lastly, if you find it hard to track down the definition of a given identifier, the -annot option and the caml-types.el emacs file can help you.

vicuna · 2011-05-20T12:48:58Z

Comment author: nicolas_boulay

It's obvious only if you know that fields in a 'type' definition need a ';'. The error points to the entire following line, but the mistake is just before it.

Sorry, I can't find a way to reduce my error case where the error is reported at the end of the file. Each time, it was a missing ';' or a missing 'in'.

For the type error case, the problem arises when you use a function before its definition. If you then pass a wrong argument, the error is reported at the definition of the argument and not at the faulty call. I understand that the compiler does not know which of the two places is faulty. But why report only the place of the inconsistency instead of both places? The error message should just add the line/column of both places, not only the last one. Otherwise you have to review all the call sites of the function.

I haven't tried the OCaml plugin for Eclipse yet, but nowadays Java errors are underlined in red as you type, which gives a huge productivity boost.

When you see a line/column with an OCaml error, the error could be elsewhere and you have no hint about the kind of error. For me, it's like going back to the days of gcc 2.95, when header errors were reported in the C file where the headers were included.

I did not know about -annot, caml-types.el, or camlp4o. Is there a place where you describe such tools for tracking down tricky bugs?

vicuna · 2011-05-23T15:02:05Z

Comment author: nicolas_boulay

Here is a new example of a stupid typo on which you can lose a lot of time:

let digit_of_unsigned (u:int) =
"u"

let string_of_u8 u:int =
assert(0<=u && u<=(1 lsl 8));
let u1 = u mod (1 lsl 4) in
let u2 = u / (1 lsl 4) in
(digit_of_unsigned u2) ^ (digit_of_unsigned u1) (ocamlc: File "essai2.ml", line 8, characters 2-49: Error: This expression has type string but an expression was expected of type int)

let copy_df_t (from:df_t) (_to:df_t) =
assert (not tfrom.is_reverted);
assert (not _to.is_reverted); (File "essai.ml", line 57, characters 2-4: Error: Syntax error)
to.n <- _to.n+from.n;
to.h <- from.h @ _to.h;*)
()
;;

vicuna · 2012-04-10T16:11:05Z

Comment author: @damiendoligez

For the first example, it's not a stupid typo. If you specify that string_of_u8 returns an int, you should not return a string.
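The distinction can be sketched as follows. In "let f u:int = ...", the annotation applies to the result of f, not to u; to annotate the parameter, parenthesize it. The body of digit_of_unsigned below is an assumed hexadecimal implementation (the report only has a placeholder), and the assertion uses a strict upper bound so that both digits stay in range.

```ocaml
(* Assumed implementation; the report's version returns a placeholder string. *)
let digit_of_unsigned (u : int) : string =
  String.make 1 ("0123456789ABCDEF".[u])

(* (u : int) annotates the parameter; ": string" after it annotates the result.
   Writing "let string_of_u8 u:int = ..." would instead declare an int result. *)
let string_of_u8 (u : int) : string =
  assert (0 <= u && u < 1 lsl 8);
  let u1 = u mod (1 lsl 4) in
  let u2 = u / (1 lsl 4) in
  digit_of_unsigned u2 ^ digit_of_unsigned u1
```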

For the second example, you should use an emacs mode that colors the keywords, then the problem becomes evident.
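For readers without keyword highlighting: "to" is a reserved word in OCaml (it appears in for loops), so it cannot be used as an identifier, which is presumably what the parser trips on at characters 2-4. Below is a sketch of the function that uses the _to binding throughout. The field n and the element type of h are assumptions, since this version of df_t is not shown in the report.

```ocaml
(* Assumed record: the report does not show the fields used by copy_df_t. *)
type df_t = {
  mutable n : int;
  mutable h : string list;
  mutable is_reverted : bool;
}

let copy_df_t (from : df_t) (_to : df_t) =
  assert (not from.is_reverted);
  assert (not _to.is_reverted);
  (* _to, not to: the latter is the keyword used in "for i = 0 to n do". *)
  _to.n <- _to.n + from.n;
  _to.h <- from.h @ _to.h
```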

vicuna · 2012-04-11T15:12:04Z

Comment author: @pierreweis

As Damien said, please add a double semicolon at the end of each of your phrases. This way the parser will not run forever looking for the rest of the code.

Concerning your type error example, it seems to be clear enough

line 8, characters 2-4:
Error: This expression has type string but an expression was expected of type int

Since characters 2-4 are exactly "u", I cannot understand what better message you expect from the compiler.

Also, adding type annotations all over the place is not a very good idea in general: it is a waste of time, and further data type modifications become cumbersome since you need to modify these annotations.

For your information:

"-annot" is a compiler option to (automatically) annotate your program with types. This way you can read the types from the Emacs editor with a simple key binding (C-c C-t by default); this is of utmost help when debugging type errors.
caml-types.el is an Emacs companion package that displays these type annotations in Emacs. You should add it to your .emacs init file.
camlp4 is a preprocessor/parser/pretty-printer for Caml. If you use the option -pp camlp4o you get somewhat more relevant syntax error messages. You could use for instance

ocamlc -pp camlp4o file.ml

vicuna · 2012-04-11T15:55:53Z

Comment author: nicolas_boulay

Since May 2011, I have made some progress in OCaml.

Stupid typos are always where you just edited your code, whatever the compiler says. So you need to compile the code often.

From my point of view, a type system is a way to declare things twice, and the compiler checks the consistency between the two. An error is a contradiction between the two. For a crystal-clear error message, something like the following would be great: (ocamlc: File "essai2.ml", line 8, characters 2-49: Error: This expression has type string but an expression was expected of type int, file "essai2.ml", line 4, characters 20-23)

I understand that the compiler can't be magic, but it's always strange to get a reference into the source file that is not where the error is. If you give both locations, the error will be between the two.

I have used the trick of putting ";;" everywhere, but some people said that this token is deprecated, so keeping it did not seem like a good idea.

When I wrote this feedback, it was to help you understand why beginners find OCaml code "hard to compile".

If you need many tools to debug OCaml code effectively, a dedicated manual should be written.

Nowadays, C compilers suggest fixes for the most common errors.

vicuna · 2012-04-12T07:33:30Z

Comment author: @pierreweis

The end of phrase marker ";;" is not deprecated.

It is mandatory for the interactive system and optional for source files.
It is convenient not only to restrict syntax errors but also to write code that can easily be cut and pasted into the interactive system for rapid testing and modification.

In short: feel free to keep it if you want, it never harms and can be useful sometimes.

vicuna · 2012-04-12T07:48:20Z

Comment author: @pierreweis

The OCaml type system was designed and developed to support programs with no type annotations at all. This is really useful and convenient: you may concentrate on the algorithms and data manipulated instead of thinking about the type of variables and expressions.

Many OCaml programmers never add type annotations to their programs, except in places where it is mandatory, namely in data type definitions and module interface declarations. Elsewhere, just let the compiler handle types.

In short: feel free to annotate your programs with the amount of type information you are comfortable with, knowing that it is not mandatory.
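A short sketch of what this looks like in practice. The functions are illustrative, and the types in the comments are what the toplevel reports for them.

```ocaml
(* No annotations: the compiler infers the most general types. *)
let compose f g x = f (g x)
(* val compose : ('a -> 'b) -> ('c -> 'a) -> 'c -> 'b *)

let total_length strings =
  List.fold_left (fun acc s -> acc + String.length s) 0 strings
(* val total_length : string list -> int *)

let () =
  let shout = compose String.uppercase_ascii String.trim in
  Printf.printf "%s %d\n" (shout "  ok ") (total_length [ "ab"; "cde" ])
```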

vicuna · 2012-04-12T07:57:25Z

Comment author: nicolas_boulay

I often use annotations to document the code when the name of a parameter could be confusing. This can also be a problem in .mli files, where only types are shown (I don't like comments that can become obsolete without any compiler warning :). Maybe labels could help.
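A minimal sketch of the labels idea: labeled arguments show up in the inferred type (and therefore in the .mli), so two arguments of the same type cannot be swapped silently. The function and label names are made up for illustration.

```ocaml
(* val describe : name:string -> comment:string -> string
   Both arguments are strings, but the labels keep them apart. *)
let describe ~name ~comment = name ^ " (" ^ comment ^ ")"

(* Labeled arguments can be passed in any order. *)
let () = print_endline (describe ~comment:"padding" ~name:"UnusedPad")
```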

For information on a new way to write code and algorithms, I think the following presentation is very interesting. Look at it starting at 17'30.

It's like merging the interpreter and the editor to show the effect of the code on some example.

https://www.youtube.com/watch?v=PUv66718DII at 17 min 30

vicuna · 2012-04-12T08:20:56Z

Comment author: @pierreweis

Last note: you're right about comments that could become obsolete with no compiler warnings. Mutatis mutandis, that's exactly why type annotations can be harmful in your programs: they can become obsolete because you renamed a type or changed its definition. But then the compiler will not transparently modify the type names in the annotations you wrote: you will have to maintain these useless type annotations. Not a big deal, but a real burden and a useless waste of time!

vicuna · 2012-06-27T13:08:03Z

Comment author: @damiendoligez

For type errors, it's not just a question of reporting "both" places, because a type error is an inconsistency that can involve an arbitrary number of pieces of the program. So it's really not easy to find an algorithm that gives good error messages.

vicuna · 2012-06-27T13:47:26Z

Comment author: nicolas_boulay

A type annotation cannot be silently false; being annoyed by the compiler is much less of a problem than having false comments (besides the fact that an OCaml type declaration is visually heavier than a C type declaration).

You could have many places with conflicting types, for sure. But ocamlc stops at the first mismatch, giving the expected type and the current type. You give the position of the current type; why not also give the position seen for the expected type?

From my external point of view, it looks like keeping a Lexing.position beside each type and printing this value when reporting the error.
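To make the proposal concrete, here is a toy sketch of a type paired with the position where it was introduced. This is not how the OCaml type checker is structured: the ty and located types and the mismatch_message function are hypothetical, and only Lexing.position and its fields come from the standard library.

```ocaml
(* Toy model: a type together with the source position it came from. *)
type ty = TInt | TString

type located = { ty : ty; origin : Lexing.position }

let string_of_ty = function TInt -> "int" | TString -> "string"

(* The column is the offset of the position from the beginning of its line. *)
let string_of_pos (p : Lexing.position) =
  Printf.sprintf "file \"%s\", line %d, character %d" p.Lexing.pos_fname
    p.Lexing.pos_lnum
    (p.Lexing.pos_cnum - p.Lexing.pos_bol)

(* Report both the offending expression and the origin of the expected type. *)
let mismatch_message ~(actual : located) ~(expected : located) =
  Printf.sprintf
    "%s: this expression has type %s but an expression was expected of type %s (from %s)"
    (string_of_pos actual.origin) (string_of_ty actual.ty)
    (string_of_ty expected.ty) (string_of_pos expected.origin)

let () =
  (* Hypothetical positions, loosely modeled on the essai2.ml example. *)
  let pos line col =
    { Lexing.pos_fname = "essai2.ml"; pos_lnum = line; pos_bol = 0; pos_cnum = col }
  in
  print_endline
    (mismatch_message
       ~actual:{ ty = TString; origin = pos 8 2 }
       ~expected:{ ty = TInt; origin = pos 4 20 })
```

As the replies in this thread point out, the hard part is not printing two positions but deciding which single origin, if any, the expected type has.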

vicuna · 2012-06-27T23:06:30Z

Comment author: @garrigue

You could have many places with conflicting types, for sure. But ocamlc stops at the first mismatch, giving the expected type and the current type. You give the position of the current type; why not also give the position seen for the expected type?

Interesting idea.
This could probably be done to some extent, when we know where the expected type comes from.
But as Damien pointed out, in many situations the expected type is actually synthesized by type inference from multiple sources, and you cannot give a single location.
Worse, it is actually difficult to detect whether you are in this situation or not (except when a complete type annotation was given).
A stronger approach, like adding location information to every type node, would require rethinking completely the .cmi format, among other things.

I'll try to think about it.
By the way, this has become completely unrelated to parsing :-)

vicuna · 2012-06-28T07:31:04Z

Comment author: nicolas_boulay

Beginners always begin with simple features. My OCaml code consists of simple modules without functors or objects. Most of the real types can be deduced from the .mli file. Reporting a place where you think the type is deduced will cover most stupid mistakes. Don't try to cover 100% of the cases; 80% is already a big step.

For example, in a match expression, the first "|" clause defines the expected output type; if the second "|" clause has a different type, the error message points to the second case, but the error could be in the first.
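A sketch of that situation, with a made-up function. In the commented-out variant the first clause has the wrong type, yet the mismatch is reported on the second clause, because the first one fixed the result type. Stating the result type in an annotation is one way to have each clause checked against the intended type instead.

```ocaml
(* Variant with the mistake in the first clause (does not compile):
     let describe n =
       match n with
       | 0 -> 0                    (* the actual mistake *)
       | _ -> string_of_int n      (* where the error is reported *)
*)

(* With the result type stated, a clause returning an int would be
   checked against string directly. *)
let describe (n : int) : string =
  match n with
  | 0 -> "zero"
  | _ -> string_of_int n

let () = print_endline (describe 0)
```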

github-actions · 2020-05-15T04:21:03Z

This issue has been open one year with no activity. Consequently, it is being marked with the "stale" label. What this means is that the issue will be automatically closed in 30 days unless more comments are added or the "stale" label is removed. Comments that provide new information on the issue are especially welcome: is it still reproducible? did it appear in other contexts? how critical is it? etc.

vicuna added the lexing-and-parsing label Mar 14, 2019

vicuna mentioned this issue Mar 7, 2017

ocamlc/camlp4 should give better error messages for syntax errors #5068

Closed

vicuna added the feature-wish label Mar 20, 2019

github-actions bot added the Stale label May 15, 2020

github-actions bot closed this as completed Jun 24, 2020
