|Anonymous | Login | Signup for a new account||2013-05-26 00:06 CEST|
|Main | My View | View Issues | Change Log | Roadmap|
|View Issue Details|
|ID||Project||Category||View Status||Date Submitted||Last Update|
|0005780||OCaml||OCaml typing||public||2012-10-09 18:07||2013-04-23 02:44|
|Target Version||4.01.0+dev||Fixed in Version|
|Summary||0005780: Strange type variable names in error messages (GADTs)|
|Description||Ocaml sometimes uses strange type variable names in error messages when type-checking GADTs (maybe to avoid name capture?).|
|Steps To Reproduce||Enter the following code:|
type _ t =
| B: bool -> bool t
| I: int -> int t
| A: ('a -> 'b) t * 'a t -> 'b t
| F: ('a -> 'b) -> ('a -> 'b) t
let rec eval: type s. s t -> s = function
| B b -> b
| I i -> i
| A (e, e') -> (eval e) e'
| F f -> f
Ocaml reports the following error:
| A (e, e') -> (eval e) e'
Error: This expression has type ex#0 t but an expression was expected of type
|Tags||No tags attached.|
This is a known problem that I've discussed with Jacques and I think could and should be improved before the next version.
The fundamental issue is that the OCaml syntax has no way to *name* the existential variables corresponding to a GADT constructor. The name generation it uses now ("ex#" + fresh integer) is particularly obnoxious, but in any case there is a type corresponding to your variable 'a in the type of the constructor A that has no name.
My personal opinion is that not having explicit binding of those variables in the syntax (in the constructors: (A : 'a. ('a -> 'b) * ... -> 'b t), and in the matching (| A a (e, e') -> ...)) is a defect of the GADT syntax -- but syntax is about hard compromises, and here the more concise syntax was chosen.
I believe the current behavior could be improved by:
- choosing names for existential that are more natural; I would like them to include either the constructor they come from, or the original user-chosen name of the corresponding type variable in the type definition, or both
- improving the wording of the error message to avoid the annoying "has type 'foo' but was expected at type 'foo'" and possible display something GADT-specific
In the meantime, this is not a bug (the behavior is perfectly correct) but a rather shortcoming on the user interface side. And it is not "sometimes", all existential variables have those unreadable names, and all escape situations generate those unreadable error messages.
Thanks for your report anyway: it's helpful to see that people care about this.
This is indeed a well known problem.
Actually part of it was a bug, as there was already some code to reuse variable names from the definition, but it didn't work properly.
This part is now fixed in trunk, at revision 12998.
The new error message is:
Error: This expression has type a#0 t but an expression was expected of type
Unfortunately, this is only a small improvement, as origin of the name is not so clear, and many existential types do not come directly from constructors.
As for the "#0" part, this is indeed to avoid name collisions.
As Gabriel mentioned, there has always been a plan to support naming of existential types in patterns.
However this requires new syntax, and this was delayed waiting for more experience.
This should be in the next release.
I'm open to improvements of the wording, but clever error reporting is very difficult.
I have added a patch in trunk/experimental/garrigue/pattern-local-types.diffs which allows to define types in patterns.
They are only allowed at the beginning of match cases, and the syntax is:
| type a b c. <pattern> -> <body>
They are _different_ from local abstract types, as they are not abstract, but inferred from their uses as type annotations inside <pattern>. If the inferred type is a fresh existential type, then some special manipulation is done to rename it to the given name, so that it will appear in error messages.
Here are some examples:
(* alias *)
function type a. (() : a) -> ( : a list);;
(function type a. (() : a) -> ( : a list)) ();;
function type a. () -> ();; (* fail *)
(* existential *)
type t = D : 'a * ('a -> int) -> t;;
let f = function type b. D ((x:b), f) -> (f:b->int) x;;
let f = function type b. D ((x:b), f) -> (f:t->int) x;; (* fail *)
edited on: 2012-10-12 10:21
To be clear to other potential readers of this bug item, the problem is that as there is currently no explicit binding syntax on GADT type *definitions*, the existential types are introduced in no particular order. If you define a constructor type K with two existential types 'a and 'b, we cannot use, say, the syntax (K (type a) (type b) (x, f) -> ...) because there is no notion of "first" and "second" existential type. So Jacques' choice here is to allow to introduce names for some types, that are later unified with the corresponding existential variable by additional annotations on the pattern arguments (in his example, (x:b) is necessary to link the fresh name b to the actual existential variable 'a of D).
I'm not really satisfied with this compromise, because the notations are still heavier than necessary (though arguably it helps readability to force to annotate use of the types) and we have yet another semantic distinction around type variable binding.
What about the following other compromise:
- allow the optional use of a type binding syntax at the GADT declaration site
type t = D : 'a . 'a * ('a -> int) -> t
(I'm not hung up on this particular syntax, for example it would be ok to have "type a . a * (a -> int) -> t" instead)
- use the "(type a)" syntax to locally introduce the type after the constructor name and before the value parameters, using the order in the definition
function D (type a) (x, f) -> f x
- for GADTs that have not used an explicit order in their definition, reject this syntax (or possibly use the left-to-right order of occurence of the variables, but this would be a hack)
I believe this gives us a reasonable transitional path to a more principle syntax, with ultimately lighter annotations at the matching site than what is possible with your suggestion.
(Of course I'm no expert on the grammatical ambiguities here so one of the syntax proposed may actually not be an option. The important idea is having explicit type binding syntax at both the definition and use site of the constructors.)
I'm actually not so sure that this use of the "(type a)" syntax is consistent with its current use (which is important: I don't want to introduce new semantic notion). Is it the case? Or should we prefer another syntax for existential types in GADT patterns?
(Where are we on the "(type a b)" sugar? Is it currently implemented?)
Well, I was also a promoter of explicit syntax for existentials in GADT definitions, so I cannot say I don't agree with Gabriel for the first part.
OK, your proposal makes sense, but it has the major drawback of requiring changes in type definitions, while we still allow the old format.
So if say, you are using a library with implicit existentials, you would have no way to use them.
Well, maybe using left-to-right order is okay, but we must be very careful about avoiding bugs (the internal representation must keep the syntactic order, which is not usually true for objects and polymorphic variants for instance.)
Also there is the question of how to handle row variables.
I think your use of (type a) is consistent (you do not really abstract an external type, since the existential is already abstract, and cannot leak anyway, but otherwise this is the same thing), so at least this part is okay.
There is currently not (type a b) sugar, but this would cost nothing.
As for a transition path for a more principled syntax, I would say either this is too late, or we must decide now to change the syntax and maybe break some existing programs, because otherwise it will be too late...
If I would give one argument for my approach, I would say that while it can be in some situations lightly more verbose, it is in general pretty light-weight, thanks to the inference-based approach.
Sebastien Furic (reporter)
And why not simply improving the error message by printing (for instance):
Error: This expression has type (type a) a t but an expression was expected of type
(type a) a
This would explicitly tell the user that there is some existential around. After all, my surprise essentially came from the strange format of the variable name and the fact that its abstract nature was not explicit in the message.
|2012-10-09 18:07||Sebastien Furic||New Issue|
|2012-10-10 10:32||gasche||Note Added: 0008232|
|2012-10-10 10:33||gasche||Severity||minor => text|
|2012-10-10 10:33||gasche||Status||new => acknowledged|
|2012-10-10 11:46||garrigue||Note Added: 0008237|
|2012-10-10 11:46||garrigue||Assigned To||=> garrigue|
|2012-10-10 11:46||garrigue||Status||acknowledged => confirmed|
|2012-10-12 03:17||garrigue||Note Added: 0008247|
|2012-10-12 10:19||gasche||Note Added: 0008249|
|2012-10-12 10:21||gasche||Note Edited: 0008249||View Revisions|
|2012-10-12 10:56||garrigue||Note Added: 0008250|
|2012-10-16 11:37||gasche||Target Version||=> 4.01.0+dev|
|2012-10-16 11:50||Sebastien Furic||Note Added: 0008262|
|2013-04-23 02:44||garrigue||Relationship added||child of 0005998|
|Copyright © 2000 - 2011 MantisBT Group|