New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Strange type variable names in error messages (GADTs) #5780
Comments
Comment author: @gasche This is a known problem that I've discussed with Jacques and I think could and should be improved before the next version. The fundamental issue is that the OCaml syntax has no way to name the existential variables corresponding to a GADT constructor. The name generation it uses now ("ex#" + fresh integer) is particularly obnoxious, but in any case there is a type corresponding to your variable 'a in the type of the constructor A that has no name. My personal opinion is that not having explicit binding of those variables in the syntax (in the constructors: (A : 'a. ('a -> 'b) * ... -> 'b t), and in the matching (| A a (e, e') -> ...)) is a defect of the GADT syntax -- but syntax is about hard compromises, and here the more concise syntax was chosen. I believe the current behavior could be improved by:
In the meantime, this is not a bug (the behavior is perfectly correct) but a rather shortcoming on the user interface side. And it is not "sometimes", all existential variables have those unreadable names, and all escape situations generate those unreadable error messages. Thanks for your report anyway: it's helpful to see that people care about this. |
Comment author: @garrigue This is indeed a well known problem.
Unfortunately, this is only a small improvement, as origin of the name is not so clear, and many existential types do not come directly from constructors. As Gabriel mentioned, there has always been a plan to support naming of existential types in patterns. I'm open to improvements of the wording, but clever error reporting is very difficult. |
Comment author: @garrigue I have added a patch in trunk/experimental/garrigue/pattern-local-types.diffs which allows to define types in patterns.
They are different from local abstract types, as they are not abstract, but inferred from their uses as type annotations inside Here are some examples:
|
Comment author: @gasche To be clear to other potential readers of this bug item, the problem is that as there is currently no explicit binding syntax on GADT type definitions, the existential types are introduced in no particular order. If you define a constructor type K with two existential types I'm not really satisfied with this compromise, because the notations are still heavier than necessary (though arguably it helps readability to force to annotate use of the types) and we have yet another semantic distinction around type variable binding. What about the following other compromise:
I believe this gives us a reasonable transitional path to a more principle syntax, with ultimately lighter annotations at the matching site than what is possible with your suggestion. (Of course I'm no expert on the grammatical ambiguities here so one of the syntax proposed may actually not be an option. The important idea is having explicit type binding syntax at both the definition and use site of the constructors.) I'm actually not so sure that this use of the (Where are we on the |
Comment author: @garrigue Well, I was also a promoter of explicit syntax for existentials in GADT definitions, so I cannot say I don't agree with Gabriel for the first part. OK, your proposal makes sense, but it has the major drawback of requiring changes in type definitions, while we still allow the old format. As for a transition path for a more principled syntax, I would say either this is too late, or we must decide now to change the syntax and maybe break some existing programs, because otherwise it will be too late... If I would give one argument for my approach, I would say that while it can be in some situations lightly more verbose, it is in general pretty light-weight, thanks to the inference-based approach. |
And why not simply improving the error message by printing (for instance):
This would explicitly tell the user that there is some existential around. After all, my surprise essentially came from the strange format of the variable name and the fact that its abstract nature was not explicit in the message. |
Comment author: @gasche I'm not sure |
Comment author: @gasche However, I think the error messages of the form "This expression has type but an expression was expected of type ", with the same type printed at both places, are always harmful for non-experts and could be systematically skipped (or turned into a less confusing "There is a typing conflict at type ", not very informative but also prints the type). I don't know whether that would be easy to implement, though. |
Comment author: @alainfrisch A generic (?) work-around for the lack of naming facility is to introduce a local function with a locally abstract type: type t = D : 'a * ('a -> int) -> t
module type S = sig
type t
val x: t
val f: t -> int
end
let f = function
| D (arg0, arg1) ->
(fun (type b) (x : b) f ->
(module (struct type t = b let x = x let f = f end) : S)
)
arg0 arg1 What about considering that the introduced type names behave exactly like that, so that the code above could be written: let f = function
| type b. D ((x : b), f) ->
(module (struct type t = b let x = x let f = f end) : S) The syntax is the same as in Jacques' proposal, but the static semantics is different. I think it would be less confusing to the user, since locally named types currently always enforce genericity. |
Comment author: @lpw25 Since, in Alain's proposal the
is a regular locally abstract type (I think), could we have the syntax:
so that we can also allow:
because I often want to use a single-case GADT for existentials, and let feels more natural for them than match. |
Comment author: @alainfrisch
This would be a change to the current interpretation of locally abstract type. Here the "body" of the "new type" construction is not generic w.r.t. the new type b.
It is a locally abstract type "inside" the constructor (but its scope includes the right-hand side of the match clause). |
Comment author: @lpw25
I'm not sure that it is a fundamental change to the interpretation.
feels very similar to:
with the addition of some check that (type a) doesn't escape the scope of the existential. |
Comment author: @alainfrisch The type of:
has the form: This is plain polymorphic function, there is no existential involved here. This is quite different from:
which is monomorphic, and would be more similar to:
(i.e. when unpacking a module, we already get a way to refer to the existential type). |
Comment author: @lpw25
I understand that, but
is really equivalent to
the type system just isn't able to deduce this equality. But it could, if it
|
Comment author: @alainfrisch
The difference between the two is quite important; the existential is opened in one case by the function and in the other case by the caller. Since we are talking about naming the existential, this distinction is rather important. With your proposal, what would the following mean?
|
Comment author: @lpw25
That would be an error because the existential would escape its scope. It is equivalent to:
The point is
is equivalent to
under the condition that A is not in B. So it is perfectly reasonable to type
as
as long as we check that t does not escape its scope (i.e. is not mentioned in the rest of the type). This is precisely what
does. |
Comment author: @alainfrisch I still feel uncomfortable with the proposal, because there is an explicit binder for the existential (the GADT constructor), and the equivalence breaks this scope. |
Comment author: @gasche One way to understand Leo's proposal is to say that
first defines a rigid variable
works in a local context enriched with the equation Currently such GADT equations are only added by an annotation on the type of the whole GADT value. This does not allow to name existentials as they precisely don't appear in the outer GADT type. It seems this is a different mechanism from the one Jacques described for PS: I'm copying below the message from Oleg on caml-list about error messages for GADTs, which could be an interesting inspiration for useful error messages orthogonally to the naming of existentials discussed here.
|
Comment author: @garrigue I had dropped this discussion hoping that a natural solution would arise, Looking at the proposals so far, they all seem to be incoherent with the I've been discussing that with Jeremy Yallop, and we think we have a potential The syntax is nice, and the meaning seems clear.
|
Comment author: @lpw25
To me this looks too much like My preference would still be for:
|
Comment author: @garrigue As I explained in my previous answer If the similarity with 'a.'a is your problem, then maybe |
Comment author: @gasche What about (Ext (x : type a)) ? |
Comment author: @lpw25
Only if you think in terms of closed existential types. If we consider these to be open existential types (see "Modeling Abstract Types in Modules with Open Existential Types" by Montagu and Rémy) then:
is simply
So |
Comment author: @garrigue
type _ lam = ... | Abs : ('a -> 'b) lam * 'a lam -> 'b lam
Currently (type a) both introduces a new type and binds it (implicitly) at the point it is introduced. |
Comment author: @garrigue Actually, the binder of the existential is the constructor, not the variable, so However, the syntax for type annotations in patterns is devilish. That's what I'm starting to think that Gabriel's idea might end up being the cleanest. |
Comment author: @alainfrisch What about my proposal:
seen as an universal quantification on the whole branch (pattern + expression). |
Comment author: @garrigue Alain, I do not see the difference between your proposal and my original patch. The problem is that
is essentially different from
In the workaround code, you first pattern match on D, which introduces the existential variable in the environment, before it is matched against (type b). |
Comment author: @gasche Re. explicit ordering of existential variables in GADT constructor declarations, we may use either syntaxes: (1a) Prod : 'b 'c . 'b expr * 'c expr -> ('b * 'c) expr and (1b) Prod : type b c . b expr * c expr -> (b * c) expr And conversely in pattern position, (2a) Prod b c (e1, e2) -> ... or (if we decide that we really want the (type a b ...) syntax) (2b) Prod (type b c) (e1, e2) -> ... I personally like (2b) best (it would also be the least confusing for users, in the sense that someone that doesn't know about existential type would understand that the (type b c) part can be ignored in (2b), and be confused about (2a)), but for constructor declaration I have no preference between (1a) and (1b) ((1a) is the more incremental choice though), and I think we might want to allow both. |
Comment author: @alainfrisch I'm also in favor of explicit ordering in declarations. We should probably allow 1a anyway, and I'm undecided about 1b. In pattern position, 2a would be very confusing; 2b is much better. |
Comment author: @lpw25
Yes, but it seems a reasonable one, and I think it is more important to avoid adding yet another type variable syntax. The semantics seem simple enough; a newtype is either used in a (single) existential pattern which binds it, or it may escape its scope in which case it is bound when it is introduced. If someone tries to do both they get an escaping scope error, which is what they would get even if they didn't name the existential. In a way it is not even a change in semantics. If you consider
to introduce a type
to introduce a type |
Comment author: @lpw25
This syntax should be allowed anyway (same goes for val x : 'a . 'a -> 'a).
Personally I don't particularly like adding a form of explicit type argument, which is what this proposal amounts to. I also dislike giving the ordering of type variables a meaning. Does this means that
will be incompatible with
? |
Comment author: @alainfrisch
I guess so, but why would this be a problem? |
Comment author: @lpw25 Note that separating introduction from binding could also naturally support or-patterns:
|
Comment author: @lpw25
It's just unexpected. In a language without explicit type specialisation the order of type arguments should be irrelevant. |
Comment author: @gasche I agree that the ordering-meaningfulness is slightly unpleasant, but you can always annotate the value arguments of the pattern with the bound names, so that ordering mistakes are immediately caught. The fact that this annotation is not mandatory means that this syntax is in general lighter. Finally, you can also support or-patterns (but in a more verbose way): |
Comment author: @garrigue Concerning the syntax for explicit arguments, (1a) (2b) are clearly the least surprising ones, and it should always be our goal. However, I would strongly prefer writing only existential parameters. I know that it is not like Coq, but in ocaml type variables Having a look at CamlinternalFormatBasics is good to give you an Of course we must be future proof, so it is worth thinking about potential |
Comment author: @lpw25 I think that
is better than
because it is much more clear that It is also consistent with current uses of |
Comment author: @garrigue I kind of agree with you in principle. Also, at some point it might be a good thing to recognize the specificity of existential type variables. |
Comment author: @lpw25
Out of interest, why can't we propagate expected type? Even if we do not propagate the expected type, surely we only need to set Assuming we do need both annotations, I still think it is worth being explicit. In this case I think that clarity is more important than concision, especially since the syntax will probably not be used very often.
That would indeed be nice. Is there a grammar conflict, or could it be allowed? |
Comment author: @garrigue
The problem is that the semantics of locally abstract types is that they hide their original definition, rather than being equal to it. The advantage of the positional notation is that you are no longer introducing a locally abstract type hiding something, but just naming the existential variable.
To me this just sounds like saying that you should annotate all your functions with their complete types. |
Comment author: @lpw25
I can see how that would make connecting the inner and outer types during pattern matching difficult. However, I was rather suggesting connecting the inner type of
I don't think this is the same thing. Annotating your function types is about adding clarity to a particular piece of code by being specific about what type it has. I'm talking about adding consistency to a piece of syntax by having If you know about newtypes then it is immediately clear what:
means, whereas this
looks like an unused newtype declaration. |
Comment author: @yallop
The difficulty is that (x, y : t) currently means ((x, y) : t) |
Comment author: @garrigue
This amounts to changing the semantics. add_gadt_equation is only used for equations enforced by the GADT itself. This is not the case here.
It does. This is why I am so careful about the positioning of (type a). My point is that, to completely keep the same semantics, you need either to use a positional approach, or to be extremely verbose (try annotating camlinternalFormatBasics for fun...)
Until you learn that (type a) immediately after a non-constant constructor in a pattern is a binding occurrence. My proposal of making explicit only existentials in the declaration might be more controversial. My justification is that distinguishing existentials will actually help the user, even if this is superficially incoherent with other uses of the universal notation. |
Comment author: @gasche
Indeed, I think this should be done very carefully. One idea that I think would be reasonable is to allow people to reuse the parameter names without quantification: type 't expr = Note however that this means Prod's parameters would still have to be quantified: | Prod : type a b . a expr * b expr -> (a * b) expr I don't understand why we could do without introducing those explicitly. Jacques says:
but consider for example type 'a expr = I can indeed mention the types 'b and 'c in the case where the input type is restricted to the ('b * 'c) expr case: (* works *) (* also works *) (* also works, though I'm not fully sure why *) but in the general case, I cannot get hold of the 'b, 'c variables: (* works *) (* fails *) |
Comment author: @garrigue Ok, I see where the confusion comes form. type 'a expr = assuming that here the constraint is local to Prod. So 'b and 'c are existentials too. Allowing to reuse parameter names introduces strange scoping rules, so I'm not too enthusiastic about it. |
Comment author: @lpw25
I'm only suggesting it as a means of type-checking From this perspective, there is no observational difference between adding a GADT equation and making
But that's my point, it is one more thing users must know in order to read OCaml. |
Comment author: @garrigue Partial fix committed in trunk at revision 16411. Names starting with $ are existentials. Not that I didn't use #, because it is already used for object types. This does not replace a way to explicitly give names, but should be a bit clearer. |
Comment author: @alainfrisch The original problem (error message) has been fixed. The more ambitious proposal to give a way to name existential variables should be discussed in a different ticket. |
Original bug ID: 5780
Assigned to: @garrigue
Status: closed (set by @xavierleroy on 2017-02-16T14:18:35Z)
Resolution: fixed
Priority: normal
Severity: text
Version: 4.00.1
Target version: later
Fixed in version: 4.03.0+dev / +beta1
Category: typing
Related to: #7074
Child of: #5998
Monitored by: @lpw25 @yallop @hcarty
Bug description
Ocaml sometimes uses strange type variable names in error messages when type-checking GADTs (maybe to avoid name capture?).
Steps to reproduce
Enter the following code:
type _ t =
| B: bool -> bool t
| I: int -> int t
| A: ('a -> 'b) t * 'a t -> 'b t
| F: ('a -> 'b) -> ('a -> 'b) t
let rec eval: type s. s t -> s = function
| B b -> b
| I i -> i
| A (e, e') -> (eval e) e'
| F f -> f
;;
Ocaml reports the following error:
Error: This expression has type ex#0 t but an expression was expected of type
ex#0
The text was updated successfully, but these errors were encountered: