Anonymous | Login | Signup for a new account | 2015-12-01 15:43 CET |

Main | My View | View Issues | Change Log | Roadmap |

View Issue Details [ Jump to Notes ] | [ Issue History ] [ Print ] | ||||||||||

ID | Project | Category | View Status | Date Submitted | Last Update | ||||||

0005780 | OCaml | OCaml typing | public | 2012-10-09 18:07 | 2015-09-09 10:59 | ||||||

Reporter | Sebastien Furic | ||||||||||

Assigned To | garrigue | ||||||||||

Priority | normal | Severity | text | Reproducibility | always | ||||||

Status | assigned | Resolution | open | ||||||||

Platform | OS | OS Version | |||||||||

Product Version | 4.00.1 | ||||||||||

Target Version | 4.03.0+dev | Fixed in Version | |||||||||

Summary | 0005780: Strange type variable names in error messages (GADTs) | ||||||||||

Description | Ocaml sometimes uses strange type variable names in error messages when type-checking GADTs (maybe to avoid name capture?). | ||||||||||

Steps To Reproduce | Enter the following code: type _ t = | B: bool -> bool t | I: int -> int t | A: ('a -> 'b) t * 'a t -> 'b t | F: ('a -> 'b) -> ('a -> 'b) t let rec eval: type s. s t -> s = function | B b -> b | I i -> i | A (e, e') -> (eval e) e' | F f -> f ;; Ocaml reports the following error: Characters 94-96: | A (e, e') -> (eval e) e' ^^ Error: This expression has type ex#0 t but an expression was expected of type ex#0 | ||||||||||

Tags | No tags attached. | ||||||||||

Attached Files | |||||||||||

Notes | |

(0008232) gasche (developer) 2012-10-10 10:32 |
This is a known problem that I've discussed with Jacques and I think could and should be improved before the next version. The fundamental issue is that the OCaml syntax has no way to *name* the existential variables corresponding to a GADT constructor. The name generation it uses now ("ex#" + fresh integer) is particularly obnoxious, but in any case there is a type corresponding to your variable 'a in the type of the constructor A that has no name. My personal opinion is that not having explicit binding of those variables in the syntax (in the constructors: (A : 'a. ('a -> 'b) * ... -> 'b t), and in the matching (| A a (e, e') -> ...)) is a defect of the GADT syntax -- but syntax is about hard compromises, and here the more concise syntax was chosen. I believe the current behavior could be improved by: - choosing names for existential that are more natural; I would like them to include either the constructor they come from, or the original user-chosen name of the corresponding type variable in the type definition, or both - improving the wording of the error message to avoid the annoying "has type 'foo' but was expected at type 'foo'" and possible display something GADT-specific In the meantime, this is not a bug (the behavior is perfectly correct) but a rather shortcoming on the user interface side. And it is not "sometimes", all existential variables have those unreadable names, and all escape situations generate those unreadable error messages. Thanks for your report anyway: it's helpful to see that people care about this. |

(0008237) garrigue (manager) 2012-10-10 11:46 |
This is indeed a well known problem. Actually part of it was a bug, as there was already some code to reuse variable names from the definition, but it didn't work properly. This part is now fixed in trunk, at revision 12998. The new error message is: Error: This expression has type a#0 t but an expression was expected of type a#0 Unfortunately, this is only a small improvement, as origin of the name is not so clear, and many existential types do not come directly from constructors. As for the "#0" part, this is indeed to avoid name collisions. As Gabriel mentioned, there has always been a plan to support naming of existential types in patterns. However this requires new syntax, and this was delayed waiting for more experience. This should be in the next release. I'm open to improvements of the wording, but clever error reporting is very difficult. |

(0008247) garrigue (manager) 2012-10-12 03:17 |
I have added a patch in trunk/experimental/garrigue/pattern-local-types.diffs which allows to define types in patterns. They are only allowed at the beginning of match cases, and the syntax is: | type a b c. <pattern> -> <body> They are _different_ from local abstract types, as they are not abstract, but inferred from their uses as type annotations inside <pattern>. If the inferred type is a fresh existential type, then some special manipulation is done to rename it to the given name, so that it will appear in error messages. Here are some examples: (* alias *) function type a. (() : a) -> ([] : a list);; (function type a. (() : a) -> ([] : a list)) ();; function type a. () -> ();; (* fail *) (* existential *) type t = D : 'a * ('a -> int) -> t;; let f = function type b. D ((x:b), f) -> (f:b->int) x;; let f = function type b. D ((x:b), f) -> (f:t->int) x;; (* fail *) |

(0008249) gasche (developer) 2012-10-12 10:19 edited on: 2012-10-12 10:21 |
To be clear to other potential readers of this bug item, the problem is that as there is currently no explicit binding syntax on GADT type *definitions*, the existential types are introduced in no particular order. If you define a constructor type K with two existential types 'a and 'b, we cannot use, say, the syntax (K (type a) (type b) (x, f) -> ...) because there is no notion of "first" and "second" existential type. So Jacques' choice here is to allow to introduce names for some types, that are later unified with the corresponding existential variable by additional annotations on the pattern arguments (in his example, (x:b) is necessary to link the fresh name b to the actual existential variable 'a of D). I'm not really satisfied with this compromise, because the notations are still heavier than necessary (though arguably it helps readability to force to annotate use of the types) and we have yet another semantic distinction around type variable binding. What about the following other compromise: - allow the optional use of a type binding syntax at the GADT declaration site type t = D : 'a . 'a * ('a -> int) -> t (I'm not hung up on this particular syntax, for example it would be ok to have "type a . a * (a -> int) -> t" instead) - use the "(type a)" syntax to locally introduce the type after the constructor name and before the value parameters, using the order in the definition function D (type a) (x, f) -> f x - for GADTs that have not used an explicit order in their definition, reject this syntax (or possibly use the left-to-right order of occurence of the variables, but this would be a hack) I believe this gives us a reasonable transitional path to a more principle syntax, with ultimately lighter annotations at the matching site than what is possible with your suggestion. (Of course I'm no expert on the grammatical ambiguities here so one of the syntax proposed may actually not be an option. The important idea is having explicit type binding syntax at both the definition and use site of the constructors.) I'm actually not so sure that this use of the "(type a)" syntax is consistent with its current use (which is important: I don't want to introduce new semantic notion). Is it the case? Or should we prefer another syntax for existential types in GADT patterns? (Where are we on the "(type a b)" sugar? Is it currently implemented?) |

(0008250) garrigue (manager) 2012-10-12 10:56 |
Well, I was also a promoter of explicit syntax for existentials in GADT definitions, so I cannot say I don't agree with Gabriel for the first part. OK, your proposal makes sense, but it has the major drawback of requiring changes in type definitions, while we still allow the old format. So if say, you are using a library with implicit existentials, you would have no way to use them. Well, maybe using left-to-right order is okay, but we must be very careful about avoiding bugs (the internal representation must keep the syntactic order, which is not usually true for objects and polymorphic variants for instance.) Also there is the question of how to handle row variables. I think your use of (type a) is consistent (you do not really abstract an external type, since the existential is already abstract, and cannot leak anyway, but otherwise this is the same thing), so at least this part is okay. There is currently not (type a b) sugar, but this would cost nothing. As for a transition path for a more principled syntax, I would say either this is too late, or we must decide now to change the syntax and maybe break some existing programs, because otherwise it will be too late... If I would give one argument for my approach, I would say that while it can be in some situations lightly more verbose, it is in general pretty light-weight, thanks to the inference-based approach. |

(0008262) Sebastien Furic (reporter) 2012-10-16 11:50 |
And why not simply improving the error message by printing (for instance): Error: This expression has type (type a) a t but an expression was expected of type (type a) a This would explicitly tell the user that there is some existential around. After all, my surprise essentially came from the strange format of the variable name and the fact that its abstract nature was not explicit in the message. |

(0009511) gasche (developer) 2013-06-16 16:27 |
I'm not sure "(type a) a" more clearly indicates an existential type than "ex#23", and besides I'm not sure it's meaningful to close on all existentials (that may also be involved in some other part of the context) when printing the type. |

(0009512) gasche (developer) 2013-06-16 16:29 |
However, I think the error messages of the form "This expression has type <foo> but an expression was expected of type <foo>", with the same type printed at both places, are always harmful for non-experts and could be systematically skipped (or turned into a less confusing "There is a typing conflict at type <foo>", not very informative but also prints the type). I don't know whether that would be easy to implement, though. |

(0009698) frisch (developer) 2013-07-05 14:17 |
A generic (?) work-around for the lack of naming facility is to introduce a local function with a locally abstract type: ======================================================== type t = D : 'a * ('a -> int) -> t module type S = sig type t val x: t val f: t -> int end let f = function | D (arg0, arg1) -> (fun (type b) (x : b) f -> (module (struct type t = b let x = x let f = f end) : S) ) arg0 arg1 ======================================================== What about considering that the introduced type names behave exactly like that, so that the code above could be written: ======================================================== let f = function | type b. D ((x : b), f) -> (module (struct type t = b let x = x let f = f end) : S) ======================================================== The syntax is the same as in Jacques' proposal, but the static semantics is different. I think it would be less confusing to the user, since locally named types currently always enforce genericity. |

(0009700) lpw25 (developer) 2013-07-05 15:45 |
Since, in Alain's proposal the b in > | type b. D ((x : b), f) -> is a regular locally abstract type (I think), could we have the syntax: | (type b) D ((x : b), f) -> so that we can also allow: let f (type b) (D ((x: b), f)) = ... because I often want to use a single-case GADT for existentials, and let feels more natural for them than match. |

(0009701) frisch (developer) 2013-07-05 16:51 |
> let f (type b) (D ((x: b), f)) = ... This would be a change to the current interpretation of locally abstract type. Here the "body" of the "new type" construction is not generic w.r.t. the new type b. > | type b. D ((x : b), f) -> > is a regular locally abstract type (I think) It is a locally abstract type "inside" the constructor (but its scope includes the right-hand side of the match clause). |

(0009703) lpw25 (developer) 2013-07-05 17:35 |
> This would be a change to the current interpretation of locally abstract type. I'm not sure that it is a fundamental change to the interpretation. type ext = Ext : 'a -> ext let f (type a) (Ext(x : a)) = Ext([x] : a list) feels very similar to: module type S = sig type t val x : t end;; let f (type a) (module X : S with type t = a) = (module struct type t = a list let x = [X.x] end : S) with the addition of some check that (type a) doesn't escape the scope of the existential. |

(0009704) frisch (developer) 2013-07-05 17:53 |
The type of: let f (type a) (module X : S with type t = a) = ... has the form: (module S with type t = 'a) -> ... This is plain polymorphic function, there is no existential involved here. This is quite different from: let f (type a) (Ext(x : a)) = ... which is monomorphic, and would be more similar to: let f (module X : S) = .... X.t ... (i.e. when unpacking a module, we already get a way to refer to the existential type). |

(0009706) lpw25 (developer) 2013-07-05 18:14 |
> The type of: > > let f (type a) (module X : S with type t = a) = ... > > has the form: (module S with type t = 'a) -> ... I understand that, but 'a. (module S with type t = 'a) -> (module S) is really equivalent to (module S) -> (module S) the type system just isn't able to deduce this equality. But it could, if it checked whether 'a appeared elsewhere in the type and then safely removed it. This check is very similar to checking that the existential doesn't escape in: let f (type a) (Ext(x : a)) = ... |

(0009707) frisch (developer) 2013-07-05 18:48 |
> I understand that, but > 'a. (module S with type t = 'a) -> (module S) > is really equivalent to > (module S) -> (module S) The difference between the two is quite important; the existential is opened in one case by the function and in the other case by the caller. Since we are talking about naming the existential, this distinction is rather important. With your proposal, what would the following mean? fun (type a) (z : a) (Ext(x : a)) -> ... |

(0009708) lpw25 (developer) 2013-07-05 21:18 |
> With your proposal, what would the following mean? > > fun (type a) (z : a) (Ext(x : a)) -> ... That would be an error because the existential would escape its scope. It is equivalent to: # let f z (Ext x) = ignore [z; x]; true;; Error: This expression has type ex#0 but an expression was expected of type ex#0 The type constructor ex#0 would escape its scope The point is forall A. A -> B is equivalent to (exists A. A) -> B under the condition that A is not in B. So it is perfectly reasonable to type fun (exists t. (x : t)) -> ... as forall t. (fun (x : t) -> ... ) as long as we check that t does not escape its scope (i.e. is not mentioned in the rest of the type). This is precisely what let f (type a) (Ext(x : a)) = ... does. |

(0009725) frisch (developer) 2013-07-09 10:01 |
I still feel uncomfortable with the proposal, because there is an explicit binder for the existential (the GADT constructor), and the equivalence breaks this scope. |

(0009731) gasche (developer) 2013-07-09 15:07 |
One way to understand Leo's proposal is to say that fun (type a) (Ext(x : a)) -> ... first defines a rigid variable (a), and that the patterns (Ext(x : a)) first introduces an existential variable ex#N, and then upon processing the pattern (x : a) adds the local equation (a = ex#N) in the branch of the pattern clause. We are not naming (a) the actual existential variable, but working in a context where both are equivalent, just as let cast (type a) (Refl : (a, int) eq) = fun (x : a) -> (x : int) works in a local context enriched with the equation (a = int). Currently such GADT equations are only added by an annotation on the type of the whole GADT value. This does not allow to name existentials as they precisely don't appear in the outer GADT type. It seems this is a different mechanism from the one Jacques described for (type a . Ext (x : a)). Similar subtleties/hacks would probably be needed to get error messages to mention (a) rather than the existential names. PS: I'm copying below the message from Oleg on caml-list about error messages for GADTs, which could be an interesting inspiration for useful error messages orthogonally to the naming of existentials discussed here. > Just in case, here is the error message produced by GHC for the > equivalent example. > > Code: > > {-# LANGUAGE GADTs #-} > > data E a where > B :: Bool -> E Bool > A :: E (a -> b) -> E a -> E b > > eval :: E a -> a > eval (B x) = x > eval (A f x) = (eval f) x > > The error message: > > /tmp/ga.hs:9:25: > Couldn't match type `a1' with `E a1' > `a1' is a rigid type variable bound by > a pattern with constructor > A :: forall b a. E (a -> b) -> E a -> E b, > in an equation for `eval' > at /tmp/ga.hs:9:7 > In the second argument of `eval', namely `x' > In the expression: (eval f) x > In an equation for `eval': eval (A f x) = (eval f) x > > The gist is using the type variable name 'a' from the definition of > GADT, and attaching the numeric suffix to disambiguate. (GHC does that all > the time -- and I believe OCaml does something similar in many cases, > only not for GADTs.) What helps in particular I believe is quoting the > clause from the GADT declaration, so we can see where the name 'a' > comes from. So, perhaps if the error message can be improved with > giving more information about the error, the problem will be solved > without additional syntax? |

(0011646) garrigue (manager) 2014-06-04 08:50 |
I had dropped this discussion hoping that a natural solution would arise, but unfortunately a feature that was expected to be in 4.01 will not even be in 4.02. Looking at the proposals so far, they all seem to be incoherent with the current constructs (using a type construction in a pattern context, missing notion of parameter, or incompatible scoping rules). I've been discussing that with Jeremy Yallop, and we think we have a potential solution: (Ext (x : type a. a)) with the scope of "type a" being everything to its right. The syntax is nice, and the meaning seems clear. And since we are on the right of x, there is no longer the problem of rebinding the type. There are however a few difficulties 1) originally, in let f : type a. typ = expr1 in expr2 the scope of "type a" is expr1, whereas the pattern-matching equivalent of the above would be match expr1 with (f : type a. typ) -> expr2 which gives a different scope. Note however that the meaning of "let f : typ = expr" is "let f = (expr : typ)", not "let (f : typ) = expr", so there is no formal ambiguity. 2) The original "type a. typ" is defined as syntactic sugar, and forces the pattern to have a really polymorphic type. Here this is not the case since we want to match an existential type. This could be solved by rather using the syntax (Ext (x (type a) : a) This is ok too, since (type a) is to the right of x. However, this is less readable, and this does not solve point (1), i.e. there is the same potential confusion with "let f (type a) = expr1 in expr2". 3) There is some interaction between these abstract types appearing inside patterns and type propagation. Consider this pattern: (... (Ext (x : type a. a)) ... : a t) Independently of whether it will be typable or not (in many cases it wouldn't, since the type would leak), the scoping rules say that the "a" in "a t" is the type of x. For this reason, we need an extra traversal, to check which names are, and if a type annotation contains names bound in a subpattern, we must delay building the type to after typing the subpattern. |

(0011655) lpw25 (developer) 2014-06-04 20:07 |
> (Ext (x : type a. a)) To me this looks too much like `x : 'a. 'a` -- i.e. a completely universal type. My preference would still be for: (type a) (Ext (x : a)) |

(0011656) garrigue (manager) 2014-06-04 21:26 |
As I explained in my previous answer (type a) (Ext (x : a)) is incorrect from a scoping point of view: it tries to bind an existential that does not exist yet. If the similarity with 'a.'a is your problem, then maybe (Ext (x (type a) : a)) is better. |

(0011662) gasche (developer) 2014-06-04 22:06 |
What about (Ext (x : type a)) ? |

(0011664) lpw25 (developer) 2014-06-04 23:16 edited on: 2014-06-04 23:21 |
> As I explained in my previous answer > (type a) (Ext (x : a)) > is incorrect from a scoping point of view: it tries to bind an > existential that does not exist yet. Only if you think in terms of closed existential types. If we consider these to be open existential types (see "Modeling Abstract Types in Modules with Open Existential Types" by Montagu and Rémy) then: let (type a) (Ext (x : a)) = ext in ... is simply ν a. let (Ext y) = ext in let x = open(a) y in ... So `(type a)` is reduction and `(Ext (x : a))` is opening in the terminology of Montagu et al. |

(0011666) garrigue (manager) 2014-06-05 00:44 |
> What about (Ext (x : type a)) ? type _ lam = ... | Abs : ('a -> 'b) lam * 'a lam -> 'b lam > ? a. let (Ext y) = ext in let x = open(a) y in ... > So `(type a)` is reduction and `(Ext (x : a))` is opening in the terminology of Montagu et al. Currently (type a) both introduces a new type and binds it (implicitly) at the point it is introduced. So this would clearly be a change in semantics. |

(0011667) garrigue (manager) 2014-06-05 03:33 |
Actually, the binder of the existential is the constructor, not the variable, so Ext((type a) x : a) would be ok too. However, the syntax for type annotations in patterns is devilish. That's what you would have to write for Abs: Abs (((type a) f : a -> _), x) Ugly isn't it? I'm starting to think that Gabriel's idea might end up being the cleanest. Namely, the main drawback was when there are several existential variables. But in most cases, there is only one: Abs (type a) (f,x) is non-ambiguous. When there are several variables, as suggested before, we can use a syntax to give explicitly the existentials in the declaration. Also, contrary to what I though first, this approach makes it easier to handle private rows, without complex hacks involved. |

(0011669) frisch (developer) 2014-06-05 07:00 |
What about my proposal: ======================================================== let f = function | type b. D ((x : b), f) -> (module (struct type t = b let x = x let f = f end) : S) ======================================================== seen as an universal quantification on the whole branch (pattern + expression). |

(0011670) garrigue (manager) 2014-06-05 07:20 |
Alain, I do not see the difference between your proposal and my original patch. The problem is that let f = function | type b. D ((x : b), f) -> (module (struct type t = b let x = x let f = f end) : S) is essentially different from let f = function | D (arg0, arg1) -> (fun (type b) (x : b) f -> (module (struct type t = b let x = x let f = f end) : S) ) arg0 arg1 In the workaround code, you first pattern match on D, which introduces the existential variable in the environment, before it is matched against (type b). In the proposal version, type b is introduced before the pattern matching, so the original scoping rules do not allow to redefine an already introduced abstract type. I do not see how we can reconciliate the two different scoping approaches, short of introducing a new keyword. Otherwise, (type b) has to appear after D for coherence. |

(0011671) gasche (developer) 2014-06-05 08:17 |
Re. explicit ordering of existential variables in GADT constructor declarations, we may use either syntaxes: (1a) Prod : 'b 'c . 'b expr * 'c expr -> ('b * 'c) expr and (1b) Prod : type b c . b expr * c expr -> (b * c) expr And conversely in pattern position, (2a) Prod b c (e1, e2) -> ... or (if we decide that we really want the (type a b ...) syntax) (2b) Prod (type b c) (e1, e2) -> ... I personally like (2b) best (it would also be the least confusing for users, in the sense that someone that doesn't know about existential type would understand that the (type b c) part can be ignored in (2b), and be confused about (2a)), but for constructor declaration I have no preference between (1a) and (1b) ((1a) is the more incremental choice though), and I think we might want to allow both. |

(0011672) frisch (developer) 2014-06-05 09:29 |
I'm also in favor of explicit ordering in declarations. We should probably allow 1a anyway, and I'm undecided about 1b. In pattern position, 2a would be very confusing; 2b is much better. |

(0011673) lpw25 (developer) 2014-06-05 09:47 |
> Currently (type a) both introduces a new type and binds it (implicitly) at the point it is introduced. > So this would clearly be a change in semantics. Yes, but it seems a reasonable one, and I think it is more important to avoid adding yet another type variable syntax. The semantics seem simple enough; a newtype is either used in a (single) existential pattern which binds it, or it may escape its scope in which case it is bound when it is introduced. If someone tries to do both they get an escaping scope error, which is what they would get even if they didn't name the existential. In a way it is not even a change in semantics. If you consider let f (type a) x = x in ... to introduce a type `a` and not bind it, and let f (type a) (x : a) = x in ... to introduce a type `a` and bind it to the type of the first argument then we already have a semantics where `(type a)` introduces a type and `(... : a)` binds it. |

(0011674) lpw25 (developer) 2014-06-05 10:23 |
> Prod : 'b 'c . 'b expr * 'c expr -> ('b * 'c) expr This syntax should be allowed anyway (same goes for val x : 'a . 'a -> 'a). > Prod (type b c) (e1, e2) -> ... Personally I don't particularly like adding a form of explicit type argument, which is what this proposal amounts to. I also dislike giving the ordering of type variables a meaning. Does this means that type t = P : 'a 'b. 'a * 'b -> t will be incompatible with type t = P : 'foo 'bar. 'bar * 'foo -> t ? |

(0011676) frisch (developer) 2014-06-05 11:04 |
> Does this means that > type t = P : 'a 'b. 'a * 'b -> t > will be incompatible with > type t = P : 'foo 'bar. 'bar * 'foo -> t I guess so, but why would this be a problem? |

(0011677) lpw25 (developer) 2014-06-05 11:07 |
Note that separating introduction from binding could also naturally support or-patterns: type t = A : 'a * ('a -> int) -> t | B : 'a * ('a -> int) -> t let f (type a) (A((x : a), g) | B((x : a) , g)) = g x |

(0011678) lpw25 (developer) 2014-06-05 11:09 |
> I guess so, but why would this be a problem? It's just unexpected. In a language without explicit type specialisation the order of type arguments should be irrelevant. |

(0011682) gasche (developer) 2014-06-05 17:25 |
I agree that the ordering-meaningfulness is slightly unpleasant, but you can always annotate the value arguments of the pattern with the bound names, so that ordering mistakes are immediately caught. The fact that this annotation is not mandatory means that this syntax is in general lighter. Finally, you can also support or-patterns (but in a more verbose way): let f (A (type a) (x, g) | B (type a) (x, g)) = g x The usual scoping requirement that both sides must define the same variables applies to type "variables" as well. |

(0011689) garrigue (manager) 2014-06-05 23:42 edited on: 2014-06-05 23:46 |
Concerning the syntax for explicit arguments, (1a) (2b) are clearly the least surprising ones, and it should always be our goal. Note by the way that it should logically be Prod (type a) (type b) (x,y) the eventual (type a b) being only a shortcut. However, I would strongly prefer writing only existential parameters. The idea is that the non-existential parameters are already bound by the return type (and using a constraint is sufficient to be explicit). So actually in prod, no parameters would be explicit, for Abs since there is only one existential one could use the explicit syntax in the pattern without adding anything to the definition, or write it if desired. I.e. Abs : ('a -> 'b) lam * 'a lam -> 'b lam and Abs : 'a. ('a -> 'b) lam * 'a lam -> 'b lam would be equivalent, and Abs : 'a 'b. ('a -> 'b) lam * 'a lam -> 'b lam would be refused. And for Abs2, Abs : ('a1 -> 'a2 -> 'b) * 'a1 lam * 'a2 lam -> 'b lam would be ok but not allow explicit application, Abs : 'a1 'a2. ('a1 -> 'a2 -> 'b) * 'a1 lam * 'a2 lam -> 'b lam Abs : 'a2 'a1. ('a1 -> 'a2 -> 'b) * 'a1 lam * 'a2 lam -> 'b lam would be distinct since they allow explicit application, and both Abs : 'a1. ('a1 -> 'a2 -> 'b) * 'a1 lam * 'a2 lam -> 'b lam Abs : 'a1 'a2 'b. ('a1 -> 'a2 -> 'b) * 'a1 lam * 'a2 lam -> 'b lam would be refused. We would probably also need a new optional warning for implicit existentials in definition (might also apply to implicit polymorphic methods). I know that it is not like Coq, but in ocaml type variables can be kind of hidden (row variables for instance), and having a way to say explicitly what is existential and what is not would be helpful, aside of the fact that having to write all variables when only few of them are existential would be a pain. Having a look at CamlinternalFormatBasics is good to give you an idea of what concrete GADT constructors can be. There are indeed a number of cases with either one or two existentials in the definition of fmt. Note that in each case there are also at least 6 non-existential type variables, and in some cases up to 12. So requiring to write the non-existential variables looks like a real burden. Of course we must be future proof, so it is worth thinking about potential situations where we would need to be explicit about non-existential variables. I see none, but it may just be a lack of insight. |

(0011691) lpw25 (developer) 2014-06-06 09:52 edited on: 2014-06-06 12:23 |
I think that Ext(type a)(x : a) is better than Ext (type a) x because it is much more clear that `a` is being both introduced and bound. It is also consistent with current uses of `(type a)`. |

(0011695) garrigue (manager) 2014-06-06 13:27 edited on: 2014-06-06 13:28 |
I kind of agree with you in principle. However, concrete examples look more like Abs (type a) ((f : a -> b),(x : b)) vs Abs (type a) (f, x) Note that you need both annotations, because we cannot propagate the expected type. I wish we could at least omit the inner parentheses like in SML. Of course, you would still be free to write the extra annotations (this is precisely the goal of introducing names). Also, at some point it might be a good thing to recognize the specificity of existential type variables. |

(0011697) lpw25 (developer) 2014-06-06 15:48 |
> However, concrete examples look more like > Abs (type a) ((f : a -> b),(x : b)) > vs > Abs (type a) (f, x) > Note that you need both annotations, because we cannot propagate the expected type. Out of interest, why can't we propagate expected type? Even if we do not propagate the expected type, surely we only need to set `a` to be equal to `ex#1`? Can't this be inferred from a single annotation, similar to how GADT matching adds equations? Assuming we do need both annotations, I still think it is worth being explicit. In this case I think that clarity is more important than concision, especially since the syntax will probably not be used very often. > I wish we could at least omit the inner parentheses like in SML. That would indeed be nice. Is there a grammar conflict, or could it be allowed? |

(0011700) garrigue (manager) 2014-06-07 02:28 |
> Out of interest, why can't we propagate expected type? > Even if we do not propagate the expected type, surely we only need to set `a` to be equal to `ex#1`? > Can't this be inferred from a single annotation, similar to how GADT matching adds equations? The problem is that the semantics of locally abstract types is that they hide their original definition, rather than being equal to it. If you look at how Pexp_newtype is typed, you will see that a fresh abstract type is introduced in the environment, then the body is typed without any expected type, and ultimately the internal and external types are made equal. The reason it works so well is because type inference tells us all occurrences of the abstract type, so it is then sufficient to connect the two. However doing that during pattern matching would be much more complex, since there is not only the body to type, but also the remainder of the patterns. So the only solution I see for now would be to be fully explicit, otherwise you end up risking to mix external and internal types in the internal context, where they are incompatible. The advantage of the positional notation is that you are no longer introducing a locally abstract type hiding something, but just naming the existential variable. > Assuming we do need both annotations, I still think it is worth being explicit. In this case I think that clarity is more important than concision, especially since the syntax will probably not be used very often. To me this just sounds like saying that you should annotate all your functions with their complete types. Sure this may help catch some type errors earlier. But it doesn't make your type system any safer (at least in this case). And debugging types also works by adding the annotations later: once you get a type error, you may start annotating, and the more you annotate the surer you get. The fact that introducing annotations is cheap is an advantage. |

(0011701) lpw25 (developer) 2014-06-07 09:31 |
> The problem is that the semantics of locally abstract types is that they hide their original definition, rather than being equal to it. ... > However doing that during pattern matching would be much more complex, since there is not only the body to type, but also the remainder of the patterns. I can see how that would make connecting the inner and outer types during pattern matching difficult. However, I was rather suggesting connecting the inner type of `a` with the inner type of `ex#0` by adding a GADT equation (as in `add_gadt_equation` in ctype.ml). > To me this just sounds like saying that you should annotate all your functions with their complete types. > Sure this may help catch some type errors earlier. > But it doesn't make your type system any safer (at least in this case). I don't think this is the same thing. Annotating your function types is about adding clarity to a particular piece of code by being specific about what type it has. I'm talking about adding consistency to a piece of syntax by having `(type a)` mean the same thing in all contexts. With the positional approach, in some contexts it introduces a type and in others it binds a type. Sharing a syntax for these different operations is confusing. If you know about newtypes then it is immediately clear what: Abs (type a) ((f : a -> _), (x : a)) means, whereas this Abs (type a) (f, x) looks like an unused newtype declaration. |

(0011708) yallop (developer) 2014-06-09 12:20 |
>> I wish we could at least omit the inner parentheses like in SML. > That would indeed be nice. Is there a grammar conflict, or could it be allowed? The difficulty is that (x, y : t) currently means ((x, y) : t) |

(0011709) garrigue (manager) 2014-06-09 14:20 |
> I can see how that would make connecting the inner and outer types during pattern matching difficult. > However, I was rather suggesting connecting the inner type of `a` with the inner type of `ex#0` by adding a GADT equation (as in `add_gadt_equation` in ctype.ml). This amounts to changing the semantics. add_gadt_equation is only used for equations enforced by the GADT itself. This is not the case here. > I'm talking about adding consistency to a piece of syntax by having `(type a)` mean the same thing in all contexts. It does. This is why I am so careful about the positioning of (type a). The positional approach makes clear which existential variable is related to "a". Then you can perfectly choose to follow the hiding semantics, since all occurrences are "hidden" together. My point is that, to completely keep the same semantics, you need either to use a positional approach, or to be extremely verbose (try annotating camlinternalFormatBasics for fun...) > Abs (type a) (f, x) looks like an unused newtype declaration. Until you learn that (type a) immediately after a non-constant constructor in a pattern is a binding occurrence. The important part is that this behavior refines the usual rule, without introducing a contradiction. My proposal of making explicit only existentials in the declaration might be more controversial. My justification is that distinguishing existentials will actually help the user, even if this is superficially incoherent with other uses of the universal notation. |

(0011712) gasche (developer) 2014-06-09 16:19 |
> My proposal of making explicit only existentials > in the declaration might be more controversial. Indeed, I think this should be done very carefully. One idea that I think would be reasonable is to allow people to reuse the parameter names without quantification: type 't expr = | ... | App : type a . (a -> 't) expr * a expr -> 't expr Note however that this means Prod's parameters would still have to be quantified: | Prod : type a b . a expr * b expr -> (a * b) expr I don't understand why we could do without introducing those explicitly. Jacques says: > The idea is that the non-existential parameters are already bound by > the return type (and using a constraint is sufficient to be explicit). but consider for example type 'a expr = | Val of 'a | Prod : 'b expr * 'c expr -> ('b * 'c) expr I can indeed mention the types 'b and 'c in the case where the input type is restricted to the ('b * 'c) expr case: (* works *) let fst = function | Prod (b, c) -> b | Val (x, y) -> Val x (* also works *) let fst : type b c . (b * c) expr -> b expr = function | Prod (b, c) -> b | Val (x, y) -> Val x (* also works, though I'm not fully sure why *) let fst : ('b * 'c) expr -> 'b expr = function | Prod (b, c) -> b | Val (x, y) -> Val x but in the general case, I cannot get hold of the 'b, 'c variables: (* works *) let rec eval : type a . a expr -> a = function | Val v -> v | Prod (b, c) -> (eval b, eval c) (* fails *) let rec eval : type a . a expr -> a = function | Val v -> v | Prod ((b : 'b expr), (c : 'c expr)) -> (eval b, eval c) |

(0011714) garrigue (manager) 2014-06-09 23:24 |
Ok, I see where the confusion comes form. Using the equational approach, this is actually type 'a expr = | Val of 'a | Prod of 'b 'c. 'b expr * 'c expr constraint 'a = 'b * 'c assuming that here the constraint is local to Prod. So 'b and 'c are existentials too. Allowing to reuse parameter names introduces strange scoping rules, so I'm not too enthusiastic about it. Have to think more. |

(0011746) lpw25 (developer) 2014-06-13 19:17 |
>> I can see how that would make connecting the inner and outer types during pattern matching difficult. >> However, I was rather suggesting connecting the inner type of `a` with the inner type of `ex#0` by adding a GADT equation (as in `add_gadt_equation` in ctype.ml). > This amounts to changing the semantics. add_gadt_equation is only used for equations enforced by the GADT itself. This is not the case here. I'm only suggesting it as a means of type-checking `Abs (type a) (f, (x : a))` without having to annotate all uses of `a`. From this perspective, there is no observational difference between adding a GADT equation and making `ex#0` the external type of `a`, because any use of `a` which escapes the scope of the equation will result in an error anyway. > Until you learn that (type a) immediately after a non-constant constructor in a pattern is a binding occurrence. But that's my point, it is one more thing users must know in order to read OCaml. |

(0014425) garrigue (manager) 2015-09-09 10:59 |
Partial fix committed in trunk at revision 16411. Only made the names are more informative for now, following an idea by gasche. Names starting with $ are existentials. $Constr_'a denotes an existential introduced for the type variable 'a of the GADT constructor Constr. $Constr denotes an existential introduced for an anonymous variable in the GADT constructor Constr. $'a is an existential generated by reification of type variable 'a during unification. $n (n a number) is an internally generated existential. Not that I didn't use #, because it is already used for object types. This does not replace a way to explicitly give names, but should be a bit clearer. |

Issue History | |||

Date Modified | Username | Field | Change |

2012-10-09 18:07 | Sebastien Furic | New Issue | |

2012-10-10 10:32 | gasche | Note Added: 0008232 | |

2012-10-10 10:33 | gasche | Severity | minor => text |

2012-10-10 10:33 | gasche | Status | new => acknowledged |

2012-10-10 11:46 | garrigue | Note Added: 0008237 | |

2012-10-10 11:46 | garrigue | Assigned To | => garrigue |

2012-10-10 11:46 | garrigue | Status | acknowledged => confirmed |

2012-10-12 03:17 | garrigue | Note Added: 0008247 | |

2012-10-12 10:19 | gasche | Note Added: 0008249 | |

2012-10-12 10:21 | gasche | Note Edited: 0008249 | View Revisions |

2012-10-12 10:56 | garrigue | Note Added: 0008250 | |

2012-10-16 11:37 | gasche | Target Version | => 4.01.0+dev |

2012-10-16 11:50 | Sebastien Furic | Note Added: 0008262 | |

2013-04-23 02:44 | garrigue | Relationship added | child of 0005998 |

2013-06-16 16:27 | gasche | Note Added: 0009511 | |

2013-06-16 16:29 | gasche | Note Added: 0009512 | |

2013-07-05 14:17 | frisch | Note Added: 0009698 | |

2013-07-05 15:45 | lpw25 | Note Added: 0009700 | |

2013-07-05 16:51 | frisch | Note Added: 0009701 | |

2013-07-05 17:35 | lpw25 | Note Added: 0009703 | |

2013-07-05 17:53 | frisch | Note Added: 0009704 | |

2013-07-05 18:14 | lpw25 | Note Added: 0009706 | |

2013-07-05 18:48 | frisch | Note Added: 0009707 | |

2013-07-05 21:18 | lpw25 | Note Added: 0009708 | |

2013-07-09 10:01 | frisch | Note Added: 0009725 | |

2013-07-09 15:07 | gasche | Note Added: 0009731 | |

2013-08-19 15:40 | doligez | Target Version | 4.01.0+dev => 4.02.0+dev |

2014-06-04 08:50 | garrigue | Note Added: 0011646 | |

2014-06-04 20:07 | lpw25 | Note Added: 0011655 | |

2014-06-04 21:26 | garrigue | Note Added: 0011656 | |

2014-06-04 22:06 | gasche | Note Added: 0011662 | |

2014-06-04 23:16 | lpw25 | Note Added: 0011664 | |

2014-06-04 23:16 | lpw25 | Note Edited: 0011664 | View Revisions |

2014-06-04 23:18 | lpw25 | Note Edited: 0011664 | View Revisions |

2014-06-04 23:20 | lpw25 | Note Edited: 0011664 | View Revisions |

2014-06-04 23:20 | lpw25 | Note Edited: 0011664 | View Revisions |

2014-06-04 23:21 | lpw25 | Note Edited: 0011664 | View Revisions |

2014-06-05 00:44 | garrigue | Note Added: 0011666 | |

2014-06-05 03:33 | garrigue | Note Added: 0011667 | |

2014-06-05 07:00 | frisch | Note Added: 0011669 | |

2014-06-05 07:20 | garrigue | Note Added: 0011670 | |

2014-06-05 08:17 | gasche | Note Added: 0011671 | |

2014-06-05 09:29 | frisch | Note Added: 0011672 | |

2014-06-05 09:47 | lpw25 | Note Added: 0011673 | |

2014-06-05 10:23 | lpw25 | Note Added: 0011674 | |

2014-06-05 11:04 | frisch | Note Added: 0011676 | |

2014-06-05 11:07 | lpw25 | Note Added: 0011677 | |

2014-06-05 11:09 | lpw25 | Note Added: 0011678 | |

2014-06-05 17:25 | gasche | Note Added: 0011682 | |

2014-06-05 23:42 | garrigue | Note Added: 0011689 | |

2014-06-05 23:46 | garrigue | Note Edited: 0011689 | View Revisions |

2014-06-06 09:52 | lpw25 | Note Added: 0011691 | |

2014-06-06 09:53 | lpw25 | Note Edited: 0011691 | View Revisions |

2014-06-06 12:23 | lpw25 | Note Edited: 0011691 | View Revisions |

2014-06-06 13:27 | garrigue | Note Added: 0011695 | |

2014-06-06 13:28 | garrigue | Note Edited: 0011695 | View Revisions |

2014-06-06 15:48 | lpw25 | Note Added: 0011697 | |

2014-06-07 02:28 | garrigue | Note Added: 0011700 | |

2014-06-07 09:31 | lpw25 | Note Added: 0011701 | |

2014-06-09 12:20 | yallop | Note Added: 0011708 | |

2014-06-09 14:20 | garrigue | Note Added: 0011709 | |

2014-06-09 16:19 | gasche | Note Added: 0011712 | |

2014-06-09 23:24 | garrigue | Note Added: 0011714 | |

2014-06-13 19:17 | lpw25 | Note Added: 0011746 | |

2014-07-16 20:19 | doligez | Target Version | 4.02.0+dev => 4.02.1+dev |

2014-09-04 00:25 | doligez | Target Version | 4.02.1+dev => undecided |

2014-09-22 22:08 | doligez | Target Version | undecided => 4.02.2+dev / +rc1 |

2015-02-25 00:00 | doligez | Target Version | 4.02.2+dev / +rc1 => 4.03.0+dev |

2015-09-09 10:59 | garrigue | Note Added: 0014425 | |

2015-09-09 10:59 | garrigue | Status | confirmed => assigned |

Copyright © 2000 - 2011 MantisBT Group |