Mantis Bug Tracker

View Issue Details Jump to Notes ] Issue History ] Print ]
IDProjectCategoryView StatusDate SubmittedLast Update
0005780OCamlOCaml typingpublic2012-10-09 18:072014-09-22 22:08
ReporterSebastien Furic 
Assigned Togarrigue 
PrioritynormalSeveritytextReproducibilityalways
StatusconfirmedResolutionopen 
PlatformOSOS Version
Product Version4.00.1 
Target Versionafter-4.02.1Fixed in Version 
Summary0005780: Strange type variable names in error messages (GADTs)
DescriptionOcaml sometimes uses strange type variable names in error messages when type-checking GADTs (maybe to avoid name capture?).
Steps To ReproduceEnter the following code:

type _ t =
  | B: bool -> bool t
  | I: int -> int t
  | A: ('a -> 'b) t * 'a t -> 'b t
  | F: ('a -> 'b) -> ('a -> 'b) t

let rec eval: type s. s t -> s = function
  | B b -> b
  | I i -> i
  | A (e, e') -> (eval e) e'
  | F f -> f
;;

Ocaml reports the following error:
 
          Characters 94-96:
    | A (e, e') -> (eval e) e'
                            ^^
Error: This expression has type ex#0 t but an expression was expected of type
         ex#0
TagsNo tags attached.
Attached Files

- Relationships
child of 0005998assignedgarrigue GADT typing and exhaustiveness bugs 

-  Notes
(0008232)
gasche (developer)
2012-10-10 10:32

This is a known problem that I've discussed with Jacques and I think could and should be improved before the next version.

The fundamental issue is that the OCaml syntax has no way to *name* the existential variables corresponding to a GADT constructor. The name generation it uses now ("ex#" + fresh integer) is particularly obnoxious, but in any case there is a type corresponding to your variable 'a in the type of the constructor A that has no name.

My personal opinion is that not having explicit binding of those variables in the syntax (in the constructors: (A : 'a. ('a -> 'b) * ... -> 'b t), and in the matching (| A a (e, e') -> ...)) is a defect of the GADT syntax -- but syntax is about hard compromises, and here the more concise syntax was chosen.

I believe the current behavior could be improved by:
- choosing names for existential that are more natural; I would like them to include either the constructor they come from, or the original user-chosen name of the corresponding type variable in the type definition, or both
- improving the wording of the error message to avoid the annoying "has type 'foo' but was expected at type 'foo'" and possible display something GADT-specific

In the meantime, this is not a bug (the behavior is perfectly correct) but a rather shortcoming on the user interface side. And it is not "sometimes", all existential variables have those unreadable names, and all escape situations generate those unreadable error messages.

Thanks for your report anyway: it's helpful to see that people care about this.
(0008237)
garrigue (manager)
2012-10-10 11:46

This is indeed a well known problem.
Actually part of it was a bug, as there was already some code to reuse variable names from the definition, but it didn't work properly.
This part is now fixed in trunk, at revision 12998.
The new error message is:
Error: This expression has type a#0 t but an expression was expected of type
         a#0
Unfortunately, this is only a small improvement, as origin of the name is not so clear, and many existential types do not come directly from constructors.
As for the "#0" part, this is indeed to avoid name collisions.

As Gabriel mentioned, there has always been a plan to support naming of existential types in patterns.
However this requires new syntax, and this was delayed waiting for more experience.
This should be in the next release.

I'm open to improvements of the wording, but clever error reporting is very difficult.
(0008247)
garrigue (manager)
2012-10-12 03:17

I have added a patch in trunk/experimental/garrigue/pattern-local-types.diffs which allows to define types in patterns.
They are only allowed at the beginning of match cases, and the syntax is:
    | type a b c. <pattern> -> <body>
They are _different_ from local abstract types, as they are not abstract, but inferred from their uses as type annotations inside <pattern>. If the inferred type is a fresh existential type, then some special manipulation is done to rename it to the given name, so that it will appear in error messages.

Here are some examples:
(* alias *)
function type a. (() : a) -> ([] : a list);;
(function type a. (() : a) -> ([] : a list)) ();;
function type a. () -> ();; (* fail *)

(* existential *)
type t = D : 'a * ('a -> int) -> t;;
let f = function type b. D ((x:b), f) -> (f:b->int) x;;
let f = function type b. D ((x:b), f) -> (f:t->int) x;; (* fail *)
(0008249)
gasche (developer)
2012-10-12 10:19
edited on: 2012-10-12 10:21

To be clear to other potential readers of this bug item, the problem is that as there is currently no explicit binding syntax on GADT type *definitions*, the existential types are introduced in no particular order. If you define a constructor type K with two existential types 'a and 'b, we cannot use, say, the syntax (K (type a) (type b) (x, f) -> ...) because there is no notion of "first" and "second" existential type. So Jacques' choice here is to allow to introduce names for some types, that are later unified with the corresponding existential variable by additional annotations on the pattern arguments (in his example, (x:b) is necessary to link the fresh name b to the actual existential variable 'a of D).

I'm not really satisfied with this compromise, because the notations are still heavier than necessary (though arguably it helps readability to force to annotate use of the types) and we have yet another semantic distinction around type variable binding.

What about the following other compromise:
- allow the optional use of a type binding syntax at the GADT declaration site
  type t = D : 'a . 'a * ('a -> int) -> t
  (I'm not hung up on this particular syntax, for example it would be ok to have "type a . a * (a -> int) -> t" instead)
- use the "(type a)" syntax to locally introduce the type after the constructor name and before the value parameters, using the order in the definition
  function D (type a) (x, f) -> f x
- for GADTs that have not used an explicit order in their definition, reject this syntax (or possibly use the left-to-right order of occurence of the variables, but this would be a hack)

I believe this gives us a reasonable transitional path to a more principle syntax, with ultimately lighter annotations at the matching site than what is possible with your suggestion.

(Of course I'm no expert on the grammatical ambiguities here so one of the syntax proposed may actually not be an option. The important idea is having explicit type binding syntax at both the definition and use site of the constructors.)

I'm actually not so sure that this use of the "(type a)" syntax is consistent with its current use (which is important: I don't want to introduce new semantic notion). Is it the case? Or should we prefer another syntax for existential types in GADT patterns?

(Where are we on the "(type a b)" sugar? Is it currently implemented?)

(0008250)
garrigue (manager)
2012-10-12 10:56

Well, I was also a promoter of explicit syntax for existentials in GADT definitions, so I cannot say I don't agree with Gabriel for the first part.

OK, your proposal makes sense, but it has the major drawback of requiring changes in type definitions, while we still allow the old format.
So if say, you are using a library with implicit existentials, you would have no way to use them.
Well, maybe using left-to-right order is okay, but we must be very careful about avoiding bugs (the internal representation must keep the syntactic order, which is not usually true for objects and polymorphic variants for instance.)
Also there is the question of how to handle row variables.
I think your use of (type a) is consistent (you do not really abstract an external type, since the existential is already abstract, and cannot leak anyway, but otherwise this is the same thing), so at least this part is okay.
There is currently not (type a b) sugar, but this would cost nothing.

As for a transition path for a more principled syntax, I would say either this is too late, or we must decide now to change the syntax and maybe break some existing programs, because otherwise it will be too late...

If I would give one argument for my approach, I would say that while it can be in some situations lightly more verbose, it is in general pretty light-weight, thanks to the inference-based approach.
(0008262)
Sebastien Furic (reporter)
2012-10-16 11:50

And why not simply improving the error message by printing (for instance):

Error: This expression has type (type a) a t but an expression was expected of type
         (type a) a

This would explicitly tell the user that there is some existential around. After all, my surprise essentially came from the strange format of the variable name and the fact that its abstract nature was not explicit in the message.
(0009511)
gasche (developer)
2013-06-16 16:27

I'm not sure "(type a) a" more clearly indicates an existential type than "ex#23", and besides I'm not sure it's meaningful to close on all existentials (that may also be involved in some other part of the context) when printing the type.
(0009512)
gasche (developer)
2013-06-16 16:29

However, I think the error messages of the form "This expression has type <foo> but an expression was expected of type <foo>", with the same type printed at both places, are always harmful for non-experts and could be systematically skipped (or turned into a less confusing "There is a typing conflict at type <foo>", not very informative but also prints the type). I don't know whether that would be easy to implement, though.
(0009698)
frisch (developer)
2013-07-05 14:17

A generic (?) work-around for the lack of naming facility is to introduce a local function with a locally abstract type:

========================================================
type t = D : 'a * ('a -> int) -> t

module type S = sig
  type t
  val x: t
  val f: t -> int
end

let f = function
  | D (arg0, arg1) ->
    (fun (type b) (x : b) f ->
       (module (struct type t = b let x = x let f = f end) : S)
    )
    arg0 arg1
========================================================

What about considering that the introduced type names behave exactly like that, so that the code above could be written:

========================================================
let f = function
  | type b. D ((x : b), f) ->
       (module (struct type t = b let x = x let f = f end) : S)
========================================================

The syntax is the same as in Jacques' proposal, but the static semantics is different. I think it would be less confusing to the user, since locally named types currently always enforce genericity.
(0009700)
lpw25 (developer)
2013-07-05 15:45

Since, in Alain's proposal the b in

> | type b. D ((x : b), f) ->

is a regular locally abstract type (I think), could we have the syntax:

  | (type b) D ((x : b), f) ->

so that we can also allow:

  let f (type b) (D ((x: b), f)) = ...

because I often want to use a single-case GADT for existentials, and let feels more natural for them than match.
(0009701)
frisch (developer)
2013-07-05 16:51

> let f (type b) (D ((x: b), f)) = ...

This would be a change to the current interpretation of locally abstract type. Here the "body" of the "new type" construction is not generic w.r.t. the new type b.

> | type b. D ((x : b), f) ->
> is a regular locally abstract type (I think)

It is a locally abstract type "inside" the constructor (but its scope includes the right-hand side of the match clause).
(0009703)
lpw25 (developer)
2013-07-05 17:35

> This would be a change to the current interpretation of locally abstract type.

I'm not sure that it is a fundamental change to the interpretation.

    type ext = Ext : 'a -> ext

    let f (type a) (Ext(x : a)) = Ext([x] : a list)

feels very similar to:

    module type S = sig type t val x : t end;;

    let f (type a) (module X : S with type t = a) =
      (module struct type t = a list let x = [X.x] end : S)

with the addition of some check that (type a) doesn't escape the scope of the existential.
(0009704)
frisch (developer)
2013-07-05 17:53

The type of:

   let f (type a) (module X : S with type t = a) = ...

has the form: (module S with type t = 'a) -> ...

This is plain polymorphic function, there is no existential involved here. This is quite different from:

  let f (type a) (Ext(x : a)) = ...

which is monomorphic, and would be more similar to:

 let f (module X : S) = .... X.t ...

(i.e. when unpacking a module, we already get a way to refer to the existential type).
(0009706)
lpw25 (developer)
2013-07-05 18:14

> The type of:
>
> let f (type a) (module X : S with type t = a) = ...
>
> has the form: (module S with type t = 'a) -> ...

I understand that, but

  'a. (module S with type t = 'a) -> (module S)

is really equivalent to

  (module S) -> (module S)

the type system just isn't able to deduce this equality. But it could, if it
checked whether 'a appeared elsewhere in the type and then safely removed it.
This check is very similar to checking that the existential doesn't escape in:

  let f (type a) (Ext(x : a)) = ...
(0009707)
frisch (developer)
2013-07-05 18:48

> I understand that, but
> 'a. (module S with type t = 'a) -> (module S)
> is really equivalent to
> (module S) -> (module S)

The difference between the two is quite important; the existential is opened in one case by the function and in the other case by the caller. Since we are talking about naming the existential, this distinction is rather important.

With your proposal, what would the following mean?

 fun (type a) (z : a) (Ext(x : a)) -> ...
(0009708)
lpw25 (developer)
2013-07-05 21:18

> With your proposal, what would the following mean?
>
> fun (type a) (z : a) (Ext(x : a)) -> ...

That would be an error because the existential would escape its scope. It is equivalent to:

  # let f z (Ext x) =
      ignore [z; x];
      true;;
  Error: This expression has type ex#0 but an expression was expected of type
           ex#0
         The type constructor ex#0 would escape its scope

The point is

  forall A. A -> B

is equivalent to

  (exists A. A) -> B

under the condition that A is not in B.

So it is perfectly reasonable to type

  fun (exists t. (x : t)) -> ...

as

  forall t. (fun (x : t) -> ... )

as long as we check that t does not escape its scope (i.e. is not mentioned in the rest of the type). This is precisely what

  let f (type a) (Ext(x : a)) = ...

does.
(0009725)
frisch (developer)
2013-07-09 10:01

I still feel uncomfortable with the proposal, because there is an explicit binder for the existential (the GADT constructor), and the equivalence breaks this scope.
(0009731)
gasche (developer)
2013-07-09 15:07

One way to understand Leo's proposal is to say that

  fun (type a) (Ext(x : a)) -> ...

first defines a rigid variable (a), and that the patterns (Ext(x : a)) first introduces an existential variable ex#N, and then upon processing the pattern (x : a) adds the local equation (a = ex#N) in the branch of the pattern clause. We are not naming (a) the actual existential variable, but working in a context where both are equivalent, just as

  let cast (type a) (Refl : (a, int) eq) = fun (x : a) -> (x : int)

works in a local context enriched with the equation (a = int).

Currently such GADT equations are only added by an annotation on the type of the whole GADT value. This does not allow to name existentials as they precisely don't appear in the outer GADT type.

It seems this is a different mechanism from the one Jacques described for (type a . Ext (x : a)). Similar subtleties/hacks would probably be needed to get error messages to mention (a) rather than the existential names.



PS: I'm copying below the message from Oleg on caml-list about error messages for GADTs, which could be an interesting inspiration for useful error messages orthogonally to the naming of existentials discussed here.

> Just in case, here is the error message produced by GHC for the
> equivalent example.
>
> Code:
>
> {-# LANGUAGE GADTs #-}
>
> data E a where
> B :: Bool -> E Bool
> A :: E (a -> b) -> E a -> E b
>
> eval :: E a -> a
> eval (B x) = x
> eval (A f x) = (eval f) x
>
> The error message:
>
> /tmp/ga.hs:9:25:
> Couldn't match type `a1' with `E a1'
> `a1' is a rigid type variable bound by
> a pattern with constructor
> A :: forall b a. E (a -> b) -> E a -> E b,
> in an equation for `eval'
> at /tmp/ga.hs:9:7
> In the second argument of `eval', namely `x'
> In the expression: (eval f) x
> In an equation for `eval': eval (A f x) = (eval f) x
>
> The gist is using the type variable name 'a' from the definition of
> GADT, and attaching the numeric suffix to disambiguate. (GHC does that all
> the time -- and I believe OCaml does something similar in many cases,
> only not for GADTs.) What helps in particular I believe is quoting the
> clause from the GADT declaration, so we can see where the name 'a'
> comes from. So, perhaps if the error message can be improved with
> giving more information about the error, the problem will be solved
> without additional syntax?
(0011646)
garrigue (manager)
2014-06-04 08:50

I had dropped this discussion hoping that a natural solution would arise,
but unfortunately a feature that was expected to be in 4.01 will not even
be in 4.02.

Looking at the proposals so far, they all seem to be incoherent with the
current constructs (using a type construction in a pattern context, missing
notion of parameter, or incompatible scoping rules).

I've been discussing that with Jeremy Yallop, and we think we have a potential
solution:
   (Ext (x : type a. a))
with the scope of "type a" being everything to its right.

The syntax is nice, and the meaning seems clear.
And since we are on the right of x, there is no longer the problem of
rebinding the type.
There are however a few difficulties
1) originally, in
    let f : type a. typ = expr1 in expr2
  the scope of "type a" is expr1, whereas the pattern-matching equivalent
  of the above would be
    match expr1 with (f : type a. typ) -> expr2
  which gives a different scope.
  Note however that the meaning of "let f : typ = expr" is "let f = (expr : typ)",
  not "let (f : typ) = expr", so there is no formal ambiguity.
2) The original "type a. typ" is defined as syntactic sugar, and forces the
  pattern to have a really polymorphic type. Here this is not the case since
  we want to match an existential type. This could be solved by rather
  using the syntax
     (Ext (x (type a) : a)
  This is ok too, since (type a) is to the right of x.
  However, this is less readable, and this does not solve point (1),
  i.e. there is the same potential confusion with "let f (type a) = expr1 in expr2".
3) There is some interaction between these abstract types appearing inside
  patterns and type propagation. Consider this pattern:
      (... (Ext (x : type a. a)) ... : a t)
  Independently of whether it will be typable or not (in many cases it wouldn't,
  since the type would leak), the scoping rules say that the "a" in "a t" is the type
  of x. For this reason, we need an extra traversal, to check which names are,
  and if a type annotation contains names bound in a subpattern, we must delay
  building the type to after typing the subpattern.
(0011655)
lpw25 (developer)
2014-06-04 20:07

> (Ext (x : type a. a))

To me this looks too much like `x : 'a. 'a` -- i.e. a completely universal type.

My preference would still be for:

    (type a) (Ext (x : a))
(0011656)
garrigue (manager)
2014-06-04 21:26

As I explained in my previous answer
    (type a) (Ext (x : a))
is incorrect from a scoping point of view: it tries to bind an
existential that does not exist yet.

If the similarity with 'a.'a is your problem, then maybe
    (Ext (x (type a) : a))
is better.
(0011662)
gasche (developer)
2014-06-04 22:06

What about (Ext (x : type a)) ?
(0011664)
lpw25 (developer)
2014-06-04 23:16
edited on: 2014-06-04 23:21

> As I explained in my previous answer
> (type a) (Ext (x : a))
> is incorrect from a scoping point of view: it tries to bind an
> existential that does not exist yet.

Only if you think in terms of closed existential types. If we consider these to be open existential types (see "Modeling Abstract Types in Modules with Open Existential Types" by Montagu and Rémy) then:

    let (type a) (Ext (x : a)) = ext in ...

is simply

    ν a. let (Ext y) = ext in let x = open(a) y in ...

So `(type a)` is reduction and `(Ext (x : a))` is opening in the terminology of Montagu et al.

(0011666)
garrigue (manager)
2014-06-05 00:44

> What about (Ext (x : type a)) ?

type _ lam = ... | Abs : ('a -> 'b) lam * 'a lam -> 'b lam

> ? a. let (Ext y) = ext in let x = open(a) y in ...
> So `(type a)` is reduction and `(Ext (x : a))` is opening in the terminology of Montagu et al.

Currently (type a) both introduces a new type and binds it (implicitly) at the point it is introduced.
So this would clearly be a change in semantics.
(0011667)
garrigue (manager)
2014-06-05 03:33

Actually, the binder of the existential is the constructor, not the variable, so
   Ext((type a) x : a)
would be ok too.

However, the syntax for type annotations in patterns is devilish. That's what
you would have to write for Abs:
   Abs (((type a) f : a -> _), x)
Ugly isn't it?

I'm starting to think that Gabriel's idea might end up being the cleanest.
Namely, the main drawback was when there are several existential variables.
But in most cases, there is only one:
   Abs (type a) (f,x)
is non-ambiguous.
When there are several variables, as suggested before, we can use a syntax
to give explicitly the existentials in the declaration.
Also, contrary to what I though first, this approach makes it easier to
handle private rows, without complex hacks involved.
(0011669)
frisch (developer)
2014-06-05 07:00

What about my proposal:

========================================================
let f = function
  | type b. D ((x : b), f) ->
       (module (struct type t = b let x = x let f = f end) : S)
========================================================

seen as an universal quantification on the whole branch (pattern + expression).
(0011670)
garrigue (manager)
2014-06-05 07:20

Alain, I do not see the difference between your proposal and my original patch.

The problem is that

let f = function
  | type b. D ((x : b), f) ->
       (module (struct type t = b let x = x let f = f end) : S)

is essentially different from

let f = function
  | D (arg0, arg1) ->
    (fun (type b) (x : b) f ->
       (module (struct type t = b let x = x let f = f end) : S)
    )
    arg0 arg1

In the workaround code, you first pattern match on D, which introduces the existential variable in the environment, before it is matched against (type b).
In the proposal version, type b is introduced before the pattern matching, so the original scoping rules do not allow to redefine an already introduced abstract type.
I do not see how we can reconciliate the two different scoping approaches, short of introducing a new keyword.
Otherwise, (type b) has to appear after D for coherence.
(0011671)
gasche (developer)
2014-06-05 08:17

Re. explicit ordering of existential variables in GADT constructor declarations, we may use either syntaxes:

(1a) Prod : 'b 'c . 'b expr * 'c expr -> ('b * 'c) expr

and

(1b) Prod : type b c . b expr * c expr -> (b * c) expr

And conversely in pattern position,

(2a) Prod b c (e1, e2) -> ...

or (if we decide that we really want the (type a b ...) syntax)

(2b) Prod (type b c) (e1, e2) -> ...

I personally like (2b) best (it would also be the least confusing for users, in the sense that someone that doesn't know about existential type would understand that the (type b c) part can be ignored in (2b), and be confused about (2a)), but for constructor declaration I have no preference between (1a) and (1b) ((1a) is the more incremental choice though), and I think we might want to allow both.
(0011672)
frisch (developer)
2014-06-05 09:29

I'm also in favor of explicit ordering in declarations. We should probably allow 1a anyway, and I'm undecided about 1b. In pattern position, 2a would be very confusing; 2b is much better.
(0011673)
lpw25 (developer)
2014-06-05 09:47

> Currently (type a) both introduces a new type and binds it (implicitly) at the point it is introduced.
> So this would clearly be a change in semantics.

Yes, but it seems a reasonable one, and I think it is more important to avoid adding yet another type variable syntax.

The semantics seem simple enough; a newtype is either used in a (single) existential pattern which binds it, or it may escape its scope in which case it is bound when it is introduced. If someone tries to do both they get an escaping scope error, which is what they would get even if they didn't name the existential.

In a way it is not even a change in semantics. If you consider

    let f (type a) x = x in ...

to introduce a type `a` and not bind it, and

    let f (type a) (x : a) = x in ...

to introduce a type `a` and bind it to the type of the first argument then we already have a semantics where `(type a)` introduces a type and `(... : a)` binds it.
(0011674)
lpw25 (developer)
2014-06-05 10:23

> Prod : 'b 'c . 'b expr * 'c expr -> ('b * 'c) expr

This syntax should be allowed anyway (same goes for val x : 'a . 'a -> 'a).

> Prod (type b c) (e1, e2) -> ...

Personally I don't particularly like adding a form of explicit type argument, which is what this proposal amounts to. I also dislike giving the ordering of type variables a meaning. Does this means that

    type t = P : 'a 'b. 'a * 'b -> t

will be incompatible with

    type t = P : 'foo 'bar. 'bar * 'foo -> t

?
(0011676)
frisch (developer)
2014-06-05 11:04

> Does this means that
> type t = P : 'a 'b. 'a * 'b -> t
> will be incompatible with
> type t = P : 'foo 'bar. 'bar * 'foo -> t

I guess so, but why would this be a problem?
(0011677)
lpw25 (developer)
2014-06-05 11:07

Note that separating introduction from binding could also naturally support or-patterns:

    type t =
      A : 'a * ('a -> int) -> t
    | B : 'a * ('a -> int) -> t

    let f (type a) (A((x : a), g) | B((x : a) , g)) = g x
(0011678)
lpw25 (developer)
2014-06-05 11:09

> I guess so, but why would this be a problem?

It's just unexpected. In a language without explicit type specialisation the order of type arguments should be irrelevant.
(0011682)
gasche (developer)
2014-06-05 17:25

I agree that the ordering-meaningfulness is slightly unpleasant, but you can always annotate the value arguments of the pattern with the bound names, so that ordering mistakes are immediately caught.

The fact that this annotation is not mandatory means that this syntax is in general lighter.

Finally, you can also support or-patterns (but in a more verbose way):
  let f (A (type a) (x, g) | B (type a) (x, g)) = g x
The usual scoping requirement that both sides must define the same variables applies to type "variables" as well.
(0011689)
garrigue (manager)
2014-06-05 23:42
edited on: 2014-06-05 23:46

Concerning the syntax for explicit arguments, (1a) (2b) are clearly the least surprising ones, and it should always be our goal.
Note by the way that it should logically be
   Prod (type a) (type b) (x,y)
the eventual (type a b) being only a shortcut.

However, I would strongly prefer writing only existential parameters.
The idea is that the non-existential parameters are already bound by
the return type (and using a constraint is sufficient to be explicit).
So actually in prod, no parameters would be explicit, for Abs since there
is only one existential one could use the explicit syntax in the pattern without
adding anything to the definition, or write it if desired. I.e.
    Abs : ('a -> 'b) lam * 'a lam -> 'b lam
and
    Abs : 'a. ('a -> 'b) lam * 'a lam -> 'b lam
would be equivalent, and
    Abs : 'a 'b. ('a -> 'b) lam * 'a lam -> 'b lam
would be refused.
And for Abs2,
    Abs : ('a1 -> 'a2 -> 'b) * 'a1 lam * 'a2 lam -> 'b lam
would be ok but not allow explicit application,
    Abs : 'a1 'a2. ('a1 -> 'a2 -> 'b) * 'a1 lam * 'a2 lam -> 'b lam
    Abs : 'a2 'a1. ('a1 -> 'a2 -> 'b) * 'a1 lam * 'a2 lam -> 'b lam
would be distinct since they allow explicit application, and both
    Abs : 'a1. ('a1 -> 'a2 -> 'b) * 'a1 lam * 'a2 lam -> 'b lam
    Abs : 'a1 'a2 'b. ('a1 -> 'a2 -> 'b) * 'a1 lam * 'a2 lam -> 'b lam
would be refused.
We would probably also need a new optional warning for
implicit existentials in definition (might also apply to implicit
polymorphic methods).

I know that it is not like Coq, but in ocaml type variables
can be kind of hidden (row variables for instance), and having
a way to say explicitly what is existential and what is not would
be helpful, aside of the fact that having to write all variables when
only few of them are existential would be a pain.

Having a look at CamlinternalFormatBasics is good to give you an
idea of what concrete GADT constructors can be.
There are indeed a number of cases with either one or two existentials
in the definition of fmt. Note that in each case there are also at least
6 non-existential type variables, and in some cases up to 12.
So requiring to write the non-existential variables looks like a real burden.

Of course we must be future proof, so it is worth thinking about potential
situations where we would need to be explicit about non-existential
variables. I see none, but it may just be a lack of insight.

(0011691)
lpw25 (developer)
2014-06-06 09:52
edited on: 2014-06-06 12:23

I think that

    Ext(type a)(x : a)

is better than

    Ext (type a) x

because it is much more clear that `a` is being both introduced and bound.

It is also consistent with current uses of `(type a)`.

(0011695)
garrigue (manager)
2014-06-06 13:27
edited on: 2014-06-06 13:28

I kind of agree with you in principle.
However, concrete examples look more like
  Abs (type a) ((f : a -> b),(x : b))
vs
  Abs (type a) (f, x)
Note that you need both annotations, because we cannot propagate the expected type.
I wish we could at least omit the inner parentheses like in SML.
Of course, you would still be free to write the extra annotations (this is precisely the goal of introducing names).

Also, at some point it might be a good thing to recognize the specificity of existential type variables.

(0011697)
lpw25 (developer)
2014-06-06 15:48

> However, concrete examples look more like
> Abs (type a) ((f : a -> b),(x : b))
> vs
> Abs (type a) (f, x)
> Note that you need both annotations, because we cannot propagate the expected type.

Out of interest, why can't we propagate expected type?

Even if we do not propagate the expected type, surely we only need to set `a` to be equal to `ex#1`? Can't this be inferred from a single annotation, similar to how GADT matching adds equations?

Assuming we do need both annotations, I still think it is worth being explicit. In this case I think that clarity is more important than concision, especially since the syntax will probably not be used very often.

> I wish we could at least omit the inner parentheses like in SML.

That would indeed be nice. Is there a grammar conflict, or could it be allowed?
(0011700)
garrigue (manager)
2014-06-07 02:28

> Out of interest, why can't we propagate expected type?

> Even if we do not propagate the expected type, surely we only need to set `a` to be equal to `ex#1`?
> Can't this be inferred from a single annotation, similar to how GADT matching adds equations?

The problem is that the semantics of locally abstract types is that they hide their original definition, rather than being equal to it.
If you look at how Pexp_newtype is typed, you will see that a fresh abstract type is introduced in the environment, then the body is typed without any expected type, and ultimately the internal and external types are made equal.
The reason it works so well is because type inference tells us all occurrences of the abstract type, so it is then sufficient to connect the two.
However doing that during pattern matching would be much more complex, since there is not only the body to type, but also the remainder of the patterns.
So the only solution I see for now would be to be fully explicit, otherwise you end up risking to mix external and internal types in the internal context, where they are incompatible.

The advantage of the positional notation is that you are no longer introducing a locally abstract type hiding something, but just naming the existential variable.

> Assuming we do need both annotations, I still think it is worth being explicit. In this case I think that clarity is more important than concision, especially since the syntax will probably not be used very often.

To me this just sounds like saying that you should annotate all your functions with their complete types.
Sure this may help catch some type errors earlier.
But it doesn't make your type system any safer (at least in this case).
And debugging types also works by adding the annotations later: once you get a type error, you may start annotating, and the more you annotate the surer you get. The fact that introducing annotations is cheap is an advantage.
(0011701)
lpw25 (developer)
2014-06-07 09:31

> The problem is that the semantics of locally abstract types is that they hide their original definition, rather than being equal to it.
...
> However doing that during pattern matching would be much more complex, since there is not only the body to type, but also the remainder of the patterns.

I can see how that would make connecting the inner and outer types during pattern matching difficult. However, I was rather suggesting connecting the inner type of `a` with the inner type of `ex#0` by adding a GADT equation (as in `add_gadt_equation` in ctype.ml).

> To me this just sounds like saying that you should annotate all your functions with their complete types.
> Sure this may help catch some type errors earlier.
> But it doesn't make your type system any safer (at least in this case).

I don't think this is the same thing. Annotating your function types is about adding clarity to a particular piece of code by being specific about what type it has. I'm talking about adding consistency to a piece of syntax by having `(type a)` mean the same thing in all contexts. With the positional approach, in some contexts it introduces a type and in others it binds a type. Sharing a syntax for these different operations is confusing.

If you know about newtypes then it is immediately clear what:

    Abs (type a) ((f : a -> _), (x : a))

means, whereas this

    Abs (type a) (f, x)

looks like an unused newtype declaration.
(0011708)
yallop (developer)
2014-06-09 12:20

>> I wish we could at least omit the inner parentheses like in SML.

> That would indeed be nice. Is there a grammar conflict, or could it be allowed?

The difficulty is that

   (x, y : t)

currently means

   ((x, y) : t)
(0011709)
garrigue (manager)
2014-06-09 14:20

> I can see how that would make connecting the inner and outer types during pattern matching difficult.
> However, I was rather suggesting connecting the inner type of `a` with the inner type of `ex#0` by adding a GADT equation (as in `add_gadt_equation` in ctype.ml).

This amounts to changing the semantics. add_gadt_equation is only used for equations enforced by the GADT itself. This is not the case here.

> I'm talking about adding consistency to a piece of syntax by having `(type a)` mean the same thing in all contexts.

It does. This is why I am so careful about the positioning of (type a).
The positional approach makes clear which existential variable is related to "a".
Then you can perfectly choose to follow the hiding semantics, since all occurrences are "hidden" together.

My point is that, to completely keep the same semantics, you need either to use a positional approach, or to be extremely verbose (try annotating camlinternalFormatBasics for fun...)

> Abs (type a) (f, x) looks like an unused newtype declaration.

Until you learn that (type a) immediately after a non-constant constructor in a pattern is a binding occurrence.
The important part is that this behavior refines the usual rule, without introducing a contradiction.

My proposal of making explicit only existentials in the declaration might be more controversial. My justification is that distinguishing existentials will actually help the user, even if this is superficially incoherent with other uses of the universal notation.
(0011712)
gasche (developer)
2014-06-09 16:19

> My proposal of making explicit only existentials
> in the declaration might be more controversial.

Indeed, I think this should be done very carefully.

One idea that I think would be reasonable is to allow people to reuse the parameter names without quantification:

type 't expr =
  | ...
  | App : type a . (a -> 't) expr * a expr -> 't expr

Note however that this means Prod's parameters would still have to be quantified:

  | Prod : type a b . a expr * b expr -> (a * b) expr

I don't understand why we could do without introducing those explicitly. Jacques says:

> The idea is that the non-existential parameters are already bound by
> the return type (and using a constraint is sufficient to be explicit).

but consider for example

type 'a expr =
| Val of 'a
| Prod : 'b expr * 'c expr -> ('b * 'c) expr

I can indeed mention the types 'b and 'c in the case where the input type is restricted to the ('b * 'c) expr case:

(* works *)
let fst = function
  | Prod (b, c) -> b
  | Val (x, y) -> Val x

(* also works *)
let fst : type b c . (b * c) expr -> b expr = function
  | Prod (b, c) -> b
  | Val (x, y) -> Val x

(* also works, though I'm not fully sure why *)
let fst : ('b * 'c) expr -> 'b expr = function
  | Prod (b, c) -> b
  | Val (x, y) -> Val x

but in the general case, I cannot get hold of the 'b, 'c variables:

(* works *)
let rec eval : type a . a expr -> a = function
  | Val v -> v
  | Prod (b, c) -> (eval b, eval c)

(* fails *)
let rec eval : type a . a expr -> a = function
  | Val v -> v
  | Prod ((b : 'b expr), (c : 'c expr)) -> (eval b, eval c)
(0011714)
garrigue (manager)
2014-06-09 23:24

Ok, I see where the confusion comes form.
Using the equational approach, this is actually

type 'a expr =
| Val of 'a
| Prod of 'b 'c. 'b expr * 'c expr constraint 'a = 'b * 'c

assuming that here the constraint is local to Prod.

So 'b and 'c are existentials too.

Allowing to reuse parameter names introduces strange scoping rules, so I'm not too enthusiastic about it.
Have to think more.
(0011746)
lpw25 (developer)
2014-06-13 19:17

>> I can see how that would make connecting the inner and outer types during pattern matching difficult.
>> However, I was rather suggesting connecting the inner type of `a` with the inner type of `ex#0` by adding a GADT equation (as in `add_gadt_equation` in ctype.ml).

> This amounts to changing the semantics. add_gadt_equation is only used for equations enforced by the GADT itself. This is not the case here.

I'm only suggesting it as a means of type-checking `Abs (type a) (f, (x : a))` without having to annotate all uses of `a`.

From this perspective, there is no observational difference between adding a GADT equation and making `ex#0` the external type of `a`, because any use of `a` which escapes the scope of the equation will result in an error anyway.

> Until you learn that (type a) immediately after a non-constant constructor in a pattern is a binding occurrence.

But that's my point, it is one more thing users must know in order to read OCaml.

- Issue History
Date Modified Username Field Change
2012-10-09 18:07 Sebastien Furic New Issue
2012-10-10 10:32 gasche Note Added: 0008232
2012-10-10 10:33 gasche Severity minor => text
2012-10-10 10:33 gasche Status new => acknowledged
2012-10-10 11:46 garrigue Note Added: 0008237
2012-10-10 11:46 garrigue Assigned To => garrigue
2012-10-10 11:46 garrigue Status acknowledged => confirmed
2012-10-12 03:17 garrigue Note Added: 0008247
2012-10-12 10:19 gasche Note Added: 0008249
2012-10-12 10:21 gasche Note Edited: 0008249 View Revisions
2012-10-12 10:56 garrigue Note Added: 0008250
2012-10-16 11:37 gasche Target Version => 4.01.0+dev
2012-10-16 11:50 Sebastien Furic Note Added: 0008262
2013-04-23 02:44 garrigue Relationship added child of 0005998
2013-06-16 16:27 gasche Note Added: 0009511
2013-06-16 16:29 gasche Note Added: 0009512
2013-07-05 14:17 frisch Note Added: 0009698
2013-07-05 15:45 lpw25 Note Added: 0009700
2013-07-05 16:51 frisch Note Added: 0009701
2013-07-05 17:35 lpw25 Note Added: 0009703
2013-07-05 17:53 frisch Note Added: 0009704
2013-07-05 18:14 lpw25 Note Added: 0009706
2013-07-05 18:48 frisch Note Added: 0009707
2013-07-05 21:18 lpw25 Note Added: 0009708
2013-07-09 10:01 frisch Note Added: 0009725
2013-07-09 15:07 gasche Note Added: 0009731
2013-08-19 15:40 doligez Target Version 4.01.0+dev => 4.02.0+dev
2014-06-04 08:50 garrigue Note Added: 0011646
2014-06-04 20:07 lpw25 Note Added: 0011655
2014-06-04 21:26 garrigue Note Added: 0011656
2014-06-04 22:06 gasche Note Added: 0011662
2014-06-04 23:16 lpw25 Note Added: 0011664
2014-06-04 23:16 lpw25 Note Edited: 0011664 View Revisions
2014-06-04 23:18 lpw25 Note Edited: 0011664 View Revisions
2014-06-04 23:20 lpw25 Note Edited: 0011664 View Revisions
2014-06-04 23:20 lpw25 Note Edited: 0011664 View Revisions
2014-06-04 23:21 lpw25 Note Edited: 0011664 View Revisions
2014-06-05 00:44 garrigue Note Added: 0011666
2014-06-05 03:33 garrigue Note Added: 0011667
2014-06-05 07:00 frisch Note Added: 0011669
2014-06-05 07:20 garrigue Note Added: 0011670
2014-06-05 08:17 gasche Note Added: 0011671
2014-06-05 09:29 frisch Note Added: 0011672
2014-06-05 09:47 lpw25 Note Added: 0011673
2014-06-05 10:23 lpw25 Note Added: 0011674
2014-06-05 11:04 frisch Note Added: 0011676
2014-06-05 11:07 lpw25 Note Added: 0011677
2014-06-05 11:09 lpw25 Note Added: 0011678
2014-06-05 17:25 gasche Note Added: 0011682
2014-06-05 23:42 garrigue Note Added: 0011689
2014-06-05 23:46 garrigue Note Edited: 0011689 View Revisions
2014-06-06 09:52 lpw25 Note Added: 0011691
2014-06-06 09:53 lpw25 Note Edited: 0011691 View Revisions
2014-06-06 12:23 lpw25 Note Edited: 0011691 View Revisions
2014-06-06 13:27 garrigue Note Added: 0011695
2014-06-06 13:28 garrigue Note Edited: 0011695 View Revisions
2014-06-06 15:48 lpw25 Note Added: 0011697
2014-06-07 02:28 garrigue Note Added: 0011700
2014-06-07 09:31 lpw25 Note Added: 0011701
2014-06-09 12:20 yallop Note Added: 0011708
2014-06-09 14:20 garrigue Note Added: 0011709
2014-06-09 16:19 gasche Note Added: 0011712
2014-06-09 23:24 garrigue Note Added: 0011714
2014-06-13 19:17 lpw25 Note Added: 0011746
2014-07-16 20:19 doligez Target Version 4.02.0+dev => 4.02.1+dev
2014-09-04 00:25 doligez Target Version 4.02.1+dev => undecided
2014-09-22 22:08 doligez Target Version undecided => after-4.02.1


Copyright © 2000 - 2011 MantisBT Group
Powered by Mantis Bugtracker