Mantis Bug Tracker

View Issue Details Jump to Notes ] Issue History ] Print ]
IDProjectCategoryView StatusDate SubmittedLast Update
0005985OCamlOCaml typingpublic2013-04-12 21:362013-12-17 02:47
Reporteryallop 
Assigned Togarrigue 
PrioritynormalSeverityfeatureReproducibilityalways
StatusconfirmedResolutionopen 
PlatformOSOS Version
Product Version 
Target VersionlaterFixed in Version 
Summary0005985: Unexpected interaction between variance and GADTs
DescriptionThe following program

   type _ t = T : 'a -> 'a s t

is accepted if s is covariant (e.g. type 'a s = 'a), contravariant (e.g. type 'a s = 'a -> unit) or invariant (e.g. type 'a s = 'a -> 'a), but rejected if s is "bivariant" (e.g. type 'a s = int):

    # type 'a s = int;;
    type 'a s = int
    # type _ t = T : 'a -> 'a s t ;;
    Characters 4-27:
      type _ t = T : 'a -> 'a s t ;;
          ^^^^^^^^^^^^^^^^^^^^^^^
    Error: In this definition, a type variable has a variance that
           is not reflected by its occurrence in type parameters.

However, if s is abstracted then the program is accepted, even if s is instantiated with a bivariant constructor:

    # include
      (functor (S : sig type 'a s end) ->
        struct
          include S
          type _ t = T : 'a -> 'a s t
        end)
        (struct type 'a s = int end)
    ;;
    type 'a s = int
    type _ t = T : 'a -> 'a s t
    #
Tagspatch
Attached Filesdiff file icon pr5985.diff [^] (3,146 bytes) 2013-04-15 11:07 [Show Content]
diff file icon pr5985-2.diff [^] (8,731 bytes) 2013-04-15 14:24 [Show Content]

- Relationships
related to 0006275resolvedgarrigue Soundness bug related to type constraints 
child of 0005984feedbackgarrigue Variance information is not properly propagated through functor applications 
child of 0005998assignedgarrigue GADT typing and exhaustiveness bugs 

-  Notes
(0009089)
garrigue (manager)
2013-04-13 13:46

Well-spotted.

This is actually a soundness problem, as you can see by adding the code:

let x = T 3;;
match x with T y -> y;;
(0009090)
lpw25 (developer)
2013-04-13 21:26

Here is a similar example:

          OCaml version 4.00.1

  # type ('a, 'b) eq = Eq: ('a, 'a) eq;;
  type ('a, 'b) eq = Eq : ('a, 'a) eq
  # module M : sig type 'a s val se: (int s, string s) eq end = struct
      type 'a s = int
      let se = Eq
    end;;
        module M : sig type 'a s val se : (int s, string s) eq end
  # type _ t = T : 'a -> 'a M.s t;;
  type _ t = T : 'a -> 'a M.s t
  # let castT (type a) (type b) (x : a t) (e: (a, b) eq) =
      ((match e with Eq -> (x : b t)) : b t);;
    val castT : 'a t -> ('a, 'b) eq -> 'b t = <fun>
  # let st : string M.s t = castT (T 3) M.se;;

  Process ocaml-toplevel segmentation fault
(0009091)
lpw25 (developer)
2013-04-14 00:19

And another:

          OCaml version 4.00.1

  # module rec A : sig type 'a s val cast : int s B.t -> string s B.t end = struct
     type 'a s = int
     let cast (x: int s B.t): string s B.t = x
    end and B : sig type _ t = T : 'a -> 'a A.s t end = struct
     type _ t = T : 'a -> 'a A.s t
    end;;
            module rec A : sig type 'a s val cast : int s B.t -> string s B.t end
  and B : sig type _ t = T : 'a -> 'a A.s t end
  # A.cast (B.T 3);;

  Process ocaml-toplevel segmentation fault
(0009092)
lpw25 (developer)
2013-04-14 10:59

Here is an example that doesn't use GADTs;

          OCaml version 4.00.1

  # include
        (functor (S : sig type 'a s end) ->
          struct
            include S
            type 'a t = 'b * ('a -> unit) constraint 'a = 'b s
            let x = ((3, fun _ -> ()) : int s t)
          end)
          (struct type 'a s = int end)
      ;;
                  type 'a s = int
  type 'a t = 'b * ('a -> unit) constraint 'a = 'b s
  val x : int s t = (<poly>, <fun>)
  # match x with (s, _) -> s ^ "hello";;

  Process ocaml-toplevel segmentation fault

In fact this example doesn't use any "language extensions", and is present in at least 3.12.1
(0009093)
gasche (developer)
2013-04-14 18:54

I worked with these "bivariant" type variables in the work on variance for GADTs ( http://hal.inria.fr/hal-00772993/ [^] ) -- we called this variance "irrelevance", but "bivariance" is rather nice. I feel guilty for not spotting these problems when I did this work, as that would have been a rather natural occasion to experiment with these edge cases. I guess I have too much faith in the type-checker :-)

Jacques, do you think you can fix those unification issues? (To innocent bystanders reading the bugtracker: bivariant types ('a t) have the distinct property that the equation (foo t = bar t) does not necessarily implies (foo = bar), while types of any other variance do.).

I think it would be rather elegant if we could have explicit support of both bivariance/irrelevance and invariance as first-class variances in the type signature language. The syntax currently only allows '+' and '-' annotations (and '=' is implicitly available), but I think most of the type-system would extend gracefully to bivariance/irrelevance. Having it explicitly could also help us reason about those corner cases, and avoid such issues in the future. There is the delicate question of syntax; we used an unicode blurb in our paper, but I daringly suggest '0' or '*'.

    type *'a = ...
    type 0'a = ...

(Is it time to reopen discussion on PR#5688? http://caml.inria.fr/mantis/view.php?id=5688 [^] . I think we properly understand the soundness question now, but it's unclear how the extended criterion we have in our article would fare without extending the OCaml signature language for upward/backward-closed types. I think it would still be useful in most actual cases, but the limitations have made me slow at asking for work on the type-checker implementation.)
(0009094)
lpw25 (developer)
2013-04-14 19:09
edited on: 2013-04-14 19:11

If we add annotations for bivariance and invariance I would suggest using existing prefix operator symbols, for example:

  type ?'a t (* Bivariant *)

  type !'a t (* Invariant *)

Note that some bivariant definitions are safe, for example

  type 'a s = S

it is only bivariant type equations that cause problems. I wonder whether there are any uses of

  type 'a s = int

that couldn't be replaced by

  type 'a s = S of int

without too much effort.

(0009095)
garrigue (manager)
2013-04-15 09:50

As Leo found by himself, this problem is not directly related to GADTs: it is about having type variables that are not determined by type parameters.
It is not related to variance either :-)

Actually this bug is probably in OCaml from the very beginning.
I verified that the following code typechecks in version 2.00.

module F(T:sig type 'a t end) = struct
  class ['a] c x = object constraint 'a = 'b T.t val x' : 'b = x method x = x' end
end;;
module M = F(struct type 'a t = unit end);;
let o = new M.c 3;;
o#x;;
- : 'a = <poly>

In some way the introduction of variance helped, since we at least check that in the current environment at the point of definition this problem does not occur. In 2.00, the following was also accepted:

type 'a t = unit;;
class ['a] c x = object constraint 'a = 'b t val x' : 'b = x method x = x' end;;

I have attached a patch that fixes those problems by assuming that all abstract types, except those that come from a separate compilation unit, may end up being bivariant.
This is fine, but it is going to break some programs, mainly (only?) those using GADTs
As you can see in the diff file, omega07.ml had to be modified, to make type indices concrete.
For instance
  type 'a succ
becomes
  type 'a succ = Succ of 'a
This may feel kind of heavy.
On the other, this is also necessary to ensure that one can prove inequalities outside of the current, so maybe this is fine.
(0009096)
garrigue (manager)
2013-04-15 10:00
edited on: 2013-04-15 10:01

Extra notes:
1) As mentioned in my previous note, this is not directly related to variance, so that extra notations for bivariance and invariance do not help here. If you want to discuss that it would be better to open another PR.

2) I wouldn't qualify this as a "child of" 5984, but indeed another way to fix it would be to recomputed variance after each signature substitution, and detect problems at this point. However, this is a rather big change, and this would mean that signature substitution can fail. I'm waiting for feedback on 5984 before doing anything in that direction.

(0009097)
gasche (developer)
2013-04-15 10:12

> I have attached a patch that fixes those problems by assuming that all abstract types, except those that come from a separate compilation unit, may end up being bivariant.

I'm not sure I understand, why is it correct to deduce (foo = bar) from (foo t = bar t) when t is defined in a separate compilation unit? Couldn't you have the exact same problem if the separate compilation unit exposed, for example, a value of type

  ('a t, 'b t) eq

?

> This is fine, but it is going to break some programs, mainly (only?) those using GADTs [...]
> For instance
> type 'a succ
becomes
> type 'a succ = Succ of 'a
>
> [...] this is also necessary to ensure that one can prove inequalities outside of the current module, so maybe this is fine.

I have already personally adopted that style after PR#5853 ( http://caml.inria.fr/mantis/view.php?id=5853 [^] ). I believe the only robust way to make sure two types are seen as incompatible is to define them as transparent incompatible definition. I have talked about this style in e.g. ( http://stackoverflow.com/questions/15721528/type-level-arithmetics-in-ocaml [^] ), maybe that would merit a point of its own in the manual.

I'm fine with type-checking changes that require that style.

I'm not sure what is the best way to expose (in)equalities between the primitive types (which I believe is you main rationale for handling abstract types structure items separately from the rest). Short of other solutions, one might want to consider a new way to define generative abstract types,

  new type 'a foo

Transparent type definitions are of two kinds, the generative ones that are guaranteed to be incompatible with all previous definitions, and the type synonyms that may be equal to previous datatypes. There is currently no way to expose the generativeness of an abstract type, and I think that's part of the problem here.

(This may also help (partially) solve PR#5361, "Syntax to specify that a custom type is never a float" http://caml.inria.fr/mantis/view.php?id=5361 [^] )
(0009098)
garrigue (manager)
2013-04-15 10:52

> I'm not sure I understand, why is it correct to deduce (foo = bar) from (foo t = bar t) when t is defined in a separate compilation unit? Couldn't you have the exact same problem if the separate compilation unit exposed, for example, a value of type
>
> ('a t, 'b t) eq

This is fine, because from (int t, bool t) eq you won't be able to deduce that int = bool (at least since my fix of last saturday, which was an independent problem).
Should still double check.

For the question of "unique" types I agree this would be nice to have a syntax.
We need a discussion somewhere.
Personally, I would prefer introducing a new syntactic category of "qualifiers", so that we
are not limited by keywords for everything. This would have many applications.

type "unique" t (* different from all other nominal types *)
type "final" 'a t (* cannot be instantiated in a signature *)

On the other hand, we don't want to make things too complicated.
(0009099)
garrigue (manager)
2013-04-15 11:09

The first patch had a stupid bug (didn't expand types when checking variance) so that it didn't work with the original report :-(
Now fixed.
I suppose I'm going to commit this, but not in the 4.00 branch.
(0009101)
lpw25 (developer)
2013-04-15 12:44

>> I'm not sure I understand, why is it correct to deduce (foo = bar) from (foo t = bar t) when t is defined in a separate compilation unit? Couldn't you have the exact same problem if the separate compilation unit exposed, for example, a value of type
>>
>> ('a t, 'b t) eq

> This is fine, because from (int t, bool t) eq you won't be able to deduce that int = bool (at least since my fix of last saturday, which was an independent problem).

But you can still deduce that int t s = bool t s, which is what causes the problem. For example, your patch still allows the following two files:
  
  (* s.ml *)

  type ('a, 'b) eq = Eq: ('a, 'a) eq

  module M : sig type 'a s val se: (int s, string s) eq end = struct
    type 'a s = int
    let se = Eq
  end

  (* t.ml *)

  type _ t = T : 'a -> 'a S.M.s t

  let castT (type a) (type b) (x : a t) (e: (a, b) S.eq) =
    ((match e with S.Eq -> (x : b t)) : b t)

  let st : string S.M.s t = castT (T 3) S.M.se

  let _ = match st with T s -> s ^ "hello"

which seg fault:

  $ ocamlc s.ml t.ml
  $ ./a.out
  Segmentation fault
(0009102)
lpw25 (developer)
2013-04-15 12:53

The problem is that you cannot allow

  sig type 'a t = int end :> sig type 'a t end

and assume that give "type 'a t"

  'a t = 'b t => 'a = 'b

I can see two possible solutions to this problem:

1) Since "bivariance" is a conservative approximation for types like:

     type 'a t = int

   We could add more annotations and restrictions to variance. For example,
   if ?'a meant bivariance and !'a meant invariance, we could have:

     'a t = 'b t => 'a = 'b iff t is invariant, covariant or contravariant

   and

     sig type ?'a t end :> sig type 'a t end

   but not

     sig type ?'a t end :> sig type !'a t end

2) We could ban type definitions of the form:

     type 'a t = typexpr

   where typexpr does not use the type variable 'a. Note that definitions like

     type 'a t = T of int

   and

     type 'a t

   would still be fine.
(0009103)
yallop (developer)
2013-04-15 13:09
edited on: 2013-04-15 13:13

It'd be better to avoid prohibiting irrelevant/bivariant type synonyms, which are both useful and common -- see e.g. the various instantiations of the Symantics signature in "Finally Tagless, Partially Evaluated":

  R.repr (p10), L.repr (p11), R.repr (p12) etc.
  http://www.cs.rutgers.edu/~ccshan/tagless/jfp.pdf [^]

or Fold_map, Rw_mutex etc. in Core

  https://github.com/janestreet/core_extended/blob/master/lib/fold_map.ml#L123 [^]
  https://github.com/janestreet/core_extended/blob/master/lib/rw_mutex.ml#L13 [^]

Insisting that all irrelevant type constructs define datatypes would break lots of existing code and make various common idioms clunky to write and slow to run.

(0009104)
garrigue (manager)
2013-04-15 14:40
edited on: 2013-04-15 14:43

Right.
I've uploaded a correct version that only allows abstract type constructors from pervasives,
but the restriction is severe.

And the option to check the variance after substitution is not good either:

module F (X : sig type 'a s val e : (int s, bool s) eq end) : sig end = struct
   type _ t = T : 'a -> 'a s t
   ...
end

is sufficient to obtain a contradiction inside the body of F, which is not
visible outside...

At this point I see indeed no easy solution, short of adding a "non-bivariant"
variance annotation. I.e. if a type is marked as non-bivariant in a signature,
it cannot be substituted (or subtyped) by a bivariant one. But this requires
changing existing code.

Surprising we completely overlooked that, while we were aware of the problem
with abstraction and injectivity...

(0009105)
lpw25 (developer)
2013-04-15 14:53

> Insisting that all irrelevant type constructs define datatypes would break lots of existing code and make various common idioms clunky to write and slow to run.

The run-time cost could be eliminated with support for something like:

  type 'a s = new int

where new is similar to private but allows coercions in either direction, so that both:

  (3 :> string s)

and

  let f (x: bool s) = (x :> int)

would be allowed.

However, it would still require existing code using these types to be rewritten in a much more verbose style.
(0009106)
gasche (developer)
2013-04-15 15:30
edited on: 2013-04-15 15:35

There is a natural order relation between the four variances,

  bivariant/irrelevant <= {co,contra}variant <= invariant

You should always allow (sig type v'a t end :> sig type w'a t end) when the variances (v) and (w) are in this order relation (v <= w).

It might be interesting to have an "injective" modality in type signature, correspondingto the ! qualifier of Leo (that is claiming that the actual implemented variance of the constructor is not bivariance), but that would require that variance inference never over-approximates the variance, and I'm not sure that's a reasonable assumption.

What breaks if we simply stop making the assumption that abstract types are injective? I think the examples of Jeremy Yallop would still be accepted (contrary to option (2) of Leo above), but without a change to the existing code you would have weaker unequalities after the type abstraction -- and a "unique" modality on abstract types would not work for type synonyms.

Note that having explicit *inequalities* is only important for code using GADTs (so probably not this existing code Jeremy is talking about), as to my knowledge it is only used for finer dead case analysis in GADTs matching.

One limitation of the current type system is that the only available way to turn a structural synonym into a nominal type (making a variant with a single constructor) has a runtime cost. It may be interesting to have a form of "nominal type synonym", and it could be naturally presented as either:

- a notion of one-constructor variants whose parameter would be erased at runtime (I think it should be a separate notion from the usual variants as the implementation probably relies on variant types having a specific runtime representation)

    type "wrap" 'a foo = Bar of blah

- or a variant of "private" types that allows subtyping in both directions, but does not produce a type equality (this has the downside of the making the subtyping relation non-antisymmetric, which I believe is possible in presence of explicit subtyping only, but a bit weird and may bite the type system in the future)

    type "equiv" 'a foo = blah

Note that any of these choice already allows to solve the injectivity problem (without introducing "injective" and/or "unique" modalities), as you can always make unicity *transparent* by exposing a nominal convertibility with an abstract type. Instead of writing

  sig
    type 'a t (* fully abstract, injectivity/unicity status unclear *)
  end

you would write

  sig
    type 'a u (* fully abstract *)
    type "equiv" 'a t = 'a u (* nominal, injective, yet cost-free *)
  end


Edit: I wrote my post simultaenously with Leo's, so indeed his "new" idea is exactly my "equiv". I'll note that the "wrap" thing is done in Haskell (which allows 'newtype' to use the variant syntax with only one constructor, with the static guarantee that there is no runtime representation change), and that I occasionally though of "equiv"/new as useful independently of this problem, to define things like "type dollar = int;; type euro = int", without the risk for genuine mistake, without the runtime cost of a one-variant constructor.

(0009107)
lpw25 (developer)
2013-04-15 15:38
edited on: 2013-04-15 15:46

> However, it would still require existing code using these types to be rewritten in a much more verbose style.

Thinking about it some more, this might not be too bad. The only reason to use a type definition like:

  type 'a t = int

is so that you can hide the equality through a module signature. This means that external code must preserve the type in the type variable, but internal code can simply ignore it.

Any type definition of the form:

  type 'a t = typexpr

where typexpr does not mention 'a can be rewritten as:

  type _t = typexpr
  type 'a t = new _t

Then the interfaces to the module can simply cast from t to _t (or vice versa). So most of the internal code can be left as it is.

(0009108)
garrigue (manager)
2013-04-15 15:43

> It might be interesting to have an "injective" modality in type signature, correspondingto the ! qualifier of Leo (that is claiming that the actual implemented variance of the constructor is not bivariance),

Actually, it would be enough to have a "do not vanish" modality, as I think Leo points too.
This has the advantage of including all concrete types (everything that is neither abstract nor alias).

> but that would require that variance inference never over-approximates the variance, and I'm not sure that's a reasonable assumption.

This just means that you must separately under-approximate.
This is what I do in my patch . I think the patch is correct, it is just probably too restrictive alone.
(0009109)
gasche (developer)
2013-04-15 15:48

Doesn't

  type 'a t = Foo

also vanish?
(0009110)
lpw25 (developer)
2013-04-15 15:58

>Doesn't
>
> type 'a t = Foo
>
>also vanish?

No, because it doesn't allow int t to be unified with bool t, which is the root of the problem.
(0009112)
gasche (developer)
2013-04-15 17:07

Indeed, thanks. They are not unifiable (therefore not equal according to GADTs), but still "equivalent" in the sense of being in the antisymmetric closure of explicit subtyping.

# type 'a t = Foo;;
# let test x = (x : int t :> bool t);;
val test : int t -> bool t = <fun>

This supports the idea that distinguishing type equality and equivalence is not a crazy idea (it's already the case in the existing system!), which improves my confidence in the idea of "equiv"/new nominal synonyms (or rather "equivalences"?).
(0009116)
yallop (developer)
2013-04-15 18:59

I wonder if we can lift the notion of value subtyping to the module level. If so, that'd certainly ease the pain of explicit coercions between types that are equivalent but not equal (i.e. Leo's "new" / Gabriel's "equiv" types). Suppose we have a signature

      module type S =
      sig
        type 'a t
        val f : int t -> string t
        ...
      end

that we want to instantiate with the following module:

      module M =
      struct
         type 'a t = new int
         let f x = x + 1
         ...
      end

As things stand we'd have to coerce each function individually in order for the ascription (M : S) to succeed:
  
      let f x = x + 1
      let (f :> int t -> string t)

However, if we extended the semantics of signature ascription to perform subtyping coercion, either directly or by adding a second form (M :> S), then this syntactic heaviness would be avoided.
(0009118)
gasche (developer)
2013-04-15 19:19

In fact, my analogy with private types does not quite work, as private types are exactly equal to their definition inside the module boundaries, and therefore not considered distinct from it even outside the boundary (which would be unsound as you could publish an explicit equality witness). My idea was to simply write (type 'a t = int) inside the boundary, but that cannot work.

So I think my idea of "equiv" type is not quite right for the purposes of having strong inequalities. Maybe Leo's idea was subtly different and is not affected.
The idea of an "erasable" one-constructor variant would still stand. But it also has the problem of being a bit painful to handle manually (as Jeremy notes in his examples).

Jeremy, I think the specifics of your proposal are a bit disappoiting. If we can have the equality ('a t = int) inside the boundary, it's unsound for the same reason as above. If not, there are still a fair amount of cases where the ascription-directed coercions will not suffice (they will only help for the types that are exposed through the interface, which in practice may end up being only a little part of the actual values flowing in the implementation).
(0009119)
lpw25 (developer)
2013-04-15 19:33

While private types are usually only used in signatures, you are allowed to create a type using private in a structure: you just won't be able to create any members of that type.

Similarly, you would use "new" within the module rather than in the signature. This solves the problem because with the following definition:

  type 'a t = new int

'a t is never equal (in unification terms) with int, but can be cast to and from int (effectively being equal in subtyping terms).

As I said above, any type defined as:

  type 'a t = typexpr (* typexpr does not contain 'a *)

could be rewritten as:

  type _t = typexpr
  type 'a t = new _t

so that _t would be used within the module and t would be used outside the module.

Jeremy's point is that this will involve having a lot of definitions of the form:

  let f: _t -> _t list = ...
  let f = (f :> 'a t -> 'a list)

This would be less annoying if it could be done automatically using the signature of the module.
(0009120)
yallop (developer)
2013-04-15 19:42

Gabriel: the idea is that the equality doesn't hold anywhere, but the equivalence holds inside the boundary, or wherever the type equation is visible. I don't think it introduces any unsoundness.

The idea is just to have a module-level analogue of subtyping, which conceptually applies :> componentwise, much like the way subtyping works for (say) object types.

   (object
     method a : [`A] = `A
     method b : [`B] = `B
   end
    :>
     < a : [`A|`B];
       b : [`B|`C]
     >)

Module-level subtyping doesn't in itself do anything solve the soundness issue discussed above; it simply makes the proposed new/equiv types palatable, drastically reducing the number of coercions in the source. Without it I'm not sure that type equivalences are really practicable.

Do you have any concerete examples where there are likely to be other values in need of coercion than those in the signature? I suspect that, as Leo says above, irrelevant type synonyms are only really useful when the equality is hidden behind a signature.
(0009121)
gasche (developer)
2013-04-15 20:54

Re. the "new" feature itself:

That would still be a departure from the way "private" is handled right now: if you define (type t = private int) in an implementation, you don't have an unequality between (t) and (int), as you can check e.g. with

  type priv = private int

  type _ t =
  | A : priv t
  | B : int t
 
  let should_be_exhaustive : int t -> unit = function
  | B -> ()
  (* exhaustiveness warning *)

I think if "new" is defined in terms of subtyping, it should really be consistent in its behavior with "private". Maybe we can change "private" so that implementation provides unequality, but I find it quite distasteful that

  (* some stuff *)
  type t = private foo (* or: new foo *)
  (* other stuff *)

would have different typing properties than

  module Parametrized (M : sig
    (* some stuff *)
    type t = private foo
  end) = struct
    include M
    (* other stuff *)
  end

Re. the signature ascription: I had an example in mind of the form

   module M : sig
     type 'a _t
     type 'a t = new 'a _t
     type 'a t_list = Nil | Cons of 'a t * 'a t_list
     val foo : 'a t_list -> int
   end = struct
     type 'a _t = int
     type 'a t = new int
     type 'a t_list = Nil | Cons of 'a t * 'a t_list
     let rec foo = function
       | Nil -> 0
       | Cons (x, xs) -> (x :> int) + foo xs
   end

In this case, the fact that I want my externally visible ('a t_list) to be defined in term of the "proper" type ('a t), the one with the good inequality guarantees, forces me to use coercions when manipulating values of type ('a t_list) internally, because I don't think it would make sense to somehow "coerce" the datatype definition.

(So ascription-derived coercion seem like a good idea, but if I've understood correctly they're more of an ad-hoc help than a fundamentally game-changing feature when evaluating the idea of "new" types, so maybe they can be discussed separately.)
(0009122)
lpw25 (developer)
2013-04-15 21:55

> That would still be a departure from the way "private" is handled right now: if you define (type t = private int) in an implementation, you don't have an unequality between (t) and (int), as you can check e.g. with
>
> type priv = private int
>
> type _ t =
> | A : priv t
> | B : int t
>
> let should_be_exhaustive : int t -> unit = function
> | B -> ()
> (* exhaustiveness warning *)

These exhaustiveness warnings are not done using equality (in the sense of unification). They use the more loosely defined "compatibility" as defined by Ctype.mcomp. In fact priv and int are not equivalent, as demonstrated by:

  # let f (x: priv) = (x: int);;
  Error: This expression has type priv but an expression was expected of type
           int
(0009124)
gasche (developer)
2013-04-15 22:44

Ok, but I think my point about

  module Parametrized (M : sig
    (* some stuff *)
    type t = private foo
  end) = struct
    include M
    (* other stuff *)
  end

still stands: if in "some stuff" you include an equality witness (t, foo) eq -- and those are definable in a module with the stronger implementation (type t = private foo) -- you can then get a context where priv and int end up being unifiable.

On the contrary, a transparent one-element variant never allows convertibility, because it is not the abstraction of a stronger definition that does. (Of course, the downside of this property is that it always require a verbose wrapping/unwrapping.)
(0009125)
lpw25 (developer)
2013-04-15 23:23
edited on: 2013-04-15 23:25

> if in "some stuff" you include an equality witness (t, foo) eq -- and those are definable in a module with the stronger implementation (type t = private foo) -- you can then get a context where priv and int end up being unifiable.

t and foo can only ever be unifiable if somewhere there is a definition

  type t = foo

An equality witness cannot be created based on type t = private foo.

Since my proposal was to ban definitions of the form:

  type 'a t = int

there is no way for int and 'a t to become unifiable (or more importantly for int t and bool t to become unifiable).

Note that I am not proposing any change to how "private" is interpreted, and that "new" is a natural extension to "private"

(0009126)
garrigue (manager)
2013-04-15 23:33

No, Gabriel.
To define (t,foo) eq, you really need to have somewhere (type t = foo), because
eq's parameters are invariant.

Actually, the (type 'a t = new foo) construct is intended to be exactly equivalent
to (type 'a t = T of foo), with the coercions corresponding to (fun x -> T x) and
(fun (T x) -> x). And the use of of coercion through signatures correspond to
inserting those coercions automatically, but only at places where you can coerce.
I.e. not at invariant positions, and index parameters in GADTs are invariant.

So I think this makes sense, and could be implemented rather easily.
Easier than real subtyping at interface level, since here we don't care about polarities,
but just whether we are at a variant or invariant position.

Of course, signature subtyping becomes incomplete, because the way we instantiate
variables may have an impact to occurences at invariant positions.
Maybe this can be solved by delaying instantiation at variant positions, so that we
handle them after invariant positions, but this still leaves the case of non-generalized
type variables.
(0009127)
gasche (developer)
2013-04-16 00:13

> To define (t,foo) eq, you really need to have somewhere (type t = foo), because
eq's parameters are invariant.

Indeed. My point was that if we allow

  module M : sig
    type t = private foo
    val eq : (t, foo) eq
  end = struct
    type t = foo
    let eq = Refl
  end

my gut feeling if that we should consider the implementation

  type t = private foo
  external eq : (t, foo) eq = "..."

to be admissible, and not make the type system unsound.

Now you may want to handle "new" differently from "private", or simply drop that property (that doesn't really exists anyway) that abstract implementations can be thought about like abstract signature of a module argument.

Said otherwise, if you allow

  module M : sig
    type t = new foo
  end = struct
    type t = foo
  end

then you cannot have (type t = new foo) verify type inequalities. If you don't allow that, they're not really consistent with how we use private types.
(0009128)
garrigue (manager)
2013-04-16 01:25
edited on: 2013-04-16 01:28

Gabriel,

> my gut feeling if that we should consider the implementation
> type t = private foo
> external eq : (t, foo) eq = "..."
> to be admissible, and not make the type system unsound.

No, please no.
Forging equality witness is the surest way to break havoc with the type system.
An equality witness is stronger than a retyping function:
  external foo_of_t : t -> foo = "%identity"
This function only allows you to convert a t into a foo if you have one at hand.
But with an equality witness, you can do that in both directions,
for any type context.
By the way, it is known that Obj.magic can break the compiler, even if the
types are right...

Further than that, private types were specifically created so that one does not
have to use retyping functions.

> If you don't allow that, they're not really consistent with how we use private types.

Right. "new" types are not private types, in that you cannot turn something else
into a "new" type. They are just a bit similar in that they look like aliases, but
are in fact nominal.

Leo,

> Since my proposal was to ban definitions of the form:
> type 'a t = int

I'm not sure of the intended meaning of this sentence, but we're clearly
not going to "ban" anything. The language is already there. But we can
allow easily usable alternative forms.

As a result, even with "new" and with signature subtyping, we still need
"don't vanish" annotations on type parameters for abstract types.
Something like
   type #'a t (* invariant not vanishing *)
   type #+'a set (* covariant not vanishing *)
   type #-'a gobject (* contravariant ... *)
Maybe we could have a compiler switch to make it the default, but then
we also need the opposite annotation "may vanish".
   
OK, lots of changes needed, so I'm going to make a branch.

(0009138)
lpw25 (developer)
2013-04-16 14:19
edited on: 2013-04-16 14:44

>> If you don't allow that, they're not really consistent with how we use private types.

>Right. "new" types are not private types, in that you cannot turn something else
into a "new" type. They are just a bit similar in that they look like aliases, but
are in fact nominal.

Actually, I think that you can safely allow:

  module M : sig
    type t = new foo
  end = struct
    type t = foo
  end

The point is that you can never have:

  type !'a t = foo (* where ! means "is injective" *)

so any definition

  type !'a t = new foo

must have been defined with "new" in all contexts.

>As a result, even with "new" and with signature subtyping, we still need
"don't vanish" annotations on type parameters for abstract types.
>Something like
> type #'a t (* invariant not vanishing *)
> type #+'a set (* covariant not vanishing *)
> type #-'a gobject (* contravariant ... *)
>Maybe we could have a compiler switch to make it the default, but then
we also need the opposite annotation "may vanish".

Perhaps it would be best to add "may vanish"/"may not be injective" (I suggest ?'a) and then say that:

  type 'a t = int (* here injectivity is inferred, like variance *)

can only be abstracted as

  type ?'a t

I think that this will need fewer annotations than using an "is injective" annotation. We should only have to annotate non-injective types and the functors that are used with them. Whereas, we would have to add "is injective" annotations to any type that might be used with a GADT or constraint.

(0009141)
gasche (developer)
2013-04-16 14:50
edited on: 2013-04-16 14:51

Leo: you are saying that abstracting an existing type definition with a "new" does not make it injective, but that *implementations* defined with are injective (and this can be preserved by later signature abstractions).

My previous mental model was that "new" was the component introducing generativity/nominalism. In your model, "new in implementations" introduces generativity (and, from there, injectivity), but "new in signatures" only expresses a very thin layer of abstraction (consistently with what "private" does in signatures, which is nice).

One difference between the two visions is the question of equalities betwen type expressions with *distinct* type constructors. If you have two abstract types (type 'a t) and (type 'a u), in general you cannot assume that (foo t) is different from (foo u). If you have (type 'a t2 = new 'a t) and (type 'a u2 = new 'a u), in my model you can derive that (foo t2) is distinct from (foo u2), while it is not the case in yours. This distinction is not captured by injectivity.

(0009145)
garrigue (manager)
2013-04-16 15:37
edited on: 2013-04-16 15:38

> Perhaps it would be best to add "may vanish"/"may not be injective" (I suggest ?'a)

> I think that this will need fewer annotations than using an "is injective" annotation. We should only have to annotate non-injective types and the functors that are used with them. Whereas, we would have to add "is injective" annotations to any type that might be used with a GADT or constraint.

I agree that it would require less annotations, and this is why I suggested adding a mode for that.
However, this cannot be the default, as it would break almost _all_ existing programs!

By the way I created the branch branches/non-vanishing, but at this point it only fixes the hole.
Other additions are in progress.

(0009148)
gasche (developer)
2013-04-16 16:49
edited on: 2013-04-16 17:17

After discussing the matter with Didier (he was the only person around able to decipher Jacques killing 2.00 example), I have a better understanding of the issue. Sorry, Jacques, for having been a bit slow to understand some of the subtleties!

- "injectivity", as is (not generativity/nominality, etc.), is necessary to fix the "constraint" example, whose soundness is motivated solely by an injectivity argument. Adding Jacques' #'a annotation on type parameters is the right solution for this problem.

- GADTs have a (secondary) need for type inequalities. We used to define abstract types for their phantom components, but this raises two independent issues:

  type z
  type 'a s
  type _ nat = Z : z | S : 'n nat -> ('n s) nat

  let two : (z s s) nat -> unit = function (S (S Z)) -> ()

  (1) A phantom abstract type may not be injective. This example relies on the fact that no ('n s) other than ('n = z s) is equal to (z s s), that is the injectivity of 's'. One may decide that private types introduced *in implementations* are always injective, or (my preference) allow the following as an abstract implementation

    type #'a s

  (2) Two abstract types constructors may not be distinct. This example also relies on the fact that the Z branch is dead, that is ('a s != z) for any 'a. If we really don't want to have to write

     type z = Z
     type 'a s = S of 'a

    we need to think about a way to express that two abstract types are distinct from each other, without giving their definition. This is a less pressing need than injectivitiy annotations (it doesn't affect soundness), but still important for GADTs (we want to know which inequalities are derivable). "new" (Jacques' version rather than Leo's) might be a solution to this problem. The problem with only assuming that abstract type definitions have this property (as I believe is, soundly, currently done) is that there is no way to preserve these inequalities through a signature ascription, which breaks modularity.

(0009174)
garrigue (manager)
2013-04-22 06:03

Update: branches/non-vanishing is now ready for testing.
It implements most of the ideas discussed here:

   type (#'a) t for injective type parameters
   type 'a t = new int for concrete types convertible by subtyping

Also, subtyping is allowed during module subtyping.
I have done only light testing myself, but at least the typing testsuite goes through,
and some counter-examples properly fail.
(0009180)
garrigue (manager)
2013-04-22 13:03

As a small note, you can look at the code in branches/non-vanishing/testsuite/tests/typing-gadts/omega07.ml for how to define type indexes that work even outside of the local module.
The idea is just to define them as normal datatype with one constructor.

   type ('a,'b) fk = FK

This way they are distinct from any type with a different constructor name, and all datatypes are
injective, so there is no need to add arguments to FK.

All the infrastructure with new and # is only intended for abstract types, i.e. when you don't want to reveal the internal representation of. In some way this is not strictly needed. One can always exports two types:

  type 'a t0
  type 'a t = MyT of 'a t0

This is sufficient to make t injective and distinct from other types.
This just happens to be rather heavy.

Of course, we are still to solve the problem of providing "unique" abstract types.
(0009181)
gasche (developer)
2013-04-22 13:36

> Of course, we are still to solve the problem of providing "unique" abstract types.

What about

  new type 'a t

considered semantically equivalent to

  type 'a t0
  type 'a t = new 'a t0

?

Does this have the expected properties of being provably distinct from all other known-to-be-concrete (by transparency or exhibition of new-ness) datatypes?


On the other hand, one may want (in the general case where "new" here is just understood as a marker for "uniquenes")

  module M : sig
    new type ('a, 'b) fk
  end = struct
    type ('a, 'b) fk = FK
  end

to type-check.


It would be tempting to have new-in-signature be a valid signature for both new-in-structures and (nominal) concrete datatypes.
(0009186)
garrigue (manager)
2013-04-22 15:36

I thought of something like that, say

  type 'a t = new

to keep a closer syntax.
However, the problem is not the syntax.
It's just very hard to get it right.

First problem, you cannot allow it to wrap normal datatypes, because then you wouldn't be able to distinguish new types from datatypes:

module M : sig
  type 'a t = new
  type 'a u = T
end = struct
  type 'a t = T
  type 'a u = 'a t = T
end

This can be kind of fixed by adding "new" datatypes too:

  type 'a t = new T

Here a "new" datatype is just like a normal datatype, except
that you cannot alias it, i.e. type 'a u = 'a t = T becomes illegal.

The next problem is how to prevent aliasing.
Part of the solution is that for a type to be exported as "new",
it must have been defined as "new".
This alone already eliminates "include" as it is interpreted in
terms of aliases.
We also have to prevent "copying" a module:
  module M2 = M
which could create another type sharing the same new.
We must also be careful about functors.
And I'm not sure the list is exhaustive.
For instance, dynamic types are going to be a problem
too (actually their witnesses share very similar problems)

So yes, this is the basic idea, but integrating it in the language
is not going to be that easy.
(0009187)
gasche (developer)
2013-04-22 16:40

This all looks quite tricky; in particular, the export of datatype equality makes thing less generative than we would expect them to be.

Maybe we could consider forgetting about "unique" abstract datatypes and keep the
    type ('a, 'b) fk = Fk
style for now, as long as the rest of the injectivity+new system stabilizes?

Do we need unique abstract types for anything other than the already suspicious phantom type tricks with GADTs?
(0009191)
garrigue (manager)
2013-04-23 09:56

I'm not sure what you mean by "already suspicious phantom type tricks with GADTs".
A typical example where we need unique abstract types is type witnesses:

    type _ t =
      | Tint : int t
      | Tarrow : 'a t * 'b t -> ('a -> 'b) t
      | Tarray : 'a t -> ('a array) t
      | Tqueue : 'a t -> ('a Queue.t) t

Here int and array are handled by the special case for builtin types, so we now
that they are unique. But this is not the case for Queue.t.
As a result the above type cannot be defined, even in the non-vanishing branch.
That part is easily solved by adding an injectivity annotation in queue.mli.
However, if we cannot use type annotation to discard cases in pattern matching:

   let f (type a) (x : a Queue.t t) =
      match x with Tqueue ta -> ...

becomes partial, and to make it complete we must write all cases, since Queue.t
could be absolutely anything.

Another useful point of unique abstract types is that they do not depend on
their names: you cannot happen to create two indistinguishable types just
because you gave them the same name.

But of course this is not as bad as the injectivity problem, since at least you
can complete your pattern-matching with some "assert false".
(0009194)
garrigue (manager)
2013-04-23 12:54

Just for fun, I've added abstract new types and new datatypes:

  type t = new
  type t = new A | B
  type t = new {x : int}

The idea is that an abstract new type can only abstract a new type, and since they do not allow aliasing, they are all different.
A new datatype is just like a normal datatype (no coercion needed), except that it cannot have an equation, and cannot be the target of the equation of a normal datatype.
At the module level, strengthening is disabled, so that different paths are really different types
(just copies of the original one).

At this point I have no proof of correctness whatsoever, but it might be an alternative to variance annotations (any new type is automatically injective), which additionally ensures uniqueness.
(0009195)
garrigue (manager)
2013-04-23 13:43

Unfortunately, this is broken:

module M : sig type t = new type _ is_t = I : ('a,t) comp -> 'a is_t end =
  struct type t = new int type _ is_t = I : ('a,t) comp -> 'a is_t end;;

module N = M;;

let e = M.I Eq;;
let N.I e' = e;; (* (M.t, N.t) comp *)
match e' with Diff -> false;; (* doesn't warn *)
(0009201)
garrigue (manager)
2013-04-24 06:06

There is now a new version of abstract and concrete new types in branches/abstract-new

The idea is that taking a copy of a module will turn all new types into aliases to the original definition (when there is a valid path), or just turn them into abstract types when there is no valid path.

# module M = struct type t = new int end;;
module M : sig type t = new int end
# module N = M;;
module N : sig type t = M.t end

The same thing of course happens if you path a named module to a functor, so that you cannot preserve the newness there.

For this reason, while this is a nice approach, it doesn't seem to be a full replacement for injectivity annotations: they are the only way to keep abstract types injective in functors, or when the original definition is not visible.

Again, examples are in typing-private/new.ml
(0009225)
frisch (developer)
2013-04-29 16:32

Setting Target Version to 4.01: we probably need to do something before this release, at least assuming that all parametrized abstract types (except predefined ones such as lazy, array) are non injective.
(0009235)
garrigue (manager)
2013-04-30 07:29

Fixed in trunk. (commit 13634)
Injectivity is now properly inferred, but there is no syntax for asserting injectivity of abstract types.
(0009487)
frisch (developer)
2013-06-14 13:38
edited on: 2013-06-14 13:38

Keeping open, because of "no syntax for asserting injectivity of abstract types", but this doesn't look like a blocking issue for 4.01 anymore.


- Issue History
Date Modified Username Field Change
2013-04-12 21:36 yallop New Issue
2013-04-13 13:46 garrigue Note Added: 0009089
2013-04-13 13:46 garrigue Assigned To => garrigue
2013-04-13 13:46 garrigue Status new => confirmed
2013-04-13 21:26 lpw25 Note Added: 0009090
2013-04-14 00:19 lpw25 Note Added: 0009091
2013-04-14 10:59 lpw25 Note Added: 0009092
2013-04-14 18:54 gasche Note Added: 0009093
2013-04-14 19:02 gasche Relationship added parent of 0005984
2013-04-14 19:03 gasche Relationship deleted parent of 0005984
2013-04-14 19:03 gasche Relationship added child of 0005984
2013-04-14 19:09 lpw25 Note Added: 0009094
2013-04-14 19:11 lpw25 Note Edited: 0009094 View Revisions
2013-04-15 09:50 garrigue Note Added: 0009095
2013-04-15 09:51 garrigue File Added: pr5985.diff
2013-04-15 10:00 garrigue Note Added: 0009096
2013-04-15 10:01 garrigue Note Edited: 0009096 View Revisions
2013-04-15 10:12 gasche Note Added: 0009097
2013-04-15 10:52 garrigue Note Added: 0009098
2013-04-15 11:06 garrigue File Deleted: pr5985.diff
2013-04-15 11:07 garrigue File Added: pr5985.diff
2013-04-15 11:09 garrigue Note Added: 0009099
2013-04-15 12:44 lpw25 Note Added: 0009101
2013-04-15 12:53 lpw25 Note Added: 0009102
2013-04-15 13:09 yallop Note Added: 0009103
2013-04-15 13:13 yallop Note Edited: 0009103 View Revisions
2013-04-15 14:24 garrigue File Added: pr5985-2.diff
2013-04-15 14:40 garrigue Note Added: 0009104
2013-04-15 14:43 garrigue Note Edited: 0009104 View Revisions
2013-04-15 14:53 lpw25 Note Added: 0009105
2013-04-15 15:30 gasche Note Added: 0009106
2013-04-15 15:34 gasche Note Edited: 0009106 View Revisions
2013-04-15 15:35 gasche Note Edited: 0009106 View Revisions
2013-04-15 15:38 lpw25 Note Added: 0009107
2013-04-15 15:43 garrigue Note Added: 0009108
2013-04-15 15:44 lpw25 Note Edited: 0009107 View Revisions
2013-04-15 15:46 lpw25 Note Edited: 0009107 View Revisions
2013-04-15 15:48 gasche Note Added: 0009109
2013-04-15 15:58 lpw25 Note Added: 0009110
2013-04-15 17:07 gasche Note Added: 0009112
2013-04-15 18:59 yallop Note Added: 0009116
2013-04-15 19:19 gasche Note Added: 0009118
2013-04-15 19:33 lpw25 Note Added: 0009119
2013-04-15 19:42 yallop Note Added: 0009120
2013-04-15 20:54 gasche Note Added: 0009121
2013-04-15 21:55 lpw25 Note Added: 0009122
2013-04-15 22:44 gasche Note Added: 0009124
2013-04-15 23:23 lpw25 Note Added: 0009125
2013-04-15 23:25 lpw25 Note Edited: 0009125 View Revisions
2013-04-15 23:33 garrigue Note Added: 0009126
2013-04-16 00:13 gasche Note Added: 0009127
2013-04-16 01:25 garrigue Note Added: 0009128
2013-04-16 01:28 garrigue Note Edited: 0009128 View Revisions
2013-04-16 14:19 lpw25 Note Added: 0009138
2013-04-16 14:43 lpw25 Note Edited: 0009138 View Revisions
2013-04-16 14:44 lpw25 Note Edited: 0009138 View Revisions
2013-04-16 14:50 gasche Note Added: 0009141
2013-04-16 14:51 gasche Note Edited: 0009141 View Revisions
2013-04-16 15:37 garrigue Note Added: 0009145
2013-04-16 15:38 garrigue Note Edited: 0009145 View Revisions
2013-04-16 16:49 gasche Note Added: 0009148
2013-04-16 16:58 gasche Note Edited: 0009148 View Revisions
2013-04-16 17:17 gasche Note Edited: 0009148 View Revisions
2013-04-18 14:12 gasche Relationship added related to 0005989
2013-04-18 14:20 garrigue Relationship deleted related to 0005989
2013-04-22 06:03 garrigue Note Added: 0009174
2013-04-22 13:03 garrigue Note Added: 0009180
2013-04-22 13:36 gasche Note Added: 0009181
2013-04-22 15:36 garrigue Note Added: 0009186
2013-04-22 16:40 gasche Note Added: 0009187
2013-04-23 02:06 garrigue Relationship added child of 0005998
2013-04-23 09:56 garrigue Note Added: 0009191
2013-04-23 12:54 garrigue Note Added: 0009194
2013-04-23 13:43 garrigue Note Added: 0009195
2013-04-24 06:06 garrigue Note Added: 0009201
2013-04-29 16:29 frisch Severity minor => major
2013-04-29 16:29 frisch Target Version => 4.02.0+dev
2013-04-29 16:31 frisch Target Version 4.02.0+dev => 4.01.0+dev
2013-04-29 16:32 frisch Note Added: 0009225
2013-04-30 07:29 garrigue Note Added: 0009235
2013-06-14 13:38 frisch Note Added: 0009487
2013-06-14 13:38 frisch Target Version 4.01.0+dev => 4.02.0+dev
2013-06-14 13:38 frisch Note Edited: 0009487 View Revisions
2013-07-12 18:15 doligez Target Version 4.02.0+dev => 4.01.1+dev
2013-10-10 17:45 frisch Target Version 4.01.1+dev => later
2013-10-10 17:46 frisch Severity major => feature
2013-12-16 15:17 doligez Tag Attached: patch
2013-12-17 02:47 garrigue Relationship added related to 0006275


Copyright © 2000 - 2011 MantisBT Group
Powered by Mantis Bugtracker