Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Univars can escape through polymorphic variants #6744

Closed
vicuna opened this issue Jan 6, 2015 · 10 comments
Closed

Univars can escape through polymorphic variants #6744

vicuna opened this issue Jan 6, 2015 · 10 comments
Assignees

Comments

@vicuna
Copy link

vicuna commented Jan 6, 2015

Original bug ID: 6744
Reporter: @lpw25
Assigned to: @garrigue
Status: assigned (set by @garrigue on 2015-01-16T15:37:54Z)
Resolution: open
Priority: normal
Severity: minor
Target version: later
Category: typing
Related to: #7741

Bug description

When a unifying closed polymorphic variant types (e.g. [< `Foo of int]) there is no equivalent of the occur_univars check. This can allow univars to escape, for example:

# let (n : < m : 'a. [< `Foo of int] -> 'a >) = 
     object method m : 'x. [< `Foo of 'x] -> 'x = fun x -> assert false end;;
  Characters -1--1:
  let (n : < m : 'a. [< `Foo of int] -> 'a >) = 

Error: Values do not match:
         val n : < m : 'a. [< `Foo of int & 'a0 ] -> 'a >
       is not included in
         val n : < m : 'a. [< `Foo of int & 'a0 ] -> 'a >

In this example, the escaped univar doesn't get very far because it triggers an module inclusion error of the n value with itself, and I have not been able to produce an example of genuine unsoundness because of this bug. However, it is clearly not correct to allow the variable to escape like this, and could potentially be unsound.

File attachments

@vicuna
Copy link
Author

vicuna commented Jan 6, 2015

Comment author: @lpw25

In fact, unification of closed polymorphic variants containing univars behaves strangely even without an escaping univar:

  # let n = 
       object method m : 'x. [< `Foo of 'x] -> 'x = fun x -> assert false end;;
    val n : < m : 'x. [< `Foo of 'x ] -> 'x > = <obj>

  # let m : < m : 'x. [< `Foo of 'x ] -> 'x > = n;;
  Characters 44-45:
    let m : < m : 'x. [< `Foo of 'x ] -> 'x > = n;;
                                                ^
  Error: This expression has type < m : 'x. [< `Foo of 'x & 'x0 ] -> 'x >
         but an expression was expected of type 'a
         The universal variable 'x0 would escape its scope

@vicuna
Copy link
Author

vicuna commented Jan 6, 2015

Comment author: @lpw25

Here's a slightly clearer example of a univar escaping:

# let (n : 'b -> < m : 'a . ([< `Foo of int] as 'b) -> 'a >) = 
     fun x -> object method m : 'x. [< `Foo of 'x] -> 'x = fun x -> assert false end;;
  Characters -1--1:
  let (n : 'b -> < m : 'a . ([< `Foo of int] as 'b) -> 'a >) = 

Error: Values do not match:
         val n : ([< `Foo of int & 'a ] as 'b) -> < m : 'a0. 'b -> 'a0 >
       is not included in
         val n : ([< `Foo of int & 'a ] as 'b) -> < m : 'a0. 'b -> 'a0 >

Note the & 'a in the argument type. That is the univar from 'a . ([< `Foo of int] as 'b) -> 'a.

@vicuna
Copy link
Author

vicuna commented Jan 16, 2015

Comment author: @garrigue

Thanks for the report.
Here is a patch that solves partly the problem.
However, for some reason the behavior is not the same in normal and principal mode.

I'm not 100% sure which is right: in theory, all your examples should be rejected, because a universal variable cannot appear inside a non-quantified flexible variant type. I.e., what you really want to write (and is accepted) is:
let n = object
method m : 'x 'o. ([< `Foo of 'x] as 'o) -> 'x = fun x -> assert false
end;;

Yet, in practice ocaml tries to do more than that, by allowing "known fields" to contain universal variables. I.e., the implementation uses row variables rather than kinded variables, so that these fields are not seen as being "under" the row variable, but rather as "siblings". This works as expected in normal mode, but , there are some problems in principal mode. Maybe it is better to follow the theory...

@vicuna
Copy link
Author

vicuna commented Jan 17, 2015

Comment author: @garrigue

Committed the patch in trunk, at revision 15780.
Leave the PR open, as the principal mode case should be improved.

@yallop
Copy link
Member

yallop commented Apr 8, 2019

@lpw25, @garrigue: can this be closed? The examples above all seem to behave correctly now -- i.e. they are accepted or trigger a clear diagnostic "The universal variable 'x would escape its scope" as appropriate.

@gasche
Copy link
Member

gasche commented May 26, 2020

I came to this code/patch while trying to understand how occur_univar failures affect type-checking in the process of reviewing #9545. I have a hard time understanding the code, so here are some beginner questions.

      (* PR#6744 *)
      let split_univars =
        List.partition
          (fun ty -> try occur_univar !env ty; true with Unify _ -> false) in
      let (tl1',tlu1) = split_univars tl1'
      and (tl2',tlu2) = split_univars tl2' in
      begin match tlu1, tlu2 with
        [], [] -> ()
      | (tu1::tlu1), _ :: _ ->
          (* Attempt to merge all the types containing univars *)
          if not !passive_variants then
          List.iter (unify env tu1) (tlu1@tlu2)
      | (tu::_, []) | ([], tu::_) -> occur_univar !env tu
      end;

What I think the code does is that it tries to "remove" conjuncts that may contain an escaping univar. It first computes all those conjuncts on both side (this is tlu1, tlu2), and then:

  1. if one of the two sides has no univar-conjunct, we fail on an escaping variable (this is the new call to occur_univar)
  2. if the two sides have at least one univar-conjunct, then we unify all them together

Obviously the first situation is correct (it makes sense to fail if there is an escaping univar). But why would the second situation be correct?

@garrigue wrote the following:

Yet, in practice ocaml tries to do more than that, by allowing "known fields" to contain universal variables

I suppose that this is related: maybe "known field" means "a conjunct with at least one case". But I don't understand the relation to what happens in this case, because:

  • Here we are not checking whether there is at least one conjunct, but whether there is at least one univar-containing conjunct. I don't see how this relate to (my best guess as to) the definition of "known field" (why would [> `Foo of int & 'a0 ] not have a known field?)
  • As far as I can tell, after all those univar conjuncts have been unified together, they are ignored by the rest of the algorithm. In particular they are never unified with the other univar conjuncts, they are not kept in the resulting conjunct list, and there is no escape error if a univar remains there. I don't understand what this means and how it could be correct.

In which of the two situations do Leo's examples, or the examples Jacques added in the testsuite, fall into? If I understand correctly Leo's example, there is only one univar-conjunct on one side and none on the other, so it is in situation (1) which I understand. Do we have testsuite examples that correspond to situation (2)?

@garrigue
Copy link
Contributor

garrigue commented May 27, 2020

The point is that Reither is instantiated by extension. So the univars are still there, on both sides, and they have been unified, so it is safe to assume that a univar on one side is equivalent to a univar on the other side. So this is sound. Of course, this is not really principal, as we unify them too early on, but it is just one of a number of features that have not so nice interactions with universal variables.

More precisely, any scheme where each univar on one side is unified with a univar on the other side would be sound, but since there is no best choice in general, we end up unifying all the univars together. This case is supposed to be rare enough, and in meaningful cases the unification should succeed.

@garrigue
Copy link
Contributor

#9612 adds a few extra examples, but due to other bugs or limitations, we cannot really exercise all cases in this code.

gasche added a commit that referenced this issue Jun 10, 2020
fix number and add examples for #6744
@github-actions
Copy link

github-actions bot commented Jun 2, 2021

This issue has been open one year with no activity. Consequently, it is being marked with the "stale" label. What this means is that the issue will be automatically closed in 30 days unless more comments are added or the "stale" label is removed. Comments that provide new information on the issue are especially welcome: is it still reproducible? did it appear in other contexts? how critical is it? etc.

@github-actions github-actions bot added the Stale label Jun 2, 2021
@garrigue
Copy link
Contributor

garrigue commented Jun 3, 2021

Looks safe to close this now, after the merging of #9612 last year.

@garrigue garrigue closed this as completed Jun 3, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

4 participants