Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Incorrect (existential) type scope tracking #7835

Closed
vicuna opened this issue Aug 3, 2018 · 3 comments
Closed

Incorrect (existential) type scope tracking #7835

vicuna opened this issue Aug 3, 2018 · 3 comments
Assignees

Comments

@vicuna
Copy link

vicuna commented Aug 3, 2018

Original bug ID: 7835
Reporter: @trefis
Assigned to: @trefis
Status: resolved (set by @trefis on 2018-09-21T19:16:35Z)
Resolution: fixed
Priority: normal
Severity: minor
Category: typing
Monitored by: @hhugo @Drup

Bug description

Since the introduction of GADTs, we've been reserving a slot for up to a
thousand existentials to be introduced during typing of pattern matches.

This is done by: https://github.com/ocaml/ocaml/blob/4.00/typing/typecore.ml#L2741

The reason for this is that scoping for type constructors is tracked using their
binding time.
If that line wasn't present, then if we did introduce new types, their binding
time would be above the current level. Which means that any use of such a type
would be out of the scope of the type constructor.

#1951 proposes to explicitely check that we
do not introduce more than a thousand type constructors, instead of letting them
through and then potentially reporting weird (i.e. incorrect) scope errors.

The hole in that reasonning (and in the GPR linked to above) is that type
constructors are not the only identifiers created during pattern matches:
pattern variables will also imply creating some idents, for example.

https://github.com/trefis/existentialism/blob/master/too_many_existentials/main.ml
illustrates that.

But pattern variables are not the only thing implying the creation of idents
though, otherwise we could almost dismiss the problem with "meh, no reasonable
hand written code will ever contain more than X many pattern variables, so it's
alright" (which I assume was the rational behind the 1K limit on existential per
match).

An example as trivial as https://github.com/trefis/existentialism/blob/master/wrong-escape/main.ml
can also get us over the limit if [Sub] contains a few too many idents. Indeed
in that example, thanks to the 1st line introducing a module alias, a
substitution on the signature of [Sub] happens during the typing of the
[M.Sub.Pack] pattern, which will generate as many fresh idents as there are
idents in [Other_module.Sub].

If you clone that repository and run "make failure OCAMLC=/path/to/trunk/ocamlc"
in the wrong-escape directory, then you'll get:

File "main.ml", line 6, characters 11-12:
Error: This expression has type $Pack_'a
       but an expression was expected of type 'a
       The type constructor $Pack_'a would escape its scope

The error also happens at the tip of the 4.07 branch.

This example was encountered when trying to compile janestreet's code base with
the tip of the 4.07 branch.

The reason why it wasn't a problem on 4.07.0 is that the slot happened to be
made much bigger (as was already discussed), which makes it much less likely
to create so many idents in real code.

The reason why it wasn't a problem before 4.07 is that the 1k limit was actually
just a big lie. The scope of type constructors introduced for existential
variables wasn't tracked using the ident binding time: we were looking up the
type declaration in the environment, and checking at which level it was
introduced: https://github.com/ocaml/ocaml/blob/4.00/typing/ctype.ml#L689
That is, you could have added 10k existential, and the scoping would have been
correct for each of them.
https://github.com/trefis/existentialism/blob/master/over_999/main.ml is an
example of this.

However, that piece of code disappeared with #1609


There are a few possible ways to fix (or minimize the issue):

  1. Let's remove the double meaning associed to stamps in identifiers. Introduce
    a dedicated "scope" field and rely on that.

  2. Reintroduce the piece of code which disappeared. That is basically a worse
    version of (1): it removes the double meaning associed to identifiers stamps but
    only for existentials, and the "scope" field is not a proper field but we go and
    fetch the information in the environment

  3. Leave a bigger slot when typing a match. 1k is clearly too small for real
    code, 100k causes an other kind of issues. How about 10k? I have a good feeling
    about 10k ( https://youtu.be/8Xjr2hnOHiM?t=1m34s ).

I have started working on (1), but I am disappearing for my vacations next
Thursday, and it's unlikely that I will have a GPR opened before then. The
change is nothing too complicated but it is fairly invasive, so it takes a bit
of time.

That is: I hope (1) will be the fix chosen for 4.08, but if we want to release
4.07.1 it probably won't be ready in time, and it's also too much for a bugfix
release.

If we want to release a 4.07.1 compiler soonish, then (2) and (3) should be
considered.

(2) is "more correct", in the sense that we're back to a world where we can have
as many existential as we want. Though it will most likely have a negative
impact on typing time (because we have to do a lookup in the environment
whenever we call [update_level]).

Personnaly I think I'd go with (3): it's the simplest change, it's unlikely to
be problematic in practice, and won't incur any performance regression compared
to 4.07.0.
That's the choice we've made internally at janestreet.

@vicuna
Copy link
Author

vicuna commented Aug 4, 2018

Comment author: @hhugo

spoiler .. 10k is not enough.

@vicuna
Copy link
Author

vicuna commented Aug 29, 2018

Comment author: @trefis

In #1997 , lpw25 implemented option (2) on the 4.07 branch.

#1980 proposes an implementation of (1) on top of trunk.

@vicuna
Copy link
Author

vicuna commented Sep 21, 2018

Comment author: @trefis

#1980 has been merged.
This is now resolved.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

2 participants