You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
Original bug ID: 7835 Reporter:@trefis Assigned to:@trefis Status: resolved (set by @trefis on 2018-09-21T19:16:35Z) Resolution: fixed Priority: normal Severity: minor Category: typing Monitored by:@hhugo@Drup
Bug description
Since the introduction of GADTs, we've been reserving a slot for up to a
thousand existentials to be introduced during typing of pattern matches.
The reason for this is that scoping for type constructors is tracked using their
binding time.
If that line wasn't present, then if we did introduce new types, their binding
time would be above the current level. Which means that any use of such a type
would be out of the scope of the type constructor.
#1951 proposes to explicitely check that we
do not introduce more than a thousand type constructors, instead of letting them
through and then potentially reporting weird (i.e. incorrect) scope errors.
The hole in that reasonning (and in the GPR linked to above) is that type
constructors are not the only identifiers created during pattern matches:
pattern variables will also imply creating some idents, for example.
But pattern variables are not the only thing implying the creation of idents
though, otherwise we could almost dismiss the problem with "meh, no reasonable
hand written code will ever contain more than X many pattern variables, so it's
alright" (which I assume was the rational behind the 1K limit on existential per
match).
An example as trivial as https://github.com/trefis/existentialism/blob/master/wrong-escape/main.ml
can also get us over the limit if [Sub] contains a few too many idents. Indeed
in that example, thanks to the 1st line introducing a module alias, a
substitution on the signature of [Sub] happens during the typing of the
[M.Sub.Pack] pattern, which will generate as many fresh idents as there are
idents in [Other_module.Sub].
If you clone that repository and run "make failure OCAMLC=/path/to/trunk/ocamlc"
in the wrong-escape directory, then you'll get:
File "main.ml", line 6, characters 11-12:
Error: This expression has type $Pack_'a
but an expression was expected of type 'a
The type constructor $Pack_'a would escape its scope
The error also happens at the tip of the 4.07 branch.
This example was encountered when trying to compile janestreet's code base with
the tip of the 4.07 branch.
The reason why it wasn't a problem on 4.07.0 is that the slot happened to be
made much bigger (as was already discussed), which makes it much less likely
to create so many idents in real code.
The reason why it wasn't a problem before 4.07 is that the 1k limit was actually
just a big lie. The scope of type constructors introduced for existential
variables wasn't tracked using the ident binding time: we were looking up the
type declaration in the environment, and checking at which level it was
introduced: https://github.com/ocaml/ocaml/blob/4.00/typing/ctype.ml#L689
That is, you could have added 10k existential, and the scoping would have been
correct for each of them. https://github.com/trefis/existentialism/blob/master/over_999/main.ml is an
example of this.
However, that piece of code disappeared with #1609
There are a few possible ways to fix (or minimize the issue):
Let's remove the double meaning associed to stamps in identifiers. Introduce
a dedicated "scope" field and rely on that.
Reintroduce the piece of code which disappeared. That is basically a worse
version of (1): it removes the double meaning associed to identifiers stamps but
only for existentials, and the "scope" field is not a proper field but we go and
fetch the information in the environment
Leave a bigger slot when typing a match. 1k is clearly too small for real
code, 100k causes an other kind of issues. How about 10k? I have a good feeling
about 10k ( https://youtu.be/8Xjr2hnOHiM?t=1m34s ).
I have started working on (1), but I am disappearing for my vacations next
Thursday, and it's unlikely that I will have a GPR opened before then. The
change is nothing too complicated but it is fairly invasive, so it takes a bit
of time.
That is: I hope (1) will be the fix chosen for 4.08, but if we want to release
4.07.1 it probably won't be ready in time, and it's also too much for a bugfix
release.
If we want to release a 4.07.1 compiler soonish, then (2) and (3) should be
considered.
(2) is "more correct", in the sense that we're back to a world where we can have
as many existential as we want. Though it will most likely have a negative
impact on typing time (because we have to do a lookup in the environment
whenever we call [update_level]).
Personnaly I think I'd go with (3): it's the simplest change, it's unlikely to
be problematic in practice, and won't incur any performance regression compared
to 4.07.0.
That's the choice we've made internally at janestreet.
The text was updated successfully, but these errors were encountered:
Original bug ID: 7835
Reporter: @trefis
Assigned to: @trefis
Status: resolved (set by @trefis on 2018-09-21T19:16:35Z)
Resolution: fixed
Priority: normal
Severity: minor
Category: typing
Monitored by: @hhugo @Drup
Bug description
Since the introduction of GADTs, we've been reserving a slot for up to a
thousand existentials to be introduced during typing of pattern matches.
This is done by: https://github.com/ocaml/ocaml/blob/4.00/typing/typecore.ml#L2741
The reason for this is that scoping for type constructors is tracked using their
binding time.
If that line wasn't present, then if we did introduce new types, their binding
time would be above the current level. Which means that any use of such a type
would be out of the scope of the type constructor.
#1951 proposes to explicitely check that we
do not introduce more than a thousand type constructors, instead of letting them
through and then potentially reporting weird (i.e. incorrect) scope errors.
The hole in that reasonning (and in the GPR linked to above) is that type
constructors are not the only identifiers created during pattern matches:
pattern variables will also imply creating some idents, for example.
https://github.com/trefis/existentialism/blob/master/too_many_existentials/main.ml
illustrates that.
But pattern variables are not the only thing implying the creation of idents
though, otherwise we could almost dismiss the problem with "meh, no reasonable
hand written code will ever contain more than X many pattern variables, so it's
alright" (which I assume was the rational behind the 1K limit on existential per
match).
An example as trivial as https://github.com/trefis/existentialism/blob/master/wrong-escape/main.ml
can also get us over the limit if [Sub] contains a few too many idents. Indeed
in that example, thanks to the 1st line introducing a module alias, a
substitution on the signature of [Sub] happens during the typing of the
[M.Sub.Pack] pattern, which will generate as many fresh idents as there are
idents in [Other_module.Sub].
If you clone that repository and run "make failure OCAMLC=/path/to/trunk/ocamlc"
in the wrong-escape directory, then you'll get:
The error also happens at the tip of the 4.07 branch.
This example was encountered when trying to compile janestreet's code base with
the tip of the 4.07 branch.
The reason why it wasn't a problem on 4.07.0 is that the slot happened to be
made much bigger (as was already discussed), which makes it much less likely
to create so many idents in real code.
The reason why it wasn't a problem before 4.07 is that the 1k limit was actually
just a big lie. The scope of type constructors introduced for existential
variables wasn't tracked using the ident binding time: we were looking up the
type declaration in the environment, and checking at which level it was
introduced: https://github.com/ocaml/ocaml/blob/4.00/typing/ctype.ml#L689
That is, you could have added 10k existential, and the scoping would have been
correct for each of them.
https://github.com/trefis/existentialism/blob/master/over_999/main.ml is an
example of this.
However, that piece of code disappeared with #1609
There are a few possible ways to fix (or minimize the issue):
Let's remove the double meaning associed to stamps in identifiers. Introduce
a dedicated "scope" field and rely on that.
Reintroduce the piece of code which disappeared. That is basically a worse
version of (1): it removes the double meaning associed to identifiers stamps but
only for existentials, and the "scope" field is not a proper field but we go and
fetch the information in the environment
Leave a bigger slot when typing a match. 1k is clearly too small for real
code, 100k causes an other kind of issues. How about 10k? I have a good feeling
about 10k ( https://youtu.be/8Xjr2hnOHiM?t=1m34s ).
I have started working on (1), but I am disappearing for my vacations next
Thursday, and it's unlikely that I will have a GPR opened before then. The
change is nothing too complicated but it is fairly invasive, so it takes a bit
of time.
That is: I hope (1) will be the fix chosen for 4.08, but if we want to release
4.07.1 it probably won't be ready in time, and it's also too much for a bugfix
release.
If we want to release a 4.07.1 compiler soonish, then (2) and (3) should be
considered.
(2) is "more correct", in the sense that we're back to a world where we can have
as many existential as we want. Though it will most likely have a negative
impact on typing time (because we have to do a lookup in the environment
whenever we call [update_level]).
Personnaly I think I'd go with (3): it's the simplest change, it's unlikely to
be problematic in practice, and won't incur any performance regression compared
to 4.07.0.
That's the choice we've made internally at janestreet.
The text was updated successfully, but these errors were encountered: