Mantis Bug Tracker

View Issue Details Jump to Notes ] Issue History ] Print ]
IDProjectCategoryView StatusDate SubmittedLast Update
0007835OCamltypingpublic2018-08-03 13:002018-09-21 21:16
Assigned Totrefis 
PrioritynormalSeverityminorReproducibilityhave not tried
PlatformOSOS Version
Product Version 
Target VersionFixed in Version 
Summary0007835: Incorrect (existential) type scope tracking
DescriptionSince the introduction of GADTs, we've been reserving a slot for up to a
thousand existentials to be introduced during typing of pattern matches.

This is done by: [^]

The reason for this is that scoping for type constructors is tracked using their
binding time.
If that line wasn't present, then if we did introduce new types, their binding
time would be above the current level. Which means that any use of such a type
would be out of the scope of the type constructor. [^] proposes to explicitely check that we
do not introduce more than a thousand type constructors, instead of letting them
through and then potentially reporting weird (i.e. incorrect) scope errors.

The hole in that reasonning (and in the GPR linked to above) is that type
constructors are not the only identifiers created during pattern matches:
pattern variables will also imply creating some idents, for example. [^]
illustrates that.

But pattern variables are not the only thing implying the creation of idents
though, otherwise we could almost dismiss the problem with "meh, no reasonable
hand written code will ever contain more than X many pattern variables, so it's
alright" (which I assume was the rational behind the 1K limit on existential per

An example as trivial as [^]
can also get us over the limit if [Sub] contains a few too many idents. Indeed
in that example, thanks to the 1st line introducing a module alias, a
substitution on the signature of [Sub] happens during the typing of the
[M.Sub.Pack] pattern, which will generate as many fresh idents as there are
idents in [Other_module.Sub].

If you clone that repository and run "make failure OCAMLC=/path/to/trunk/ocamlc"
in the wrong-escape directory, then you'll get:

    File "", line 6, characters 11-12:
    Error: This expression has type $Pack_'a
           but an expression was expected of type 'a
           The type constructor $Pack_'a would escape its scope

The error also happens at the tip of the 4.07 branch.

This example was encountered when trying to compile janestreet's code base with
the tip of the 4.07 branch.

The reason why it wasn't a problem on 4.07.0 is that the slot happened to be
made much bigger (as was already discussed), which makes it much less likely
to create so many idents in real code.

The reason why it wasn't a problem before 4.07 is that the 1k limit was actually
just a big lie. The scope of type constructors introduced for existential
variables wasn't tracked using the ident binding time: we were looking up the
type declaration in the environment, and checking at which level it was
introduced: [^]
That is, you could have added 10k existential, and the scoping would have been
correct for each of them. [^] is an
example of this.

However, that piece of code disappeared with [^]


There are a few possible ways to fix (or minimize the issue):

1. Let's remove the double meaning associed to stamps in identifiers. Introduce
a dedicated "scope" field and rely on that.

2. Reintroduce the piece of code which disappeared. That is basically a worse
version of (1): it removes the double meaning associed to identifiers stamps but
only for existentials, and the "scope" field is not a proper field but we go and
fetch the information in the environment

3. Leave a bigger slot when typing a match. 1k is clearly too small for real
code, 100k causes an other kind of issues. How about 10k? I have a good feeling
about 10k ( [^] ).

I have started working on (1), but I am disappearing for my vacations next
Thursday, and it's unlikely that I will have a GPR opened before then. The
change is nothing too complicated but it is fairly invasive, so it takes a bit
of time.

That is: I hope (1) will be the fix chosen for 4.08, but if we want to release
4.07.1 it probably won't be ready in time, and it's also too much for a bugfix

If we want to release a 4.07.1 compiler soonish, then (2) and (3) should be

(2) is "more correct", in the sense that we're back to a world where we can have
as many existential as we want. Though it will most likely have a negative
impact on typing time (because we have to do a lookup in the environment
whenever we call [update_level]).

Personnaly I think I'd go with (3): it's the simplest change, it's unlikely to
be problematic in practice, and won't incur any performance regression compared
to 4.07.0.
That's the choice we've made internally at janestreet.
TagsNo tags attached.
Attached Files

- Relationships

-  Notes
hhugo (reporter)
2018-08-04 20:10

spoiler .. 10k is not enough.
trefis (manager)
2018-08-29 12:00

In [^] , lpw25 implemented option (2) on the 4.07 branch. [^] proposes an implementation of (1) on top of trunk.
trefis (manager)
2018-09-21 21:16 [^] has been merged.
This is now resolved.

- Issue History
Date Modified Username Field Change
2018-08-03 13:00 trefis New Issue
2018-08-04 20:10 hhugo Note Added: 0019292
2018-08-29 12:00 trefis Note Added: 0019316
2018-08-29 12:00 trefis Assigned To => trefis
2018-08-29 12:00 trefis Status new => assigned
2018-09-21 21:16 trefis Note Added: 0019379
2018-09-21 21:16 trefis Status assigned => resolved
2018-09-21 21:16 trefis Resolution open => fixed

Copyright © 2000 - 2011 MantisBT Group
Powered by Mantis Bugtracker