New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
GADT typing regression in latest trunk vs. 4.02.3 #7016
Comments
Comment author: @garrigue Everything seems to work fine in the gadt-warnings branch, so defer this to after merging it. |
Comment author: @gasche I'm afraid it is going to be difficult to explain and justify to users the effect of equi-recursive types on GADT pattern-matching exhaustivity. In some cases, it is possible to remove the problematic case by using a fixed (non-abstract) base instantiation, but I don't know whether this is always the case. There is no debate that the compiler needs to be aware of equi-recursive when compiling pattern-matching code, but I wonder whether we couldn't differentiate for the user. Would it make sense, for example, to have two different of non-exhaustiveness warnings, one for warnings that are non-exhaustive without assuming equi-recursive types (except on objects and variants), and one for warnings that are only non-exhaustive when those are considered? |
Comment author: @garrigue There are already many cases where the compiler cannot detect exhaustiveness. Here the programmer did the natural thing, and tried to add the extra case. Specifically for recursive types, the solution could be to have a flag -norectypes that prohibits them. In the absence of -norectypes, we could as well remove the inheritance of -rectypes, since we have to deal with recursive types anyway. |
Comment author: @gasche In the format6 gadt-ization process, Benoît Vaugon and myself discovered the issue with recursive types (and in this case we couldn't change the type declaration for compatibility reasons, so we didn't look for a different, more robust encoding of the invariants), and we decided to just fail dynamically if given a combination of values that is only possible in presence of equi-recursive types. We could not have used -norectypes instead, as of course we would like -rectypes users to be able to use Printf and Scanf, and in fact they are just as unlikely as other users to use the specific "cheating" types to break the implicit GADT assumption that ('a) and ('b -> 'a) do not unify. I'm not saying that -norectypes is a bad idea (no strong opinion), but I'm afraid it would not be an option to most library writers. Well, unless we actually decide to just remove -rectypes from the language -- with -norectypes being a step towards a deprecation-and-removal process. |
Comment author: lavi Wouldn't it possible to detect (and reject) equi-recursive types like the one in #6993 ? |
Comment author: @garrigue Detecting them is hard, and it wouldn't help. |
Comment author: @gasche The commited fix of this bug, seems to be the cause of a massive slowdown of type-checking on menhir-generated table parsers (not code parsers, but table parsers, that generate code that used to be very easy for the compiler to handle). On my test case, which is a self-contained file extracted out of the parser generated by the OCaml grammar, ocamlc.opt takes 1s before this patch, 20s after This is highly problematic for my experiment of using Menhir as the parser generator for the OCaml compiler. I am attaching a self-contained program indep.ml (it contains the menhir runtime library and the generated grammar) which should allow to reproduce the performance degradation. From an OCaml development repository, I do make core -j5; make opt-core -j5; make ocamlc ocamlc.opt -j5 to get the compilers, and then time ./ocamlc.opt -I stdlib -c -i /tmp/indep.ml > /dev/null to measure type-checking time. |
Comment author: @gasche Indeed, I would be very interested in Tlink compression, given that it also considerably improves the behavior on Menhir's code backend. Thanks a lot for the quick fix! |
Comment author: @gasche Here are my performance numbers: on current trunktype-checking onlymenhir --table: 0m0.715smenhir --code: 0m50.914sbytecode compilationmenhir --table: 0m0.868smenhir --code: 1m0.705socamlbuild compile: 0m11.142sBefore Jacques' fix (but after the regression)type-checking onlymenhir --table: 0m16.565smenhir --code: too longbytecode compilationmenhir --table: 0m16.718smenhir --code: too longocamlbuild compile: 0m11.168sBefore Jacques' fix, before the regressiontype-checking onlymenhir --table: 0m0.754smenhir --code: 1m37.776sbytecode compilationmenhir --table: 0m0.898smenhir --code: 1m38.023socamlbuild compile: 0m11.139s |
Comment author: @gasche Numbers indicate that adding path compression improves over trunk-before-the-regression for menhir, and makes no difference for a generic OCaml compilation time (compiling the ocamlbuild sources with ocamlc.opt, 10 times, done five times with only the best time shown). |
Comment author: @garrigue If you're working with trunk, this cannot be related to path compression, which is only in #294. The only change other than avoiding putting the Tlink's in the visited list, is that I replaced the visited list by a set. The idea is that this would be more robust for large types, and there didn't seem to be any overhead for small ones. This may explain what you observe for menhir --code, if you are indeed compiling with trunk. However, your numbers for menhir --code before the regression are a bit strange: the difference between type checking and bytecode compilation seems too small. |
Original bug ID: 7016
Reporter: mandrykin
Assigned to: @garrigue
Status: closed (set by @xavierleroy on 2017-02-16T14:16:48Z)
Resolution: fixed
Priority: normal
Severity: minor
Platform: x86_64
OS: Linux 3.19.0-30
OS Version: Ubuntu 15.04
Fixed in version: 4.03.0+dev / +beta1
Category: typing
Related to: #6993
Monitored by: @gasche @hcarty
Bug description
In the latest trunk version (rev. 16450, Git commit 12f6a53) some functions operating on GADTs are typed with spurious warnings or cause the compiler to fail with stack overflow. However, they are accepted without warnings in 4.02.3:
type (_, _) t =
| Nil : ('tl, 'tl) t
| Cons : 'a * ('b, 'tl) t -> ('a * 'b, 'tl) t
let get1 (Cons (x, ) : ( * 'a, 'a) t) = x
(* 4.02.3: <> )
(
4.03.0+trunk:
File "stack_overflow_typing.ml", line 5, characters 9-43:
Warning 8: this pattern-matching is not exhaustive.
Here is an example of a value that is not matched:
Nil
)
let get1' = function (Cons (x, ) : ( * 'a, 'a) t) -> x | Nil -> assert false
(
4.02.3:
File "stack_overflow_typing.ml", line 6, characters 59-62:
Error: This pattern matches values of type ('b * 'a, 'b * 'a) t
but a pattern was expected which matches values of type
('b * 'a, 'a) t
The type variable 'a occurs inside 'b * 'a
)
(
4.03.0+trunk:
Fatal error: exception Stack overflow
*)
File attachments
The text was updated successfully, but these errors were encountered: