Generating .cmt files takes a long time, in case of type error #5809
Comments
Comment author: @alainfrisch I've also tried replacing hash consing with the following simpler strategy, which avoids repeated successive calls to Env.keep_only_summary on the same environment (compared with physical equality):
===================================================================
--- typing/cmt_format.ml (revision 50428)
+++ typing/cmt_format.ml (working copy)
@@ -76,17 +76,19 @@
   try ignore (Sys.getenv "OCAML_BINANNOT_WITHENV"); false
   with Not_found -> true

-(* Re-introduce sharing after clearing environments *)
+let last_env = ref None
+
 let env_hcons = Hashtbl.create 133
 let keep_only_summary env =
-  let new_env = Env.keep_only_summary env in
-  try
-    Hashtbl.find env_hcons new_env
-  with Not_found ->
-    Hashtbl.add env_hcons new_env new_env;
-    new_env
-let clear_env_hcons () = Hashtbl.clear env_hcons
+  match !last_env with
+  | Some (e0, e1) when e0 == env -> e1
+  | _ ->
+      let new_env = Env.keep_only_summary env in
+      last_env := Some (env, new_env);
+      new_env
+let clear_env_hcons () = last_env := None
+

 module ClearEnv = TypedtreeMap.MakeMap (struct
   open TypedtreeMap
   include DefaultMapArgument
===================================================================
This gives an 8.9 Mb .cmt file, generated in 1.16 seconds. I'm still confused as to why setting OCAML_BINANNOT_WITHENV generates a much smaller .cmt file.
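For readers who want to try the one-slot strategy outside the compiler, here is a minimal standalone sketch. The `env` record, `expensive_summary`, and the `calls` counter are hypothetical stand-ins (the real code calls Env.keep_only_summary on the compiler's own environment type); only the shape of the cache follows the patch.

```ocaml
(* Hypothetical stand-in for the compiler's environment type. *)
type env = { summary : string; values : int list }

(* Counts how often the "expensive" function actually runs. *)
let calls = ref 0

(* Hypothetical stand-in for Env.keep_only_summary. *)
let expensive_summary (e : env) : env =
  incr calls;
  { e with values = [] }

(* One-slot cache: remembers only the last (input, output) pair. *)
let last_env : (env * env) option ref = ref None

let keep_only_summary env =
  match !last_env with
  | Some (e0, e1) when e0 == env -> e1   (* same physical input: reuse *)
  | _ ->
      let new_env = expensive_summary env in
      last_env := Some (env, new_env);
      new_env

let clear_cache () = last_env := None

let () =
  let a = { summary = "a"; values = [1; 2; 3] } in
  let b = { summary = "b"; values = [4] } in
  let r1 = keep_only_summary a in
  let r2 = keep_only_summary a in      (* cache hit *)
  let _ = keep_only_summary b in       (* evicts a *)
  let _ = keep_only_summary a in       (* recomputed *)
  Printf.printf "calls=%d shared=%b\n" !calls (r1 == r2)
```

The cache only helps when identical environments arrive back to back, which is the access pattern the comment describes. Unlike the hash table, it never hashes or structurally compares an environment, at the cost of less sharing overall.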
Comment author: @alainfrisch The explanation for the increase of size of the .cmt file is actually quite simple: in "error mode", the cmt file lists all type-checked nodes; mapping over them to simplify the environment loses all the sharing between them. I can see two simple work-arounds:
One should also consider displaying the error message before creating the .cmt file.
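The effect of lost sharing on file size can be reproduced with the standard Marshal module, which preserves sharing by default. The sketch below is an illustration only: the `env` and `node` types are hypothetical, and `copy_env` merely imitates a map that rebuilds every environment without re-sharing the results.

```ocaml
(* Hypothetical miniature of "many nodes pointing at one environment". *)
type env = { names : string list }
type node = { id : int; env : env }

let big_env = { names = List.init 1000 string_of_int }

(* All 100 nodes point at the same physical environment. *)
let shared_nodes = List.init 100 (fun id -> { id; env = big_env })

(* Rebuilds the list spine: the result is structurally equal to its
   input but physically distinct, like a map that ignores sharing. *)
let copy_env e = { names = List.map (fun s -> s) e.names }

(* Mapping over every node gives each one a private copy. *)
let unshared_nodes =
  List.map (fun n -> { n with env = copy_env n.env }) shared_nodes

(* Marshal writes each shared block once, so lost sharing costs bytes. *)
let marshaled_size v = String.length (Marshal.to_string v [])

let () =
  Printf.printf "shared: %d bytes, unshared: %d bytes\n"
    (marshaled_size shared_nodes)
    (marshaled_size unshared_nodes)
```

The two lists are structurally equal, yet the second one serializes to a much larger blob because every node now carries its own copy of the environment.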
Comment author: @alainfrisch A concrete reproduction case with typing/typecore.ml from the OCaml compiler, compiled with ocamlc:
0.54s without -bin-annot
Now, add a type error at the end of the file:
0.37s without -bin-annot
What about always keeping the environment in the cmt file? This is much faster, and does not seem to produce huge cmt files. If we insist on discarding the environment, I'd suggest adopting a more efficient way to implement that, maybe with my "simpler hash-consing" patch (less sharing, slight increase in the size of .cmt files, but more efficient) and/or by making the env fields
Comment author: @alainfrisch Keeping environments would also solve #5814.
Comment author: @alainfrisch Commit 13078: remove hash-consing of environments, replaced by a much cheaper one-slot memoization.
Comment author: @alainfrisch Commit 13080: do not keep both an expression and its sub-expressions in a partial .cmt file (in case of type error).
Comment author: @alainfrisch Commit 13081: do not store a structure item together with sub-components in partial cmt files.
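The idea behind commits 13080 and 13081 can be pictured as follows. This is not the code from those commits, just a rough sketch under assumed names: a toy `expr` type, a list of saved nodes, and a hypothetical `keep_roots` filter that drops any saved node already reachable from another saved node.

```ocaml
(* Toy expression type; the real typed AST is far richer. *)
type expr = Const of int | Add of expr * expr

(* All strict sub-expressions of [e]. *)
let rec sub_exprs e =
  match e with
  | Const _ -> []
  | Add (a, b) -> a :: b :: (sub_exprs a @ sub_exprs b)

(* Keep only nodes that are not (physically) a sub-expression of
   another saved node. Quadratic, which is fine for a sketch. *)
let keep_roots saved =
  let covered = List.concat (List.map sub_exprs saved) in
  List.filter (fun e -> not (List.memq e covered)) saved

let () =
  let a = Const 1 in
  let b = Const 2 in
  let sum = Add (a, b) in
  (* A type checker in "error mode" might have recorded all three. *)
  let kept = keep_roots [a; b; sum] in
  Printf.printf "saved 3 nodes, kept %d\n" (List.length kept)
```

Since `sum` already contains `a` and `b`, storing only `sum` loses no information while avoiding duplicate entries in the partial file.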
Comment author: @alainfrisch New timings for compiling typing/typecore.ml with ocamlc (not ocamlc.opt):
0.54s without -bin-annot
With a type error inserted at the end of the module:
0.35s without -bin-annot
Original bug ID: 5809
Reporter: @alainfrisch
Assigned to: @alainfrisch
Status: closed (set by @xavierleroy on 2015-12-11T18:08:22Z)
Resolution: fixed
Priority: normal
Severity: minor
Fixed in version: 4.01.0+dev
Category: ~DO NOT USE (was: OCaml general)
Related to: #5814
Monitored by: @jmeber
Bug description
When a unit does not type-check, a .cmt file is generated, with one entry per valid sub-node of the typed AST.
I've observed a case where compiling a (quite large, but hand-written) module takes a lot of time in case of type error: about 28 seconds, mostly to generate an 8.8 Mb .cmt file. By setting OCAML_BINANNOT_WITHENV, we ask the compiler not to strip environments from AST nodes; interestingly, the whole compilation then takes only 0.81 seconds and generates a 2.7 Mb .cmt file (i.e. smaller than the original one, even though environments are not discarded!). I've also tried removing the hash consing in Cmt_format.keep_only_summary. As expected, the size of the .cmt file increases (to 14 Mb), and compilation takes 1.19 seconds.