|Anonymous | Login | Signup for a new account||2014-08-01 16:06 CEST|
|Main | My View | View Issues | Change Log | Roadmap|
|View Issue Details|
|ID||Project||Category||View Status||Date Submitted||Last Update|
|0005809||OCaml||OCaml general||public||2012-11-05 17:40||2012-11-08 14:53|
|Priority||normal||Severity||minor||Reproducibility||have not tried|
|Target Version||Fixed in Version||4.01.0+dev|
|Summary||0005809: Generating .cmt files takes a long time, in case of type error|
|Description||When a unit does not type-check, a .cmt file is generated, with one entry per valid sub-node of the typed AST.|
I've observed a case where compiling a (quite large, but hand-written) module takes a lot of time in case of type error: about 28 seconds, mostly to generate a 8.8 Mb .cmt file. By setting OCAML_BINANNOT_WITHENV, we ask the compiler not to remove environment for AST node; interestingly, the whole compilation now takes only 0.81 seconds and generates a 2.7 Mb .cmt file (i.e. smaller than the original one, even though environments are not discarded!). Also, I've tried to remove the hash consing in Cmt_format.keep_only_summary. As expected, the size of the .cmt file increases (to 14 Mb) and compilation takes 1.19 seconds.
|Tags||No tags attached.|
edited on: 2012-11-05 17:49
I've also tried replacing hash consing with the following simpler strategy, which avoids repeated successive calls to Env.keep_only_summary on identical environment (compared with physical equality):
=================================================================== --- typing/cmt_format.ml (revision 50428) +++ typing/cmt_format.ml (working copy) @@ -76,17 +76,19 @@ try ignore (Sys.getenv "OCAML_BINANNOT_WITHENV"); false with Not_found -> true -(* Re-introduce sharing after clearing environments *) +let last_env = ref None + let env_hcons = Hashtbl.create 133 let keep_only_summary env = - let new_env = Env.keep_only_summary env in - try - Hashtbl.find env_hcons new_env - with Not_found -> - Hashtbl.add env_hcons new_env new_env; - new_env -let clear_env_hcons () = Hashtbl.clear env_hcons + match !last_env with + | Some (e0, e1) when e0 == env -> e1 + | _ -> + let new_env = Env.keep_only_summary env in + last_env := Some (env, new_env); + new_env +let clear_env_hcons () = last_env := None + module ClearEnv = TypedtreeMap.MakeMap (struct open TypedtreeMap include DefaultMapArgument ===================================================================
This gives a 8.9 Mb .cmt file, generated in 1.16 seconds.
I'm still confused as to why setting OCAML_BINANNOT_WITHENV generates a much smaller .cmt file.
The explanation for the increase of size of the .cmt file is actually quite simple: in "error mode", the cmt file lists all type-checked nodes; mapping over them to simplify the environment loses all the sharing between them. I can see two simple work-arounds:
- Do not export the children of an exported node. (When a node is to be added to the list of exported nodes, this list is first back-tracked to its content before the node was considered.)
- In error mode, never discard the environment.
- Keep sharing when mapping the AST to simplify the environment. (Likely to reduce the size of .cmt file but increase even more the compilation time.)
One should also consider displaying the error message before creating the .cmt file.
A concrete reproduction case with typing/typecore.ml from the OCaml compiler, compiled with ocamlc:
0.54s without -bin-annot
0.98s with -bin-annot -> 1.1 Mb .cmt file
0.54s with -bin-annot and OCAML_BINANNOT_WITHENV=1 -> 1.4 Mb .cmt file
0.57s with -bin-annot and the "simpler hash-consing" patch -> 1.1 Mb
Now, add a type error at the end of the file:
0.37s without -bin-annot
7.82s with -bin-annot -> 4.6 Mb .cmt file
0.39s with -bin-annot and OCAML_BINANNOT_WITHENV=1 -> 1.5 Mb .cmt file
0.54s with -bin-annot and the "simpler hash-consing" patch -> 4.8 Mb
What about always keeping the environment in the cmt file? This is much faster, and does not seem to produce huge cmt files.
If we insist on discarding the environment, I'd suggest to adopt a more efficient way to implement that, maybe with my "simpler hash-consing" patch (less sharing, slight increase of size of .cmt files, but more more efficient) and/or by making the env fields
in nodes mutable (so that discarding the environment can be done in place without loosing sharing between nodes in the error case),
or even by making the fields of the env record themselves mutable (so that all sharing can be kept without a costly hash-table).
|Keeping environments would also solve 0005814.|
|Commit 13078: remove hash-consing of environments, replaced by a much cheaper one-slot memoization.|
|Commit 13080: do not keep both an expression and its sub-expressions in a partial .cmt file (in case of type error).|
edited on: 2012-11-08 14:49
Commit 13081: do not store a structure item together with sub-components in partial cmt files.
New timings for compiling typing/typecore.ml with ocamlc (not ocamlc.opt):
0.54s without -bin-annot
0.56s with -bin-annot -> 1162282 bytes for the .cmt file
With a type error inserted at the end of the module:
0.35s without -bin-annot
0.36s with -bin-annot -> 1170623 bytes for the .cmt file
|2012-11-05 17:40||frisch||New Issue|
|2012-11-05 17:48||frisch||Note Added: 0008422|
|2012-11-05 17:49||frisch||Note Edited: 0008422||View Revisions|
|2012-11-06 09:31||frisch||Note Added: 0008424|
|2012-11-06 10:04||frisch||Note Added: 0008425|
|2012-11-06 18:22||frisch||Note Added: 0008434|
|2012-11-06 18:22||frisch||Relationship added||related to 0005814|
|2012-11-08 11:34||frisch||Note Added: 0008445|
|2012-11-08 11:55||frisch||Note Added: 0008446|
|2012-11-08 14:24||frisch||Note Added: 0008447|
|2012-11-08 14:49||frisch||Note Deleted: 0008447|
|2012-11-08 14:49||frisch||Note Added: 0008448|
|2012-11-08 14:49||frisch||Note Edited: 0008448||View Revisions|
|2012-11-08 14:53||frisch||Note Added: 0008449|
|2012-11-08 14:53||frisch||Status||new => resolved|
|2012-11-08 14:53||frisch||Fixed in Version||=> 4.01.0+dev|
|2012-11-08 14:53||frisch||Resolution||open => fixed|
|2012-11-08 14:53||frisch||Assigned To||=> frisch|
|Copyright © 2000 - 2011 MantisBT Group|