|Anonymous | Login | Signup for a new account||2017-02-23 01:15 CET|
|Main | My View | View Issues | Change Log | Roadmap|
|View Issue Details|
|ID||Project||Category||View Status||Date Submitted||Last Update|
|0006529||OCaml||OCaml general||public||2014-08-30 17:42||2016-12-07 11:34|
|Product Version||4.02.0+beta1 / +rc1|
|Target Version||Fixed in Version||4.02.1+dev|
|Summary||0006529: 4.02.0 compiles modules twice slower than 4.01|
|Steps To Reproduce||Just test with any OCaml program.|
|Tags||No tags attached.|
Smoking gun (shown by "perf" profiling): significant time is spent in List.assoc (none in 4.01) and (consequently) significantly more time in polymorphic comparison.
Fixed quadratic-time code in Consistbl.extract (commit 15169 in 4.02, 15170 on trunk). Compilation time for Why3 is down to 11.0s user time, from 15.6s in 4.02 and 8.6s in 4.01. The remaining 1.4s are unaccounted for (yet).
The fix to `Consistbl.extract` is a good change and reasonable solution to the problem. However, I think that the "root cause" of the bug is elsewhere.
The quadratic code was already around most uses of `Consistbl.extract`, but it was not around the one in `Env.imports`. This suggests that `Env.imported_units` is growing very large. So the cause of the bug is probably `Env.add_imports` which should be checking that it does not add duplicates to that list.
When I wrote `Env.add_imports` I think I thought that `ps_crcs_checked` would prevent duplicates. However, this obviously only works for directly imported modules; indirectly imported modules will still be duplicated. Worse, the number of (duplicated) indirect imports is probably quadratic in the number of units in a project. Combined with the quadratic code in `Consistbl.extract` this seems to have lead to some dramatic slowdowns.
edited on: 2014-09-01 10:56
Following Leo's advice, I've committed a change to Env (15171 on trunk, 15172 on 4.02) to (i) avoid duplicates in imported_units (using a StringSet instead of a list) and (ii) avoid repeated consistency checks for the same unit (adding a bool ref to the cache of loaded persistent units to remember which ones have already been checked).
Compiling typing/typecore.ml now takes 0.52s (0.76s before the change) with ocamlopt.opt and 2.24s (3.52s before) with ocamlopt.
Since we're discussing performances of the compiler, it's also worth noting that tweaking GC parameters can have a huge impact on them. For instance, compiling typing/typecore.ml with ocamlopt.opt takes only 0.35s (instead of 0.52s) with OCAMLRUNPARAM=s=64M.
Great ! Here is the new results on why3:
* 4.01: ~38s
* 4.02: ~77s
* 4.02 dev: ~41s
and for js_of_ocaml:
* 4.01: ~42s
* 4.02: ~73s
* 4.02 dev: ~47s
Thanks a lot ! :)
There is still a little overhead but I don't know if it is normal or not.
Thanks for the retest. The approx 10% overhead that remains can be caused by many different things (e.g. ocamlopt 4.02 performs two additional optimization passes compared with 4.01) and I don't think it is worth deeper investigation. Marking this PR as resolved.
|Just for information, when do you think the 4.02.1 can be released ?|
> (ii) avoid repeated consistency checks for the same unit (adding a bool ref to the cache of loaded persistent units to remember which ones have already been checked).
This is what the `ps_crcs_checked` field of `pers_struct` was supposed to do. However, it seems to have been removed. I think that a field of `pers_struct` is a bit cleaner than a separate bool ref, so perhaps `ps_crcs_checked` should be resurrected.
|Thanks Leo, I've moved the flag to the pers_struct record.|
|2014-08-30 17:42||jpdeplaix||New Issue|
|2014-08-31 15:28||xleroy||Note Added: 0012061|
|2014-08-31 15:28||xleroy||Status||new => acknowledged|
|2014-08-31 16:47||xleroy||Note Added: 0012062|
|2014-08-31 18:45||lpw25||Note Added: 0012063|
|2014-09-01 10:41||frisch||Note Added: 0012064|
|2014-09-01 10:56||frisch||Note Edited: 0012064||View Revisions|
|2014-09-01 11:38||jpdeplaix||Note Added: 0012065|
|2014-09-01 11:46||xleroy||Note Added: 0012066|
|2014-09-01 11:46||xleroy||Status||acknowledged => resolved|
|2014-09-01 11:46||xleroy||Resolution||open => fixed|
|2014-09-01 11:46||xleroy||Fixed in Version||=> 4.02.1+dev|
|2014-09-01 12:29||jpdeplaix||Note Added: 0012067|
|2014-09-01 14:13||lpw25||Note Added: 0012068|
|2014-09-01 14:31||frisch||Note Added: 0012069|
|2016-12-07 11:34||xleroy||Status||resolved => closed|
|Copyright © 2000 - 2011 MantisBT Group|