Mantis Bug Tracker

View Issue Details Jump to Notes ] Issue History ] Print ]
IDProjectCategoryView StatusDate SubmittedLast Update
0006529OCamlOCaml generalpublic2014-08-30 17:422014-09-01 14:31
Reporterjpdeplaix 
Assigned To 
PriorityhighSeveritymajorReproducibilityalways
StatusresolvedResolutionfixed 
PlatformOSOS Version
Product Version4.02.0+beta1 / +rc1 
Target VersionFixed in Version4.02.1+dev 
Summary0006529: 4.02.0 compiles modules twice slower than 4.01
Descriptionhttps://github.com/ocaml/opam-repository/pull/2547 [^]
Steps To ReproduceJust test with any OCaml program.
TagsNo tags attached.
Attached Files

- Relationships

-  Notes
(0012061)
xleroy (administrator)
2014-08-31 15:28

Smoking gun (shown by "perf" profiling): significant time is spent in List.assoc (none in 4.01) and (consequently) significantly more time in polymorphic comparison.
(0012062)
xleroy (administrator)
2014-08-31 16:47

Fixed quadratic-time code in Consistbl.extract (commit 15169 in 4.02, 15170 on trunk). Compilation time for Why3 is down to 11.0s user time, from 15.6s in 4.02 and 8.6s in 4.01. The remaining 1.4s are unaccounted for (yet).
(0012063)
lpw25 (developer)
2014-08-31 18:45

The fix to `Consistbl.extract` is a good change and reasonable solution to the problem. However, I think that the "root cause" of the bug is elsewhere.

The quadratic code was already around most uses of `Consistbl.extract`, but it was not around the one in `Env.imports`. This suggests that `Env.imported_units` is growing very large. So the cause of the bug is probably `Env.add_imports` which should be checking that it does not add duplicates to that list.

When I wrote `Env.add_imports` I think I thought that `ps_crcs_checked` would prevent duplicates. However, this obviously only works for directly imported modules; indirectly imported modules will still be duplicated. Worse, the number of (duplicated) indirect imports is probably quadratic in the number of units in a project. Combined with the quadratic code in `Consistbl.extract` this seems to have lead to some dramatic slowdowns.
(0012064)
frisch (developer)
2014-09-01 10:41
edited on: 2014-09-01 10:56

Following Leo's advice, I've committed a change to Env (15171 on trunk, 15172 on 4.02) to (i) avoid duplicates in imported_units (using a StringSet instead of a list) and (ii) avoid repeated consistency checks for the same unit (adding a bool ref to the cache of loaded persistent units to remember which ones have already been checked).

Compiling typing/typecore.ml now takes 0.52s (0.76s before the change) with ocamlopt.opt and 2.24s (3.52s before) with ocamlopt.

Since we're discussing performances of the compiler, it's also worth noting that tweaking GC parameters can have a huge impact on them. For instance, compiling typing/typecore.ml with ocamlopt.opt takes only 0.35s (instead of 0.52s) with OCAMLRUNPARAM=s=64M.

(0012065)
jpdeplaix (reporter)
2014-09-01 11:38

Great ! Here is the new results on why3:
 * 4.01: ~38s
 * 4.02: ~77s
 * 4.02 dev: ~41s

and for js_of_ocaml:
 * 4.01: ~42s
 * 4.02: ~73s
 * 4.02 dev: ~47s

Thanks a lot ! :)
There is still a little overhead but I don't know if it is normal or not.
(0012066)
xleroy (administrator)
2014-09-01 11:46

Thanks for the retest. The approx 10% overhead that remains can be caused by many different things (e.g. ocamlopt 4.02 performs two additional optimization passes compared with 4.01) and I don't think it is worth deeper investigation. Marking this PR as resolved.
(0012067)
jpdeplaix (reporter)
2014-09-01 12:29

Just for information, when do you think the 4.02.1 can be released ?
(0012068)
lpw25 (developer)
2014-09-01 14:13

> (ii) avoid repeated consistency checks for the same unit (adding a bool ref to the cache of loaded persistent units to remember which ones have already been checked).

This is what the `ps_crcs_checked` field of `pers_struct` was supposed to do. However, it seems to have been removed. I think that a field of `pers_struct` is a bit cleaner than a separate bool ref, so perhaps `ps_crcs_checked` should be resurrected.
(0012069)
frisch (developer)
2014-09-01 14:31

Thanks Leo, I've moved the flag to the pers_struct record.

- Issue History
Date Modified Username Field Change
2014-08-30 17:42 jpdeplaix New Issue
2014-08-31 15:28 xleroy Note Added: 0012061
2014-08-31 15:28 xleroy Status new => acknowledged
2014-08-31 16:47 xleroy Note Added: 0012062
2014-08-31 18:45 lpw25 Note Added: 0012063
2014-09-01 10:41 frisch Note Added: 0012064
2014-09-01 10:56 frisch Note Edited: 0012064 View Revisions
2014-09-01 11:38 jpdeplaix Note Added: 0012065
2014-09-01 11:46 xleroy Note Added: 0012066
2014-09-01 11:46 xleroy Status acknowledged => resolved
2014-09-01 11:46 xleroy Resolution open => fixed
2014-09-01 11:46 xleroy Fixed in Version => 4.02.1+dev
2014-09-01 12:29 jpdeplaix Note Added: 0012067
2014-09-01 14:13 lpw25 Note Added: 0012068
2014-09-01 14:31 frisch Note Added: 0012069


Copyright © 2000 - 2011 MantisBT Group
Powered by Mantis Bugtracker