Mantis Bug Tracker

ID: 0007357
Project: OCaml
Category: -Ocaml optimization
View Status: public
Date Submitted: 2016-09-14 15:40
Last Update: 2016-12-21 14:40
Reporter: hongboz
Assigned To: frisch
Priority: normal
Severity: minor
Reproducibility: have not tried
Status: resolved
Resolution: fixed
Platform:
OS:
OS Version:
Product Version:
Target Version: 4.05.0+dev
Fixed in Version: 4.05.0+dev
Summary: 0007357: significant compilation time increased after minor tweaks
Description: Previously I had one big file `whole_compiler.ml` and a dummy `whole_compiler.mli`. I have now changed `whole_compiler.ml` to get rid of the dummy interface, as below:

include (struct
(* nothing changed for the old code *)
end : sig end)

The compilation time doubled from 19s to 38s for the native backend (4.02.3).
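
For readers unfamiliar with the pattern, the following is a minimal sketch of the single-file layout described above. The contents (`log`, `helper`) are hypothetical stand-ins, not code from `whole_compiler.ml`; `log` is defined outside the wrapper only so that the effect of the hidden code can be observed. Constraining an anonymous structure to `sig end` and including it hides every definition, much as a dummy `.mli` would, while top-level side effects still run.

```ocaml
(* Hypothetical stand-in for the real file contents. *)
let log = ref []

include (struct
  (* [helper] is visible only inside this structure. *)
  let helper x = x * 2

  (* Top-level effects still execute when the module is initialised. *)
  let () = log := helper 21 :: !log
end : sig end)

(* Referring to [helper] from here on would be rejected by the type
   checker, because the signature [sig end] exports nothing. *)
```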

Tags: github

-  Notes
(0016300)
frisch (developer)
2016-09-14 16:22

This could be 0007067.

Can you check with the current trunk? (4.03 might be impacted by 0007302)

It would also be useful to report the result of -dtimings.
(0016366)
hongboz (developer)
2016-09-28 20:52

I will try again with the 4.04 beta later this week.
(0016377)
hongboz (developer)
2016-10-01 22:10
edited on: 2016-10-01 22:11

So this problem is still there with 4.04.
To reproduce, check out this branch: https://github.com/bloomberg/bucklescript/tree/mantis_7357
cd jscomp
jscomp>time ../../ocaml-wk/bin/ocamlopt.opt -dtimings -w -a -I bin ./bin/config_whole_compiler.mli ./bin/config_whole_compiler.ml ./bin/whole_compiler2.ml -o bin/bsc.exe
all: 31.103s
parsing(./bin/config_whole_compiler.mli): 0.000s
parsing(./bin/config_whole_compiler.ml): 0.000s
typing(./bin/config_whole_compiler.ml): 0.003s
transl(./bin/config_whole_compiler.ml): 0.000s
generate(./bin/config_whole_compiler.ml): 0.005s
cmm(sourcefile(./bin/config_whole_compiler.ml)): 0.000s
compile_phrases(sourcefile(./bin/config_whole_compiler.ml)): 0.002s
selection(sourcefile(./bin/config_whole_compiler.ml)): 0.000s
comballoc(sourcefile(./bin/config_whole_compiler.ml)): 0.000s
cse(sourcefile(./bin/config_whole_compiler.ml)): 0.000s
deadcode(sourcefile(./bin/config_whole_compiler.ml)): 0.000s
spill(sourcefile(./bin/config_whole_compiler.ml)): 0.001s
split(sourcefile(./bin/config_whole_compiler.ml)): 0.000s
liveness(sourcefile(./bin/config_whole_compiler.ml)): 0.000s
regalloc(sourcefile(./bin/config_whole_compiler.ml)): 0.001s
linearize(sourcefile(./bin/config_whole_compiler.ml)): 0.000s
scheduling(sourcefile(./bin/config_whole_compiler.ml)): 0.000s
emit(sourcefile(./bin/config_whole_compiler.ml)): 0.000s
assemble(sourcefile(./bin/config_whole_compiler.ml)): 0.000s
parsing(./bin/whole_compiler2.ml): 0.394s
typing(./bin/whole_compiler2.ml): 2.176s
transl(./bin/whole_compiler2.ml): 0.457s
generate(./bin/whole_compiler2.ml): 25.250s
cmm(sourcefile(./bin/whole_compiler2.ml)): 0.287s
compile_phrases(sourcefile(./bin/whole_compiler2.ml)): 23.886s
selection(sourcefile(./bin/whole_compiler2.ml)): 0.224s
comballoc(sourcefile(./bin/whole_compiler2.ml)): 0.030s
cse(sourcefile(./bin/whole_compiler2.ml)): 0.158s
deadcode(sourcefile(./bin/whole_compiler2.ml)): 0.064s
spill(sourcefile(./bin/whole_compiler2.ml)): 0.444s
split(sourcefile(./bin/whole_compiler2.ml)): 0.194s
liveness(sourcefile(./bin/whole_compiler2.ml)): 0.294s
regalloc(sourcefile(./bin/whole_compiler2.ml)): 22.123s
linearize(sourcefile(./bin/whole_compiler2.ml)): 0.036s
scheduling(sourcefile(./bin/whole_compiler2.ml)): 0.003s
emit(sourcefile(./bin/whole_compiler2.ml)): 0.187s
assemble(sourcefile(./bin/whole_compiler2.ml)): 0.001s
selection(startup): 0.002s
comballoc(startup): 0.000s
cse(startup): 0.001s
deadcode(startup): 0.000s
spill(startup): 0.002s
split(startup): 0.001s
liveness(startup): 0.001s
regalloc(startup): 0.023s
linearize(startup): 0.000s
scheduling(startup): 0.000s
emit(startup): 0.001s
assemble(startup): 0.000s

real 0m33.968s
user 0m33.161s
sys 0m0.652s
jscomp>time ../../ocaml-wk/bin/ocamlopt.opt -dtimings -w -a -I bin ./bin/config_whole_compiler.mli ./bin/config_whole_compiler.ml ./bin/whole_compiler.mli ./bin/whole_compiler.ml -o bin/bsc.exe
all: 15.094s
parsing(./bin/config_whole_compiler.mli): 0.000s
parsing(./bin/config_whole_compiler.ml): 0.000s
typing(./bin/config_whole_compiler.ml): 0.003s
transl(./bin/config_whole_compiler.ml): 0.000s
generate(./bin/config_whole_compiler.ml): 0.005s
cmm(sourcefile(./bin/config_whole_compiler.ml)): 0.000s
compile_phrases(sourcefile(./bin/config_whole_compiler.ml)): 0.002s
selection(sourcefile(./bin/config_whole_compiler.ml)): 0.000s
comballoc(sourcefile(./bin/config_whole_compiler.ml)): 0.000s
cse(sourcefile(./bin/config_whole_compiler.ml)): 0.000s
deadcode(sourcefile(./bin/config_whole_compiler.ml)): 0.000s
spill(sourcefile(./bin/config_whole_compiler.ml)): 0.001s
split(sourcefile(./bin/config_whole_compiler.ml)): 0.000s
liveness(sourcefile(./bin/config_whole_compiler.ml)): 0.000s
regalloc(sourcefile(./bin/config_whole_compiler.ml)): 0.001s
linearize(sourcefile(./bin/config_whole_compiler.ml)): 0.000s
scheduling(sourcefile(./bin/config_whole_compiler.ml)): 0.000s
emit(sourcefile(./bin/config_whole_compiler.ml)): 0.000s
assemble(sourcefile(./bin/config_whole_compiler.ml)): 0.000s
parsing(./bin/whole_compiler.mli): 0.000s
parsing(./bin/whole_compiler.ml): 0.399s
typing(./bin/whole_compiler.ml): 2.159s
transl(./bin/whole_compiler.ml): 0.534s
generate(./bin/whole_compiler.ml): 10.522s
cmm(sourcefile(./bin/whole_compiler.ml)): 0.293s
compile_phrases(sourcefile(./bin/whole_compiler.ml)): 9.206s
selection(sourcefile(./bin/whole_compiler.ml)): 0.248s
comballoc(sourcefile(./bin/whole_compiler.ml)): 0.040s
cse(sourcefile(./bin/whole_compiler.ml)): 0.150s
deadcode(sourcefile(./bin/whole_compiler.ml)): 0.120s
spill(sourcefile(./bin/whole_compiler.ml)): 0.302s
split(sourcefile(./bin/whole_compiler.ml)): 0.137s
liveness(sourcefile(./bin/whole_compiler.ml)): 0.248s
regalloc(sourcefile(./bin/whole_compiler.ml)): 7.600s
linearize(sourcefile(./bin/whole_compiler.ml)): 0.029s
scheduling(sourcefile(./bin/whole_compiler.ml)): 0.004s
emit(sourcefile(./bin/whole_compiler.ml)): 0.199s
assemble(sourcefile(./bin/whole_compiler.ml)): 0.001s
selection(startup): 0.002s
comballoc(startup): 0.000s
cse(startup): 0.002s
deadcode(startup): 0.001s
spill(startup): 0.008s
split(startup): 0.001s
liveness(startup): 0.003s
regalloc(startup): 0.022s
linearize(startup): 0.000s
scheduling(startup): 0.000s
emit(startup): 0.001s
assemble(startup): 0.000s

real 0m17.806s
user 0m17.079s
sys 0m0.614s
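
In the two logs above, `regalloc` for the large file goes from 7.600s to 22.123s, a difference of 14.523s, while `all` goes from 15.094s to 31.103s, a difference of 16.009s. A comparison like this can be automated; the sketch below is one way to do it, assuming the `name(...): N.NNNs` line shape seen in these logs. The helper names (`parse_timing`, `deltas`, `worst`) are made up for illustration and are not part of the compiler. It uses only the standard library and needs OCaml 4.08 or later for `List.filter_map` and `Option.map`.

```ocaml
(* Parse one line of -dtimings output, for example
   "regalloc(sourcefile(./bin/whole_compiler2.ml)): 22.123s",
   into (pass name, seconds). Lines of any other shape give None. *)
let parse_timing line =
  match String.index_opt line '(', String.rindex_opt line ':' with
  | Some lparen, Some colon when lparen < colon ->
    let pass = String.sub line 0 lparen in
    let rest =
      String.trim
        (String.sub line (colon + 1) (String.length line - colon - 1))
    in
    let n = String.length rest in
    if n > 1 && rest.[n - 1] = 's' then
      Option.map (fun t -> (pass, t))
        (float_of_string_opt (String.sub rest 0 (n - 1)))
    else None
  | _ -> None

(* A few lines copied from the two runs, restricted to the large file,
   because pass names repeat for every compilation unit. *)
let slow_run = [
  "spill(sourcefile(./bin/whole_compiler2.ml)): 0.444s";
  "liveness(sourcefile(./bin/whole_compiler2.ml)): 0.294s";
  "regalloc(sourcefile(./bin/whole_compiler2.ml)): 22.123s";
]

let fast_run = [
  "spill(sourcefile(./bin/whole_compiler.ml)): 0.302s";
  "liveness(sourcefile(./bin/whole_compiler.ml)): 0.248s";
  "regalloc(sourcefile(./bin/whole_compiler.ml)): 7.600s";
]

(* Per-pass slowdown in seconds: time in [slow] minus time in [fast]. *)
let deltas slow fast =
  let fast_times = List.filter_map parse_timing fast in
  List.filter_map
    (fun line ->
      match parse_timing line with
      | None -> None
      | Some (pass, t) ->
        Option.map
          (fun t' -> (pass, t -. t'))
          (List.assoc_opt pass fast_times))
    slow

(* The pass with the largest slowdown, if any pass appears in both runs. *)
let worst slow fast =
  List.fold_left
    (fun best (pass, d) ->
      match best with
      | Some (_, d') when d' >= d -> best
      | _ -> Some (pass, d))
    None (deltas slow fast)

let () =
  match worst slow_run fast_run with
  | Some (pass, d) -> Printf.printf "%s: +%.3fs\n" pass d
  | None -> print_endline "no comparable passes"
```

Matching on the pass name alone is a deliberate simplification: the file names differ between the two runs, so they cannot be part of the key.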

(0016378)
hongboz (developer)
2016-10-01 22:14

Note that the bucklescript compiler is self-contained (it works with 4.02, 4.03 and 4.04), so it might be a good example for testing how flambda behaves on such a large file (100K LOC).
(0016384)
frisch (developer)
2016-10-03 09:41
edited on: 2016-10-03 09:59

I confirm the difference between whole_compiler.ml and whole_compiler2.ml; on my machine, with a version synchronized with trunk a few weeks ago:

  - whole_compiler2.ml: 20s
  - whole_compiler.ml : 9.5s


(With the -linscan allocator from GPR#375 enabled, the difference is much smaller:

  - whole_compiler2.ml : 7.5s
  - whole_compiler.ml : 7.1s
)


Another interesting note: the timings above have been obtained on Windows using a direct binary code emitter; using the normal assembly backend (msvc 32-bit port):

  - whole_compiler2.ml: 37s
  - whole_compiler.ml : 24s

It seems Microsoft's assembler does not like such huge files...

(0016385)
frisch (developer)
2016-10-03 11:47

Proposed fix: https://github.com/ocaml/ocaml/pull/832 (compilation time becomes similar between whole_compiler.ml and whole_compiler2.ml).
(0017034)
frisch (developer)
2016-12-21 14:40

Fixed by commit 9dbcb1ec6cdf42511513d604e2fb6186f8df2ed9.

- Issue History
Date Modified Username Field Change
2016-09-14 15:40 hongboz New Issue
2016-09-14 16:22 frisch Note Added: 0016300
2016-09-28 13:51 doligez Target Version => undecided
2016-09-28 13:51 doligez Status new => feedback
2016-09-28 20:52 hongboz Note Added: 0016366
2016-09-28 20:52 hongboz Status feedback => new
2016-10-01 22:10 hongboz Note Added: 0016377
2016-10-01 22:11 hongboz Note Edited: 0016377
2016-10-01 22:14 hongboz Note Added: 0016378
2016-10-03 09:41 frisch Note Added: 0016384
2016-10-03 09:48 frisch Note Edited: 0016384
2016-10-03 09:59 frisch Note Edited: 0016384
2016-10-03 11:47 frisch Note Added: 0016385
2016-10-05 15:12 doligez Status new => confirmed
2016-10-05 15:12 doligez Target Version undecided => 4.05.0+dev
2016-10-05 15:13 doligez Tag Attached: github
2016-12-09 08:35 shinwell Category Misc => Ocaml optimization
2016-12-21 14:40 frisch Note Added: 0017034
2016-12-21 14:40 frisch Status confirmed => resolved
2016-12-21 14:40 frisch Fixed in Version => 4.05.0+dev
2016-12-21 14:40 frisch Resolution open => fixed
2016-12-21 14:40 frisch Assigned To => frisch
2017-02-23 16:42 doligez Category Ocaml optimization => -Ocaml optimization
