The compilation really takes time in some cases with ocaml trunk (4.02) #6350

Closed
vicuna opened this issue Mar 20, 2014 · 18 comments

vicuna commented Mar 20, 2014

Original bug ID: 6350
Reporter: jpdeplaix
Assigned to: @garrigue
Status: closed (set by @xavierleroy on 2017-02-16T14:16:14Z)
Resolution: fixed
Priority: normal
Severity: minor
Fixed in version: 4.02.0+dev
Category: typing
Tags: compiler-time-or-space-regression

Bug description

With the following test case, SemExtra takes a considerable amount of time to compile (more than 40 seconds on my machine).

The problem seems to be in the type checker, because the parsetree is displayed fine with the -dparsetree option, but the typed tree is not with -dtypedtree.

Note also that if we remove the type annotation on line 165, compilation takes about half the time (about 23 seconds).

Steps to reproduce

Sorry, but Luc and I failed to reproduce it in a small test case.
So our case is this file (and every module that includes SemExtra.Make, applied): https://github.com/herd/herdtools/blob/master/herd/semExtra.ml

If you want to test it, you can clone the repo and run "make -C herd" (no dependencies needed).

Additional information

Of course, in 4.01, the compilation was instant.

vicuna commented Mar 20, 2014

Comment author: @alainfrisch

The obvious candidate is the introduction of module aliases. Since it's easier for you to test it, can you check whether the problem started to materialize at revision 14394?

vicuna commented Mar 20, 2014

Comment author: jpdeplaix

I just checked with the revision just before 14394 and the problem doesn't appear.
I'm now checking the revision just after.

vicuna commented Mar 20, 2014

Comment author: jpdeplaix

Yes, it's definitely that. We had thought of that, but we didn't manage to reproduce it (even though Luc tried his best to play with module aliases).

vicuna commented Mar 20, 2014

Comment author: @alainfrisch

Thanks for the confirmation! I've thus assigned the ticket to Jacques.

vicuna commented Mar 21, 2014

Comment author: @garrigue

That's a rather mysterious problem.
A more careful analysis shows that the problem occurs not during typing, but after
(compiling with both -i and -dtypedtree is fast, but -drawlambda is slow).
However, the generated lambda code is almost identical up to renumbering of identifiers.
So this has to be happening during the conversion from typedtree to rawlambda.

Still investigating.

vicuna commented Mar 21, 2014

Comment author: jpdeplaix

Weird. For me, with -i it takes around 20 seconds and for -dtypedtree and -drawlambda it takes around 50 seconds.

vicuna commented Mar 21, 2014

Comment author: @garrigue

Sorry. I was wrong.
The slowdown is narrowed down to Includecore, but I haven't found the cause yet.

vicuna commented Mar 21, 2014

Comment author: @garrigue

Sorry again, it's not Includecore, but Includemod.signatures.

vicuna commented Mar 22, 2014

Comment author: @garrigue

Localized the problem to Ctype.moregen.
It has exactly the same number of calls (including recursive calls),
but is 80 times slower... very strange.

vicuna commented Mar 22, 2014

Comment author: @garrigue

The cause was repeated substitution in the functor application case of Env.find_module.
Solved it by adding a cache, as was already the case for components_of_functor_appl.

Fixed in trunk, revision 14482.
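
For readers unfamiliar with this kind of fix, here is a minimal sketch of the general idea (caching the result of an expensive substitution so that repeated lookups of the same functor application do the work only once). It is not the actual code of Env.find_module: the names path, expensive_subst, find_appl and the string-based representation are assumptions made up for illustration.

```ocaml
(* Hypothetical stand-in for the compiler's path type. *)
type path = string

(* Counts how often the expensive step runs, for illustration only. *)
let substitutions = ref 0

(* Hypothetical stand-in for substituting the functor argument
   into the functor result. *)
let expensive_subst (funct : path) (arg : path) : string =
  incr substitutions;
  Printf.sprintf "%s(%s)" funct arg

(* Cache keyed by the (functor, argument) pair. *)
let cache : (path * path, string) Hashtbl.t = Hashtbl.create 16

(* Look up a functor application, substituting only on a cache miss. *)
let find_appl (funct : path) (arg : path) : string =
  match Hashtbl.find_opt cache (funct, arg) with
  | Some result -> result
  | None ->
      let result = expensive_subst funct arg in
      Hashtbl.add cache (funct, arg) result;
      result

let () =
  for _ = 1 to 1000 do
    ignore (find_appl "SemExtra.Make" "C")
  done;
  Printf.printf "substitutions performed: %d\n" !substitutions
```

In this sketch the 1000 lookups trigger a single substitution; without the cache each lookup would redo it, which matches the symptom of the same number of calls being much slower.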

vicuna commented Mar 22, 2014

Comment author: jpdeplaix

Thanks! Note also that the overall compilation time increased a little (116s => 141s).

vicuna commented Mar 23, 2014

Comment author: @garrigue

I see.
When called from normalize_path, there is no need to do extra work when the functor result is not an alias.
The overhead can be recovered this way.

Fixed at revision 14483.

vicuna commented Mar 23, 2014

Comment author: @garrigue

Sorry, I was wrong again.
Probably an erroneous measurement.
I can still see a 10% slowdown compared to 14393, both on semExtra.ml and overall.
However, the cause is not normalize_path/find_module.
So it's going to be hard to track it down.

Note that on very big projects using lots of module aliases, like Core, we actually see
a dramatic speedup.
Would have to look at other programs too.

vicuna commented Mar 23, 2014

Comment author: jpdeplaix

Thanks. I just tried, and on 4.02 (with your patch) the compilation takes:

  • 136s
  • 127s with ocamlbuild -no-ocamlfind, compared with 117s on 4.01.

So we see a 10s slowdown (which is not really big) when we compare the program built the same way.

vicuna commented Mar 24, 2014

Comment author: @alainfrisch

Any reason to believe that the slowdown is related to type-checking? There have been changes in ocamlopt, which could explain a tiny compile-time overhead (but 8.5% seems a lot).

vicuna commented Mar 25, 2014

Comment author: @garrigue

I've done some further measurements, comparing 14393 with a fixed version of 14394 and the current trunk.
The conclusion seems to be that for semExtra.cmo there is indeed a small slowdown (less than 10%) due to the introduction of normalize_path, but for the whole compilation of herdtools, this only accounts for 2% of the slowdown, the rest (8%) coming from more recent changes, probably in the backend.
I also tried with the lablgtk trunk, and I get no slowdown at all from the introduction of module aliases,
and actually a speedup from more recent changes (maybe due to the fact I'm using ocamlopt.opt).
From that I feel there is not much need to investigate further.

Revision          make (in herdtools)   semExtra.cmo
4.01              74s                   1.67s
14393             77s                   1.67s
14394+optalias    79s                   1.80s
14482             85s                   1.80s
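
As a quick cross-check of the percentages quoted above, the relative slowdowns can be recomputed from these timings. This is only a small arithmetic sketch; the helper name slowdown_pct is made up for illustration.

```ocaml
(* Relative slowdown of [after] with respect to [before], in percent. *)
let slowdown_pct ~before ~after = (after -. before) /. before *. 100.

let () =
  (* Whole herdtools build: module aliases (14393 -> 14394+optalias). *)
  Printf.printf "make, aliases:       %.1f%%\n"
    (slowdown_pct ~before:77. ~after:79.);
  (* Whole herdtools build: later changes (14394+optalias -> 14482). *)
  Printf.printf "make, later changes: %.1f%%\n"
    (slowdown_pct ~before:79. ~after:85.);
  (* semExtra.cmo alone (14393 -> 14394+optalias). *)
  Printf.printf "semExtra.cmo:        %.1f%%\n"
    (slowdown_pct ~before:1.67 ~after:1.80)
```

This gives roughly 2.6% and 7.6% for the two steps of the full build, and about 7.8% for semExtra.cmo, consistent with the rounded figures of 2%, 8% and "less than 10%".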

vicuna commented Mar 25, 2014

Comment author: @alainfrisch

Thanks for the analysis!

I'm wondering whether we should put in place some more automated way to detect performance regression of the compilers, maybe using allocation as a stable proxy for speed. One way to do it would be to measure total allocation e.g. when compiling each unit to produce ocamlc.opt, storing this value in a text file which will be compared with a "reference" (modulo some admissible error) during one of the tests in testsuite/.
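
To make the idea concrete, here is a minimal sketch of such a check, assuming allocation is measured with Gc.allocated_bytes from the standard library. The helper names (allocated_by, within_tolerance), the 5% tolerance and the list-building workload are hypothetical; a real test would measure the compilation of a unit and read the reference from a stored text file.

```ocaml
(* Bytes allocated while running [f], measured with the standard Gc module. *)
let allocated_by (f : unit -> 'a) : float =
  let before = Gc.allocated_bytes () in
  ignore (Sys.opaque_identity (f ()));
  Gc.allocated_bytes () -. before

(* True when [measured] is within [tolerance] (a fraction) of [reference]. *)
let within_tolerance ~reference ~tolerance measured =
  Float.abs (measured -. reference) <= tolerance *. reference

let () =
  (* Hypothetical workload standing in for "compile one unit". *)
  let workload () = List.init 10_000 (fun i -> i * 2) in
  (* Here the reference is simply a first run; a real test would load it
     from a file committed alongside the test suite. *)
  let reference = allocated_by workload in
  let measured = allocated_by workload in
  if within_tolerance ~reference ~tolerance:0.05 measured then
    print_endline "allocation within tolerance"
  else
    print_endline "allocation regression detected"
```

The appeal of allocation over wall-clock time is that it should vary much less from run to run and from machine to machine, so a fairly tight tolerance is plausible.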

vicuna commented Mar 25, 2014

Comment author: @garrigue

That would be clearly useful.
Currently our only way to know that something went wrong about performance is when some user shouts out...
Also, lots of small changes may end up slowing the compiler more and more without many people noticing
(that said, this probably gets hidden by the move to faster machines).
