New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Performance degradation on 4.07.0+flambda #7849
Comments
Comment author: @gasche I could reproduce the issue. You need -O3 for the optimization to kick in, -O2 is not enough. 4.06.1 also performs the expected optimization, so the regression is between 4.06 and 4.07. |
Comment author: @lpw25 I haven't investigated in detail, but from the look of that code and the bisection, I would say that this is an expected regression. That commit is there to prevent the inliner getting into an infinite loop. It is difficult to tell the difference between the code in this example and code that causes an infinite loop. I have a branch -- worked on in succession by two interns -- that rewrites how the inlining heuristics and limits work so that they have a predictable semantics. This branch should work properly on this kind of example, but it won't be ready until OCaml 4.09. |
Comment author: @lpw25
I should clarify this part before someone suggests using a timer or a big counter. That commit adjusts how we detect recursive function calls. Flambda's current semantics are supposed to limit the depth of recursive function calls to the (IIRC) "unrolling-depth" and there were cases before 4.07 where we would fail to detect the recursion. So the bug was that the unrolling depth was being ignored -- as a symptom of this bug we could also get infinite loops. In this code, there are patterns which look like recursion to the recursion detection and so they are being artificially limited by the unrolling depth. |
Comment author: @damiendoligez So the solution would be to use the |
Comment author: @lpw25 Possibly, although you might make things blow-up pretty quickly. I can't look at the example code right now due to a proxy (perhaps we should discourage using pastebin for reproduction cases), but IIRC it had genuine recursive calls mixed up with higher-order calls that look recursive to flambda. If you change the max unroll count then you're going to inline both of those cases. Same goes for annotating the calls with |
We have decided that the improved tracking of recursion with respect to inlining will only be implemented for Flambda 2.0. We will check the behaviour with this example, but we don't need to keep this issue open, as it appears to be an expected regression in the current compiler. |
Original bug ID: 7849
Reporter: Eugene
Status: confirmed (set by @gasche on 2018-09-15T21:36:12Z)
Resolution: open
Priority: normal
Severity: tweak
Version: 4.07.0
Category: middle end (typedtree to clambda)
Bug description
Here is the code: https://pastebin.com/TfVzyzgC
If compiled with 4.05.0+flambda, it basically optimizes long (>>=) and
map
chain away.On 4.07.0, however, it runs much longer and binary is bigger.
Steps to reproduce
$ ~/.opam/4.05.0+flambda/bin/ocamlopt t.ml -o test_05 -O3
$ ~/.opam/4.07.0+flambda/bin/ocamlopt t.ml -o test_07 -O3
$ time ./test_05
5000000100000000
real 0m0.170s
user 0m0.170s
sys 0m0.000s
$ time ./test_07
5000000100000000
real 0m1.293s
user 0m1.289s
sys 0m0.004s
The text was updated successfully, but these errors were encountered: