|Anonymous | Login | Signup for a new account||2019-01-21 18:37 CET|
|Main | My View | View Issues | Change Log | Roadmap|
|View Issue Details|
|ID||Project||Category||View Status||Date Submitted||Last Update|
|0007849||OCaml||middle end (typedtree to clambda)||public||2018-09-15 22:44||2018-09-17 16:44|
|Target Version||Fixed in Version|
|Summary||0007849: Performance degradation on 4.07.0+flambda|
|Description||Here is the code: https://pastebin.com/TfVzyzgC [^]|
If compiled with 4.05.0+flambda, it basically optimizes long (>>=) and `map` chain away.
On 4.07.0, however, it runs much longer and binary is bigger.
|Steps To Reproduce||$ ~/.opam/4.05.0+flambda/bin/ocamlopt t.ml -o test_05 -O3|
$ ~/.opam/4.07.0+flambda/bin/ocamlopt t.ml -o test_07 -O3
$ time ./test_05
$ time ./test_07
|Tags||No tags attached.|
I could reproduce the issue.
You need -O3 for the optimization to kick in, -O2 is not enough.
4.06.1 also performs the expected optimization, so the regression is between 4.06 and 4.07.
Bisecting seems to indicate that the regression was introduced in
Flambda: Add [Closure_origin.t] to trace inlined functions (merged) (0001707)
I haven't investigated in detail, but from the look of that code and the bisection, I would say that this is an expected regression. That commit is there to prevent the inliner getting into an infinite loop. It is difficult to tell the difference between the code in this example and code that causes an infinite loop.
I have a branch -- worked on in succession by two interns -- that rewrites how the inlining heuristics and limits work so that they have a predictable semantics. This branch should work properly on this kind of example, but it won't be ready until OCaml 4.09.
> That commit is there to prevent the inliner getting into an infinite loop
I should clarify this part before someone suggests using a timer or a big counter.
That commit adjusts how we detect recursive function calls. Flambda's current semantics are supposed to limit the depth of recursive function calls to the (IIRC) "unrolling-depth" and there were cases before 4.07 where we would fail to detect the recursion. So the bug was that the unrolling depth was being ignored -- as a symptom of this bug we could also get infinite loops.
In this code, there are patterns which look like recursion to the recursion detection and so they are being artificially limited by the unrolling depth.
|So the solution would be to use the `-inline-max-unroll` command-line option described in http://caml.inria.fr/pub/docs/manual-ocaml/flambda.html#speculation [^] right?|
|Possibly, although you might make things blow-up pretty quickly. I can't look at the example code right now due to a proxy (perhaps we should discourage using pastebin for reproduction cases), but IIRC it had genuine recursive calls mixed up with higher-order calls that look recursive to flambda. If you change the max unroll count then you're going to inline both of those cases. Same goes for annotating the calls with `[@unroll n]`.|
|2018-09-15 22:44||Eugene||New Issue|
|2018-09-15 23:36||gasche||Note Added: 0019365|
|2018-09-15 23:36||gasche||Status||new => confirmed|
|2018-09-16 00:26||gasche||Note Added: 0019366|
|2018-09-16 09:07||lpw25||Note Added: 0019367|
|2018-09-16 09:15||lpw25||Note Added: 0019368|
|2018-09-17 16:13||doligez||Note Added: 0019369|
|2018-09-17 16:44||lpw25||Note Added: 0019370|
|Copyright © 2000 - 2011 MantisBT Group|