Performance degradation on 4.07.0+flambda #7849

vicuna · 2018-09-15T20:44:50Z

Original bug ID: 7849
Reporter: Eugene
Status: confirmed (set by @gasche on 2018-09-15T21:36:12Z)
Resolution: open
Priority: normal
Severity: tweak
Version: 4.07.0
Category: middle end (typedtree to clambda)

Bug description

Here is the code: https://pastebin.com/TfVzyzgC
If compiled with 4.05.0+flambda, it basically optimizes long (>>=) and map chain away.
On 4.07.0, however, it runs much longer and binary is bigger.

Steps to reproduce

$ ~/.opam/4.05.0+flambda/bin/ocamlopt t.ml -o test_05 -O3
$ ~/.opam/4.07.0+flambda/bin/ocamlopt t.ml -o test_07 -O3
$ time ./test_05
5000000100000000

real 0m0.170s
user 0m0.170s
sys 0m0.000s
$ time ./test_07
5000000100000000

real 0m1.293s
user 0m1.289s
sys 0m0.004s

The text was updated successfully, but these errors were encountered:

vicuna · 2018-09-15T21:36:12Z

Comment author: @gasche

I could reproduce the issue.

You need -O3 for the optimization to kick in, -O2 is not enough.

4.06.1 also performs the expected optimization, so the regression is between 4.06 and 4.07.

vicuna · 2018-09-15T22:26:11Z

Comment author: @gasche

Bisecting seems to indicate that the regression was introduced in

Flambda: Add [Closure_origin.t] to trace inlined functions (merged) (#1707)
#1707

vicuna · 2018-09-16T07:07:48Z

Comment author: @lpw25

I haven't investigated in detail, but from the look of that code and the bisection, I would say that this is an expected regression. That commit is there to prevent the inliner getting into an infinite loop. It is difficult to tell the difference between the code in this example and code that causes an infinite loop.

I have a branch -- worked on in succession by two interns -- that rewrites how the inlining heuristics and limits work so that they have a predictable semantics. This branch should work properly on this kind of example, but it won't be ready until OCaml 4.09.

vicuna · 2018-09-16T07:15:18Z

Comment author: @lpw25

That commit is there to prevent the inliner getting into an infinite loop

I should clarify this part before someone suggests using a timer or a big counter.

That commit adjusts how we detect recursive function calls. Flambda's current semantics are supposed to limit the depth of recursive function calls to the (IIRC) "unrolling-depth" and there were cases before 4.07 where we would fail to detect the recursion. So the bug was that the unrolling depth was being ignored -- as a symptom of this bug we could also get infinite loops.

In this code, there are patterns which look like recursion to the recursion detection and so they are being artificially limited by the unrolling depth.

vicuna · 2018-09-17T14:13:40Z

Comment author: @damiendoligez

So the solution would be to use the -inline-max-unroll command-line option described in http://caml.inria.fr/pub/docs/manual-ocaml/flambda.html#speculation right?

vicuna · 2018-09-17T14:44:49Z

Comment author: @lpw25

Possibly, although you might make things blow-up pretty quickly. I can't look at the example code right now due to a proxy (perhaps we should discourage using pastebin for reproduction cases), but IIRC it had genuine recursive calls mixed up with higher-order calls that look recursive to flambda. If you change the max unroll count then you're going to inline both of those cases. Same goes for annotating the calls with [@unroll n].

mshinwell · 2020-04-17T12:35:32Z

We have decided that the improved tracking of recursion with respect to inlining will only be implemented for Flambda 2.0. We will check the behaviour with this example, but we don't need to keep this issue open, as it appears to be an expected regression in the current compiler.

vicuna added the middle-end label Mar 14, 2019

nojb added the flambda label Mar 16, 2019

vicuna added the bug label Mar 20, 2019

mshinwell closed this as completed Apr 17, 2020

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Performance degradation on 4.07.0+flambda #7849

Performance degradation on 4.07.0+flambda #7849

vicuna commented Sep 15, 2018

vicuna commented Sep 15, 2018

vicuna commented Sep 15, 2018

vicuna commented Sep 16, 2018

vicuna commented Sep 16, 2018

vicuna commented Sep 17, 2018

vicuna commented Sep 17, 2018

mshinwell commented Apr 17, 2020

Performance degradation on 4.07.0+flambda #7849

Performance degradation on 4.07.0+flambda #7849

Comments

vicuna commented Sep 15, 2018

Bug description

Steps to reproduce

vicuna commented Sep 15, 2018

vicuna commented Sep 15, 2018

vicuna commented Sep 16, 2018

vicuna commented Sep 16, 2018

vicuna commented Sep 17, 2018

vicuna commented Sep 17, 2018

mshinwell commented Apr 17, 2020