New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
alloc of let-rec values is expensive (patch) #5384
Comments
Comment author: @xavierleroy I was a bit surprised by this patch: "let rec" over values is a very rarely used Caml feature (I don't think I've used it in the last 5 years, and some people consider it a design mistake of the language, since it makes it possible that structurally recursive functions over immutable datatypes fail to terminate), so I didn't understand the need to optimize its compilation. Is there a real-world program where this optimization has a significant impact on the overall execution time? |
Comment author: @alainfrisch Fabrice gives the example of the Queue module, and some timings. I admit that the performance gain would probably be less impressive in a real-world algorithm, where Queue operations do not dominate the overall execution time. But it is the sign that some clever data structures can benefit from the patch. I remember using let-rec bindings several times when writing interpreters (to evaluate recursive closures, of course). I can also mention that we (=LexiFi) rely a lot on "let rec" values these days, in the form or lazy bindings (we have special syntactic sugar for that). I don't know if performance would be impacted by Fabrice's patch, though. |
Comment author: @gasche Just a few comments on the patch; this is an outsider point of view, High-level remarks:
Nitpicking:
|
Comment author: @lefessan
Your method: My method: |
Comment author: @xavierleroy Your asm code excerpts don't answer my question: can you demonstrate non-negligible speedups on a real-world application to justify adding 150 lines of rather complex code to the compiler? |
Comment author: @xavierleroy This PR has been sleeping for too long. If this performance improvement is still of interest, I suggest to submit it as a Github pull request so that it gets discussed and reviewed better. |
Original bug ID: 5384
Reporter: @lefessan
Assigned to: @xavierleroy
Status: resolved (set by @xavierleroy on 2016-12-07T11:01:50Z)
Resolution: won't fix
Priority: normal
Severity: feature
Platform: all
OS: all
OS Version: all
Version: 3.12.1
Target version: later
Category: ~DO NOT USE (was: OCaml general)
Tags: patch
Monitored by: @protz mehdi dario @ygrek
Bug description
Allocation of let rec values is not optimal: each value is allocated separately using "caml_alloc_dummy", and then they are allocated a second time, and then the initial allocated blocks are updated using "caml_update_dummy", making an heavy use of "caml_modify".
The following patch modifies the allocation of let-rec values for native code.
It tries a better allocation strategy, and fall-backs to the old strategy if
the expression is not simple enough. The idea of the better allocation strategy is to break recursive expressions into simple non-recursive values (that can be evaluated before), and recursive blocks (that can all be allocated as one big block and then modified to add recursive pointers). The strategy fails if the block is too big and cannot be allocated in the minor heap.
Additional information
Some performance measurements:
type t = { f : float; x : t option; }
let _ =
for i = 1 to int_of_string Sys.argv.(1) do
let rec v = { f = 0.0; x = Some v } in
()
done
76% speed-up:
time ./test2.opt 100000000
0.55 s (optimized)
2.32 s (old)
let _ =
let q = Queue.create () in
for i = 1 to int_of_string Sys.argv.(1) do
Queue.add 1 q;
assert (Queue.take q = 1);
done
50% speed-up:
time ./test1.opt 100000000
2.16 s (optimized)
4.26 s (old)
let _ =
for i = 1 to int_of_string Sys.argv.(1) do
let q = Queue.create () in
Queue.add 1 q;
Queue.add 2 q;
Queue.add 3 q;
assert (Queue.take q = 1);
assert (Queue.take q = 2);
assert (Queue.take q = 3);
done
30% speed-up:
time ./test1.opt 100000000
5.24 s (optimized)
7.28 s (old)
File attachments
The text was updated successfully, but these errors were encountered: