| Anonymous | Login | Signup for a new account | 2013-05-19 03:31 CEST | ![]() |
| Main | My View | View Issues | Change Log | Roadmap |
| View Issue Details [ Jump to Notes ] | [ Issue History ] [ Print ] | |||||||
| ID | Project | Category | View Status | Date Submitted | Last Update | |||
| 0005205 | OCaml | OCaml general | public | 2011-01-17 10:23 | 2012-09-25 20:38 | |||
| Reporter | frisch | |||||||
| Assigned To | xleroy | |||||||
| Priority | high | Severity | minor | Reproducibility | have not tried | |||
| Status | closed | Resolution | fixed | |||||
| Platform | OS | OS Version | ||||||
| Product Version | ||||||||
| Target Version | Fixed in Version | 3.13.0+dev | ||||||
| Summary | 0005205: Performance bug: pushing let-alias inside loop body | |||||||
| Description | ocamlopt inlines some local bindings that are pure and used once. However, the notion of "used once" is syntactic, which allows some non-trivial expressions (typically field projection) to be pushed within a loop body instead of being evaluated once outside the loop. Here is an example: ================================================================= type t1 = {x:int} type t2 = {y:t1} let f {y = {x = x1}} {y = {x = x2}} {y = {x = x3}} {y = {x = x4}} = (* let x1 = abs x1 in let x2 = abs x2 in let x3 = abs x3 in let x4 = abs x4 in *) let r = ref 0 in for i = 1 to 1000000 do for j = 1 to 1000 do r := x1 + x2 + x3 + x4; done done; !r let () = let r = {y = { x = 2 }} in Printf.printf "%i\n%!" (f r r r r) ================================================================= On a Linux x64 (Intel Core i7, 2.80 GHz), this program takes 2.34 seconds and only 1.18 seconds after uncommenting the first lines in f. The effect of these lines is to force the field accesses to be done outside the loop. (On simpler loops, the processor does an amazing job of removing the indirection overhead.) The same problem appear because of pushing field accessed inside lambdas, e.g.: ================================================================= let rec f () = r := x1 + x2 + x3 + x4 in for i = 1 to 1000000 do for j = 1 to 1000 do f () done done; ================================================================= (Timing goes from 5s to 1.5s by enabling the 'let xi = abs xi' lines.) | |||||||
| Tags | No tags attached. | |||||||
| Attached Files | ||||||||
Notes |
|
|
(0005773) frisch (developer) 2011-01-17 10:27 |
The faulty optimization is implemented in Simplif.simplify_lets. I suggest to keep track of the depth where let variables are bound (bumping the depth when entering a lambda or loop body) and used, and not inline aliases which are used at a deeper level than their definition. |
|
(0005774) frisch (developer) 2011-01-17 11:08 |
Suggested fix attached. |
|
(0005777) frisch (developer) 2011-01-17 14:22 |
As an extra benefit, the patch avoids keeping references to useless parts of data structures. I'm wondering whether inlining let-aliases used only once is really a good heuristic. Isn't it generally better to load from memory early in order to improve pipelining? Has someone done any benchmark to see if the heuristic is useful? |
Issue History |
|||
| Date Modified | Username | Field | Change |
| 2011-01-17 10:23 | frisch | New Issue | |
| 2011-01-17 10:27 | frisch | Note Added: 0005773 | |
| 2011-01-17 11:08 | frisch | File Added: fix_5205.diff | |
| 2011-01-17 11:08 | frisch | Note Added: 0005774 | |
| 2011-01-17 14:22 | frisch | Note Added: 0005777 | |
| 2011-05-17 16:48 | doligez | Status | new => acknowledged |
| 2011-05-17 16:48 | doligez | Priority | normal => high |
| 2011-06-10 09:14 | frisch | Relationship added | has duplicate 0005288 |
| 2011-06-11 15:16 | xleroy | Assigned To | => xleroy |
| 2011-06-11 15:16 | xleroy | Status | acknowledged => resolved |
| 2011-06-11 15:16 | xleroy | Fixed in Version | => 3.13.0+dev |
| 2012-09-25 20:38 | xleroy | Status | resolved => closed |
| 2012-09-25 20:38 | xleroy | Resolution | open => fixed |
| Copyright © 2000 - 2011 MantisBT Group |