Objective Caml 2.02
| Date: | -- (:) |
| From: | Alexey Nogin <nogin@c...> |
| Subject: | Re: Upgrade from OCaml 2.01 to OCaml 2.02 made things _slower_! |
Xavier Leroy wrote:

> In the past, I've observed speed variations by at least +/- 5% caused
> exclusively by minor variations in code placement (such as adding or
> deleting instructions that are never executed). Almost any
> modification in the code generator affects code placement. If only
> for this reason, speed variations of less than 5% are essentially
> meaningless: there's no way to attribute them to a particular
> optimization or to good/bad luck in code placement. (Makes you very
> suspicious of those PLDI papers where they report 1% speedups...)
>
> > Also, I was doing some performance measurements (using P6 performance
> > counter support patches for Linux by Erik Hendriks -
> > http://beowulf.gsfc.nasa.gov/software/ ) when I upgraded, so I have some
> > information (and can get more of it) on the performance counters for my
> > program under both 2.01 and 2.02. In particular, the number of requests
> > from the processor to the L1 data cache became 2%-3% bigger.
>
> That's more meaningful. The two new optimizations in 2.02 (closed
> toplevel functions and allocation coalescing) should reduce the number
> of memory accesses. Allocation coalescing might increase register
> pressure locally, causing other stuff to be spilled on the stack,
> though.

Well, in this case I should probably try to remove the allocation
coalescing and see what happens. Am I right in assuming that, in order
to do that, I have to revert the changes for versions 1.8 -> 1.9 and
1.10 -> 1.11 of asmcomp/selectgen.ml?

> Is there any way you could get a per-function profile of
> memory requests? (like on the Alpha with the Digital Unix tools).

I am not sure. I could probably write something gprof-like that would
record the values of the performance counters at each function call,
but I am afraid that's a lot of work. And I could probably get access
to an Alpha, but I do not think I will see the same slowdown effect on
Alpha as I see on x86, so the Alpha memory access numbers would not
help much.
Alexey
--------------------------------------------------------------
Home Page: http://www.cs.cornell.edu/nogin/
E-Mail: nogin@cs.cornell.edu (office), ayn2@cornell.edu (home)
Office: Upson 4139, tel: 1-607-255-4934
ICQ #: 24708107 (office), 24678341 (home)
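
The "gprof-like" idea mentioned in the message (recording a counter value around each function call and attributing the difference to that function) could be sketched roughly as below. This is only an illustration in Python, not the tool discussed in the thread: the names `CounterProfile`, `profiled` and `read_counter` are hypothetical, and `time.perf_counter_ns` merely stands in for reading a real hardware performance counter such as L1 data cache requests.

```python
import functools
import time
from collections import defaultdict


class CounterProfile:
    """Accumulate per-function deltas of a monotonically increasing counter."""

    def __init__(self, read_counter=time.perf_counter_ns):
        # Assumption: read_counter is a placeholder. A real tool would read
        # a hardware performance counter here instead of a clock.
        self.read_counter = read_counter
        self.totals = defaultdict(int)  # function name -> summed counter delta
        self.calls = defaultdict(int)   # function name -> number of calls

    def profiled(self, func):
        """Decorator that charges the counter delta of each call to func."""
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            start = self.read_counter()
            try:
                return func(*args, **kwargs)
            finally:
                # Inclusive cost: events from nested calls are counted too.
                self.totals[func.__name__] += self.read_counter() - start
                self.calls[func.__name__] += 1
        return wrapper

    def report(self):
        """Return (name, total) pairs, most expensive function first."""
        return sorted(self.totals.items(), key=lambda kv: kv[1], reverse=True)
```

Decorating the functions of interest with `@profile.profiled` and calling `profile.report()` at exit would give a per-function breakdown. The instrumentation itself adds instructions and memory accesses, so by the reasoning quoted above, small differences in such a profile would need to be read with caution.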