Performance of threaded interpreter on hyper-threaded CPU
Date: 2006-04-18 (16:18)
From: Xavier Leroy <Xavier.Leroy@i...>
Subject: Re: [Caml-list] Re: Performance of threaded interpreter on hyper-threaded CPU

> However, your remark motivated me to measure the performance of a
> single ocamlrun executable running on the various Pentium 4 I have at
> hand, and the results are interesting...

Random thoughts:

The performance variations between the gcc versions confirm my impression that gcc is getting "too clever for its own good" -- carefully hand-optimized code like the OCaml bytecode interpreter is best served by a compiler that compiles code nearly as written. (Think gcc 2.95.)

The P4 microarchitecture is known for its weird performance model: some code runs very fast, some similar code very slow. In my experience, AMD processors as well as the Pentium-M are much more consistent performance-wise.

If you really want to understand what's going on, you need a good performance analysis tool. Timing runs will tell you nothing. Intel's VTUNE is king of the hill here, but the Windows version is costly and I could never install the free Linux version.

- Xavier Leroy