Version française
Home     About     Download     Resources     Contact us    
Browse thread
Performance of threaded interpreter on hyper-threaded CPU
[ Home ] [ Index: by date | by threads ]
[ Search: ]

[ Message by date: previous | next ] [ Message in thread: previous | next ] [ Thread: previous | next ]
Date: -- (:)
From: Xavier Leroy <Xavier.Leroy@i...>
Subject: Re: [Caml-list] Re: Performance of threaded interpreter on hyper-threaded CPU
 > However, your remark motivated me to measure the performance of a
 > single ocamlrun executable running on the various Pentium 4 I have at
 > hand, and the results are interesting...

Random thoughts:

The performance variations between the gcc versions confirm my
impression that gcc is getting "too clever for its own good" --
carefully hand-optimized code like the OCaml bytecode interpreter
is best served by a compiler that compiles code nearly as written.
(Think gcc 2.95.)

The P4 microarchitecture is known for its weird performance model:
some code runs very fast, some similar code very slow.
In my experience, AMD processors as well as the Pentium-M are
much more consistent performance-wise.

If you really want to understand what's going on, you need a good
performance analysis tool.  Timing runs will tell you nothing.
Intel's VTUNE is king of the hill here, but the Windows version is
costly and I could never install the free Linux version.

- Xavier Leroy