[Caml-list] FP's and HyperThreading Processors
Date: -- (:)
From: David McClain <dmcclain1@m...>
Subject: [Caml-list] FP's and HyperThreading Processors
Hi,

I have a massive application that performs nonlinear fitting to 170+
parameters; a phase retrieval problem for discerning the aberrations of an
optical system. This program is written largely in compiled OCaml closures,
along with a multithreaded, vendor-supplied FFT routine (presumably optimized
for their processor).

On an old single-processor P2 machine at 350 MHz, I am seeing 95%+ CPU
utilization. But on a new 3 GHz P4 with HyperThreading enabled (dual
register banks for fast context switching and minimum cache coherence
overhead), this same program shows well under 5% CPU utilization. The
net result is that this program runs only twice as fast on the new 3 GHz P4
as it runs on the old 350 MHz P2.
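
As a rough back-of-the-envelope check on those figures, the sketch below compares the clock-rate ratio of the two machines with the observed speedup. Clock rate alone is only a crude proxy for expected performance across processor generations, so this is an illustration of the gap, not a measurement; the variable names are made up for the example.

```ocaml
(* Clock rates and observed speedup quoted in the message. *)
let p2_mhz = 350.0
let p4_mhz = 3000.0
let observed_speedup = 2.0

(* Naive expectation if run time scaled with clock rate alone. *)
let clock_ratio = p4_mhz /. p2_mhz

(* Fraction of that naive expectation actually realised. *)
let realised_fraction = observed_speedup /. clock_ratio

let () =
  Printf.printf "clock ratio: %.2fx, realised: %.0f%%\n"
    clock_ratio (100.0 *. realised_fraction)
(* prints "clock ratio: 8.57x, realised: 23%" *)
```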

I suspect, but have yet to prove, that the low utilization is due to low
CPU-to-memory bandwidth and to the failure of the L1 and L2 caches to supply
needed operations and data to the CPU. This, I would hypothesize, is going
to be exhibited by any language that prefers fresh memory allocation for
results, e.g., OCaml, ML, Lisp, Smalltalk, etc.
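
One way to probe this hypothesis in isolation is a small micro-benchmark that performs the same arithmetic twice: once returning a freshly allocated result on every step, and once overwriting a single buffer. The sketch below is only that, a sketch: the function names, array size, and repetition count are hypothetical, it is not taken from the application described here, and the timings it prints will depend on the machine, the compiler, and the garbage collector settings.

```ocaml
(* Allocating style: every call returns a freshly allocated result array. *)
let scale_fresh (k : float) (a : float array) : float array =
  Array.map (fun x -> k *. x) a

(* In-place style: overwrites the input, so the working set stays fixed. *)
let scale_in_place (k : float) (a : float array) : unit =
  for i = 0 to Array.length a - 1 do
    a.(i) <- k *. a.(i)
  done

(* Run [f ()] and report the processor time it consumed, via Sys.time. *)
let time label f =
  let t0 = Sys.time () in
  f ();
  Printf.printf "%s: %.3f s\n" label (Sys.time () -. t0)

let () =
  (* Hypothetical sizes, chosen so the array is larger than typical caches. *)
  let n = 1_000_000 and reps = 20 in
  let a = Array.make n 1.0 in
  time "fresh allocation" (fun () ->
    let r = ref a in
    for _i = 1 to reps do r := scale_fresh 1.0000001 !r done);
  time "in-place update" (fun () ->
    for _i = 1 to reps do scale_in_place 1.0000001 a done)
```

A gap between the two timings would be consistent with the allocation-and-cache explanation but would not by itself prove it; a hardware performance counter for cache misses would be the more direct evidence.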

If I am correct, then it implies that our hardware friends are moving
rapidly in the opposite direction to our advanced software systems. I
mention this in order to tickle the imaginations of both camps.

Cheers,

- David McClain, Sr. Scientist, Raytheon, Tucson, AZ

