Re: [Caml-list] Severe loss of performance due to new signal handling (fwd)
Date: 2006-03-22 (10:56)
From: Robert Roessler <roessler@r...>
Subject: Re: [Caml-list] Severe loss of performance due to new signal handling (fwd)

Brian Hurt wrote:
>
> ---------- Forwarded message ----------
> Date: Tue, 21 Mar 2006 16:32:51 -0600 (CST)
> From: Brian Hurt <bhurt@spnz.org>
> To: Robert Roessler <roessler@rftp.com>
> Subject: Re: [Caml-list] Severe loss of performance due to new signal
> handling
>
> On Tue, 21 Mar 2006, Robert Roessler wrote:
>
>> Well, I *thought* there was a marked absence of "bit-level
>> parallelism" in the signal-handling... ;)
>>
>> So the "expense" of individual atomic operations is not really what is
>> at the heart of this performance problem...
>
> Hmm. Maybe not. I'm measuring a 4 clock cycle cost for a xchgl, both
> with and without a lock on my Athlon XP 1.8GHz. See attached code.
> Naturally, this is a uniprocessor machine and the memory location is in
> L1 cache (or will be soon), and no contention, so this is definitely
> best case. 4 clocks is about right for a read and a write to L1 cache
> (each L1 cache access taking 2 clocks).

And after adjusting the inline assembly syntax for vc7.1, I get

Minimum time for a rdtsc instruction (in clocks): 38
Minimum time for a read_and_clear() + rdtsc (in clocks): 75

This is on a P-III S (Tualatin) @ 1.4GHz on Windows XP SP2.

Robert Roessler
roessler@rftp.com
http://www.rftp.com
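
The attached code is not reproduced in this message, so what follows is only a rough sketch of how a measurement of this kind might be set up, not the program that produced the numbers above. It assumes GCC or Clang. read_and_clear() is written here with C11 atomic_exchange rather than the inline xchgl the thread refers to. The helper names ticks(), min_timer_overhead() and min_exchange_cost() are invented for illustration. On non-x86 builds ticks() falls back to nanoseconds, which are not clock cycles.

```c
#define _POSIX_C_SOURCE 199309L  /* for clock_gettime in the fallback path */
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

#if defined(__x86_64__) || defined(__i386__)
#include <x86intrin.h>
/* Read the CPU time-stamp counter via the GCC/Clang intrinsic. */
static inline uint64_t ticks(void) { return __rdtsc(); }
#else
#include <time.h>
/* Assumed fallback for non-x86 targets: nanoseconds, not clock cycles. */
static inline uint64_t ticks(void) {
    struct timespec ts;
    clock_gettime(CLOCK_MONOTONIC, &ts);
    return (uint64_t)ts.tv_sec * 1000000000u + (uint64_t)ts.tv_nsec;
}
#endif

/* Atomically fetch the old value and store 0. This is a stand-in for the
 * xchgl-based routine discussed in the thread, not the original code. */
static long read_and_clear(atomic_long *p) {
    return atomic_exchange(p, 0);
}

/* Smallest observed gap between two back-to-back timer reads. */
static uint64_t min_timer_overhead(int iterations) {
    assert(iterations > 0);
    uint64_t best = UINT64_MAX;
    for (int i = 0; i < iterations; i++) {
        uint64_t start = ticks();
        uint64_t stop = ticks();
        if (stop - start < best) best = stop - start;
    }
    return best;
}

/* Smallest observed gap when one read_and_clear() sits between the two
 * timer reads, so the result still includes the timer overhead. */
static uint64_t min_exchange_cost(atomic_long *p, int iterations) {
    assert(iterations > 0);
    uint64_t best = UINT64_MAX;
    for (int i = 0; i < iterations; i++) {
        atomic_store(p, 1);              /* give the exchange something to clear */
        uint64_t start = ticks();
        long old = read_and_clear(p);
        uint64_t stop = ticks();
        (void)old;                       /* value is not needed for timing */
        if (stop - start < best) best = stop - start;
    }
    return best;
}
```

Taking the minimum over many iterations is meant to filter out interrupts and cache misses, which matches the "best case" framing in the quoted text. Subtracting the first minimum from the second gives a rough per-call estimate (75 - 38 = 37 clocks for the figures above), assuming the timer overhead is the same in both loops.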