Version française
Home     About     Download     Resources     Contact us    

This site is updated infrequently. For up-to-date information, please visit the new OCaml website at

Browse thread
Severe loss of performance due to new signal handling
[ Home ] [ Index: by date | by threads ]
[ Search: ]

[ Message by date: previous | next ] [ Message in thread: previous | next ] [ Thread: previous | next ]
Date: 2006-03-20 (09:29)
From: Xavier Leroy <Xavier.Leroy@i...>
Subject: Re: [Caml-list] Severe loss of performance due to new signal handling
 > It seems that changes to signal handling between OCaml 3.08.4 and 3.09.1
 > can lead to a very significant loss of performance (up to several orders
 > of magnitude!) in code that uses threads and performs I/O (tested on Linux).
 > [...]
 > Maybe some assembler guru can repeat this result and explain to us
 > what's going on...

Short explanation: atomic instructions are dog slow.

Longer explanation:

OCaml 3.09 fixed a number of long-standing bugs in signal handling
that could cause signals to be "lost" (not acted upon).  The fixes,
located mostly in the code that polls for pending signals
(caml_process_pending_signals), rely on an atomic "read-and-clear"
operation, implemented using atomic processor instructions on x86,
x86-64 and PPC.  This makes signal handling correct (no signal can be
lost) but I didn't realize that it has such an impact on performance,
even on a uniprocessor machine.  Thanks for pointing this out.

(To prevent a number of well-meaning but irrelevant posts, keep in
mind that we're using atomic instructions in a single-threaded
program, to get atomicity w.r.t. signals, not w.r.t. concurrent threads.
We don't need the latter kind of atomicity given OCaml's threading model.)

Now, you may wonder why the problem appears mainly with threaded
programs.  The reason is that programs linked with the Thread library,
even if they do not create threads, check for signals much more
often, because they enter and leave blocking sections more often.  In
your example, each call to "print_char" needs to lock and unlock the
stdout channel, causing two signal polls each time.

So, it's time to go back to the drawing board.  Fortunately, it
appears that reliable polling of signals is possible without atomic
processor instructions.  Expect a fix in 3.09.2 at the latest, and
probably within a couple of weeks in the CVS.


- Xavier Leroy