Severe loss of performance due to new signal handling
Date: 2006-03-20 (09:29)
From: Xavier Leroy <Xavier.Leroy@i...>
Subject: Re: [Caml-list] Severe loss of performance due to new signal handling
> It seems that changes to signal handling between OCaml 3.08.4 and 3.09.1
> can lead to a very significant loss of performance (up to several orders
> of magnitude!) in code that uses threads and performs I/O (tested on Linux).
> [...]
> Maybe some assembler guru can repeat this result and explain to us
> what's going on...

Short explanation: atomic instructions are dog slow.

Longer explanation: OCaml 3.09 fixed a number of long-standing bugs in
signal handling that could cause signals to be "lost" (not acted upon).
The fixes, located mostly in the code that polls for pending signals
(caml_process_pending_signals), rely on an atomic "read-and-clear"
operation, implemented using atomic processor instructions on x86,
x86-64 and PPC.

This makes signal handling correct (no signal can be lost) but I didn't
realize that it has such an impact on performance, even on a
uniprocessor machine. Thanks for pointing this out.

(To prevent a number of well-meaning but irrelevant posts, keep in mind
that we're using atomic instructions in a single-threaded program, to
get atomicity w.r.t. signals, not w.r.t. concurrent threads. We don't
need the latter kind of atomicity given OCaml's threading model.)

Now, you may wonder why the problem appears mainly with threaded
programs. The reason is that programs linked with the Thread library,
even if they do not create threads, check for signals much more often,
because they enter and leave blocking sections more often. In your
example, each call to "print_char" needs to lock and unlock the stdout
channel, causing two signal polls each time.

So, it's time to go back to the drawing board. Fortunately, it appears
that reliable polling of signals is possible without atomic processor
instructions. Expect a fix in 3.09.2 at the latest, and probably within
a couple of weeks in the CVS.

Regards,

- Xavier Leroy
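
Editorial illustration (not part of the original message, and not the OCaml runtime's code): the "read-and-clear" idea discussed above can be sketched in C roughly as follows. All names here (pending_plain, pending_atomic, record_signal, poll_racy, poll_atomic, install_handler) are hypothetical, and the sketch assumes a C11 compiler on which atomic_int is lock-free, so that it may be touched from a signal handler. It contrasts a two-step poll, which can lose a signal, with a one-step exchange, which cannot.

```c
#include <signal.h>
#include <stdatomic.h>

/* Hypothetical flags: one plain, one atomic. Both mean "a signal is pending". */
volatile sig_atomic_t pending_plain = 0;
atomic_int pending_atomic = 0;

/* Handler: only records that a signal arrived; real work is done at poll time. */
void record_signal(int signo) {
    (void)signo;
    pending_plain = 1;
    atomic_store(&pending_atomic, 1);
}

/* Two-step poll: read, then clear. If a signal is delivered between the
   two statements, its flag is erased by the store and the signal is lost. */
int poll_racy(void) {
    int seen = pending_plain;   /* step 1: read */
    pending_plain = 0;          /* step 2: clear (may erase a fresh signal) */
    return seen;
}

/* One-step poll: atomic_exchange returns the old value and stores 0 as a
   single indivisible operation, so no delivery can fall in between. */
int poll_atomic(void) {
    return atomic_exchange(&pending_atomic, 0);
}

/* Install the handler for SIGINT (chosen only because it is in ISO C). */
void install_handler(void) {
    signal(SIGINT, record_signal);
}
```

On x86 such an exchange typically compiles to a locked instruction, which is the kind of cost the message refers to. A caller would run install_handler() once and then call poll_atomic() at each polling point, acting on a non-zero result.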