OCaml runtime lock does not seem pathological
Date: 2009-06-29 (17:10)
From: Michael Ekstrand <michael+ocaml@e...>
Subject: OCaml runtime lock does not seem pathological
Late last week, a presentation demonstrating substantial scaling
problems with Python's global interpreter lock[1] got a bit of buzz.  I
was interested to see whether OCaml's runtime suffered from a similar
problem, given that it too forbids multiple threads from concurrently
accessing the runtime.

The basic problem outlined is that Python's GIL implementation causes
substantial lock contention, particularly on multicore machines, which
causes computational threads to thrash against each other and can cause
computational threads to keep I/O threads from running.  This latter
problem can result in an obscure priority inversion.

The basic test which demonstrated this problem was a simple loop
counting down from 100000000.  In Python, if two such loops are run in
parallel using threads on a multicore machine, the program takes
substantially longer to finish than if the loops are run sequentially.
Disabling one core speeds the program up, but it doesn't recover all of
its original speed.

I duplicated this test with the following code:

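(* Tail-recursive countdown loop. *)
let rec count n =
    if n <= 0 then ()
    else count (pred n)

(* Sequential version: uncomment these two lines and comment out the
   threaded version below. *)
(* count 100000000;; *)
(* count 100000000;; *)

(* Threaded version: run both loops in separate threads and wait for
   both to finish. *)
let t1 = Thread.create count 100000000;;
let t2 = Thread.create count 100000000;;
Thread.join t1;;
Thread.join t2;;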

Running sequentially on my 1.8 GHz Core 2 Duo laptop (Debian, AMD64)
takes 3.38s user time in byte code and 0.28s user time in native code.
Running threaded with both cores enabled takes the same time.  Running
threaded with the second core disabled also takes about the same time
(byte code is slightly slower).
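
For anyone who wants to repeat the comparison without switching between commented-out lines, here is a minimal sketch that runs both variants in one program and prints wall-clock and CPU time for each. The helper names (time_run, sequential, threaded, report) and the optional command-line argument are my own additions for illustration, not part of the original test, and the build line in the comment assumes an ocamlfind-based toolchain. The figures above came from an external timer reporting user time, so the numbers this prints are comparable but not identical in meaning.

```ocaml
(* Assumed build command:
   ocamlfind ocamlopt -package threads.posix,unix -linkpkg bench.ml -o bench *)

(* Same countdown loop as in the original test. *)
let rec count n =
  if n <= 0 then ()
  else count (pred n)

(* Run [f ()] and return (wall-clock seconds, CPU seconds).
   Sys.time reports processor time used by the whole program. *)
let time_run f =
  let wall0 = Unix.gettimeofday () in
  let cpu0 = Sys.time () in
  f ();
  let cpu1 = Sys.time () in
  let wall1 = Unix.gettimeofday () in
  (wall1 -. wall0, cpu1 -. cpu0)

(* Two loops, one after the other, in the main thread. *)
let sequential n () =
  count n;
  count n

(* Two loops, each in its own thread; wait for both to finish. *)
let threaded n () =
  let t1 = Thread.create count n in
  let t2 = Thread.create count n in
  Thread.join t1;
  Thread.join t2

let report label (wall, cpu) =
  Printf.printf "%-10s wall %.2fs  cpu %.2fs\n" label wall cpu

let () =
  (* Loop count defaults to the value used in the post; an optional
     first argument overrides it (hypothetical convenience). *)
  let n =
    if Array.length Sys.argv > 1 then int_of_string Sys.argv.(1)
    else 100_000_000
  in
  report "sequential" (time_run (sequential n));
  report "threaded" (time_run (threaded n))
```

If the runtime lock behaved pathologically, the threaded line would be expected to show noticeably more wall-clock or CPU time than the sequential line; similar numbers on both lines would match the result described above.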

The take-away from this is that while global locks can cause obscure
performance problems, a very simplistic test suggests that OCaml's
implementation avoids such problems and threaded solutions do scale (as
well as any non-multicore-compatible solution can).  I do not know how
the locking for the OCaml runtime is done, so I do not know if there are
deeper problems of this nature which more sophisticated testing would
reveal.

- Michael

