New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Caml thread wont yield #5373
Comments
Comment author: @lefessan I attached a new "st_posix.h" that I think fixes this problem. It introduces a queue of active threads, waiting to take the master lock. When a thread releases the lock, all threads are woken up, and only the first one on the queue can take it. So, scheduling between these threads should be mostly fair. It is implemented using a cons put on the thread local stack, so it has no limit on the number of active threads, and the queue has two pointers, one to the head, one to the tail, so that insertion and deletions are in one step. It would probably not work if it were possible to kill threads, but killing threads is not possible with OCaml systhreads, so it is ok. |
Comment author: @xavierleroy Fairness in thread scheduling is a delicate issue: whole books have been written about it. Linux's (2.6 and up) scheduler is willing to sacrifice fairness entirely to wring a little more throughput out of the system (this was confirmed to me by one of the authors of said scheduler). In particular, sched_yield(), which used to implement a pretty good approximation to round-robin, was deliberately sabotaged in 2.6 (it makes the yielding thread give up its time quota entirely), rendering it unusable with Caml's system threads. (OpenOffice was similarly affected.) In the end, I don't think we care enough about fairness in Caml to try to restore it on top of an unfair OS scheduler. "Real" threaded applications (those doing I/O and inter-thread synchronization) work just fine without fairness. I reclassify this PR as a feature wish, but doubt it will be granted. Concerning Fabrice Le Fessant's patch: restarting all threads at every yield in order for just one to run and all other to go back to sleep is a complete no-no. Try it with 500 threads and see... |
Comment author: @lefessan 500 threads ? I didn't know there were such OCaml applications. Anyway, above a few tens of threads, using threads for I/O is a complete waste of resources, and the runtime should give incentives to avoid that (by using an expensive fair scheduler ;-) ). Anyway, it is still possible to have one condition per thread (allocated once on thread creation), and to signal only the condition of the first thread on the queue when the lock is released. Then, any thread entering the queue would also make sure it signals that first thread too to avoid stalling the scheduler. I will experiment some code around this. |
Comment author: @ygrek Just for the record, I have an application with circa 1000 threads running just now :) Most of the threads - when active - are executing C++ code but are launched from ocaml. Not that it couldn't have used 10 times less threads, but it was most easy to code this way :) |
Comment author: @lefessan I uploaded a new patch: this one uses one condition per waiting thread, so only the first one on top of the queue is signaled. Conditions are allocated when a thread needs to wait, if none is available. Otherwise, they are reused. Every thread that terminates tries to free one condition, so the number of allocated conditions is bound by the number of threads, never more, probably always much fewer. |
Comment author: @damiendoligez So how are you going to document this feature? By saying that scheduling is guaranteed to be fair? |
Comment author: @lefessan If I remember well, the patch ensures that scheduling is fair, only among OCaml threads. |
#2112 should have dealt with this. |
Original bug ID: 5373
Reporter: NickChapman
Status: acknowledged (set by @xavierleroy on 2011-10-16T16:55:09Z)
Resolution: open
Priority: normal
Severity: feature
Version: 3.12.1
Category: otherlibs
Tags: patch
Monitored by: mehdi @ygrek sweeks
Bug description
I have an example of an ocaml program using threads, which shows one
thread failing to yield & causing the execution of a second thread to
be completely blocked. This happens despite the non-yielding thread
continuously allocating.
example.ml
build using:
ocamlopt -thread -c example.ml
ocamlopt.opt -thread unix.cmxa threads.cmxa example.cmx -o example.exe
The behaviour when running this program is not deterministic, but more
often that not it has output something like:
The point to look for is that there are no "beat" messages for the 5
seconds between "spin a while" & "...done". (Note: sometimes there are
beats while spinning, and sometimes they start and stop - but in most
runs there are none.)
It seems the spinning thread is correctly, periodically releasing &
re-aquiring the lock, every 50ms when the c-ticker thread makes
allocation fail the heap limit check, but that this in not enough to
make the thread yield. It seems that previously sched_yield() was
used, but the call is currently elided.
ocaml/otherlibs/systhreads/st_stubs.c
ocaml/otherlibs/systhreads/st_posix.h
Without sched_yield, the sequence of calls executed by the thread
owning the masterlock reduces to...
enter_blocking_section();
leave_blocking_section();
which reduces to...
st_masterlock_release
st_masterlock_acquire
which reduces to..
If another thread is already waiting on the condition variable & mutex
pair, we might hope that it will be next to acquire the mutex, but the
semantics of posix threads offer no such guarantees.
And in practice is appears that more often than not, the same thread
which just unlocked the mutex immediately re-locks it. And so even in
examples with just two threads contending for the mytex, one thread
can be permanently starved.
File attachments
The text was updated successfully, but these errors were encountered: