Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Caml thread wont yield #5373

Closed
vicuna opened this issue Oct 10, 2011 · 8 comments
Closed

Caml thread wont yield #5373

vicuna opened this issue Oct 10, 2011 · 8 comments

Comments

@vicuna
Copy link

vicuna commented Oct 10, 2011

Original bug ID: 5373
Reporter: NickChapman
Status: acknowledged (set by @xavierleroy on 2011-10-16T16:55:09Z)
Resolution: open
Priority: normal
Severity: feature
Version: 3.12.1
Category: otherlibs
Tags: patch
Monitored by: mehdi @ygrek sweeks

Bug description

I have an example of an ocaml program using threads, which shows one
thread failing to yield & causing the execution of a second thread to
be completely blocked. This happens despite the non-yielding thread
continuously allocating.

example.ml

let pr s = Printf.eprintf "%.3f : %s\n%!" (Unix.gettimeofday()) s

let nanosleep f = ignore (Unix.select [] [] [] f)

let rec beater_loop () =
  pr "beat";
  nanosleep 0.1;
  beater_loop()

let rec spin_loop p i =
  let p = (let (a,b) = p in (b,a)) in (* allocate every iteration *)
  if i = 0
  then ()
  else spin_loop p (i-1)

let main () =
  let _t = Thread.create beater_loop () in
  nanosleep 1.0;
  pr "spin a while";
  let p = ("foo","bar") in
  spin_loop p 2_000_000_000; (* approx 5 secs *)
  pr "spin a while..done";
  nanosleep 1.0;
  exit 0

let () = main()

build using:
ocamlopt -thread -c example.ml
ocamlopt.opt -thread unix.cmxa threads.cmxa example.cmx -o example.exe

The behaviour when running this program is not deterministic, but more
often that not it has output something like:

1317804008.719 : beat
1317804008.818 : beat
1317804008.918 : beat
1317804009.019 : beat
1317804009.119 : beat
1317804009.219 : beat
1317804009.319 : beat
1317804009.419 : beat
1317804009.519 : beat
1317804009.619 : beat
1317804009.718 : spin a while
1317804014.271 : spin a while..done
1317804014.271 : beat
1317804014.372 : beat
1317804014.473 : beat
1317804014.574 : beat
1317804014.675 : beat
1317804014.776 : beat
1317804014.877 : beat
1317804014.978 : beat
1317804015.079 : beat
1317804015.180 : beat

The point to look for is that there are no "beat" messages for the 5
seconds between "spin a while" & "...done". (Note: sometimes there are
beats while spinning, and sometimes they start and stop - but in most
runs there are none.)

It seems the spinning thread is correctly, periodically releasing &
re-aquiring the lock, every 50ms when the c-ticker thread makes
allocation fail the heap limit check, but that this in not enough to
make the thread yield. It seems that previously sched_yield() was
used, but the call is currently elided.

ocaml/otherlibs/systhreads/st_stubs.c

CAMLprim value caml_thread_yield(value unit)        /* ML */
{
  if (st_masterlock_waiters(&caml_master_lock) == 0) return Val_unit;
  enter_blocking_section();
  st_thread_yield();
  leave_blocking_section();
  return Val_unit;
}

ocaml/otherlibs/systhreads/st_posix.h

static void INLINE st_thread_yield(void)
{
#ifndef __linux__
  /* sched_yield() doesn't do what we want in Linux 2.6 and up (#2663) */
  sched_yield();
#endif
}

Without sched_yield, the sequence of calls executed by the thread
owning the masterlock reduces to...

enter_blocking_section();
leave_blocking_section();

which reduces to...

st_masterlock_release
st_masterlock_acquire

which reduces to..

  pthread_mutex_unlock(&m->lock);
  pthread_cond_signal(&m->is_free);

  pthread_mutex_lock(&m->lock);

If another thread is already waiting on the condition variable & mutex
pair, we might hope that it will be next to acquire the mutex, but the
semantics of posix threads offer no such guarantees.

And in practice is appears that more often than not, the same thread
which just unlocked the mutex immediately re-locks it. And so even in
examples with just two threads contending for the mytex, one thread
can be permanently starved.

File attachments

@vicuna
Copy link
Author

vicuna commented Oct 11, 2011

Comment author: @lefessan

I attached a new "st_posix.h" that I think fixes this problem. It introduces a queue of active threads, waiting to take the master lock. When a thread releases the lock, all threads are woken up, and only the first one on the queue can take it. So, scheduling between these threads should be mostly fair.

It is implemented using a cons put on the thread local stack, so it has no limit on the number of active threads, and the queue has two pointers, one to the head, one to the tail, so that insertion and deletions are in one step. It would probably not work if it were possible to kill threads, but killing threads is not possible with OCaml systhreads, so it is ok.

@vicuna
Copy link
Author

vicuna commented Oct 16, 2011

Comment author: @xavierleroy

Fairness in thread scheduling is a delicate issue: whole books have been written about it. Linux's (2.6 and up) scheduler is willing to sacrifice fairness entirely to wring a little more throughput out of the system (this was confirmed to me by one of the authors of said scheduler). In particular, sched_yield(), which used to implement a pretty good approximation to round-robin, was deliberately sabotaged in 2.6 (it makes the yielding thread give up its time quota entirely), rendering it unusable with Caml's system threads. (OpenOffice was similarly affected.)

In the end, I don't think we care enough about fairness in Caml to try to restore it on top of an unfair OS scheduler. "Real" threaded applications (those doing I/O and inter-thread synchronization) work just fine without fairness. I reclassify this PR as a feature wish, but doubt it will be granted.

Concerning Fabrice Le Fessant's patch: restarting all threads at every yield in order for just one to run and all other to go back to sleep is a complete no-no. Try it with 500 threads and see...

@vicuna
Copy link
Author

vicuna commented Oct 17, 2011

Comment author: @lefessan

500 threads ? I didn't know there were such OCaml applications. Anyway, above a few tens of threads, using threads for I/O is a complete waste of resources, and the runtime should give incentives to avoid that (by using an expensive fair scheduler ;-) ).

Anyway, it is still possible to have one condition per thread (allocated once on thread creation), and to signal only the condition of the first thread on the queue when the lock is released. Then, any thread entering the queue would also make sure it signals that first thread too to avoid stalling the scheduler. I will experiment some code around this.

@vicuna
Copy link
Author

vicuna commented Oct 17, 2011

Comment author: @ygrek

Just for the record, I have an application with circa 1000 threads running just now :) Most of the threads - when active - are executing C++ code but are launched from ocaml. Not that it couldn't have used 10 times less threads, but it was most easy to code this way :)

@vicuna
Copy link
Author

vicuna commented Oct 18, 2011

Comment author: @lefessan

I uploaded a new patch: this one uses one condition per waiting thread, so only the first one on top of the queue is signaled. Conditions are allocated when a thread needs to wait, if none is available. Otherwise, they are reused. Every thread that terminates tries to free one condition, so the number of allocated conditions is bound by the number of threads, never more, probably always much fewer.

@vicuna
Copy link
Author

vicuna commented Oct 9, 2013

Comment author: @damiendoligez

So how are you going to document this feature? By saying that scheduling is guaranteed to be fair?

@vicuna
Copy link
Author

vicuna commented Oct 9, 2013

Comment author: @lefessan

If I remember well, the patch ensures that scheduling is fair, only among OCaml threads.

@mshinwell
Copy link
Contributor

#2112 should have dealt with this.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

2 participants