[Caml-list] ocaml-3.05: a performance experience
Date: -- (:)
From: Mike Lin <mikelin@M...>
Subject: Re: [Caml-list] ocaml-3.05: a performance experience
> Blocking IO using OS resources is very slow -- O(log N) is good.
> User space dispatching is typically very fast: O(1) [using a hashtable]
> 
> So if you're parsing XML being read over the internet,
> the push technology is superior -- if you can bypass
> the high level socket interface (which is non-trivial :-(

I'm still not sure I understand this. Pull parsers can be structured so
that they don't require blocking I/O. That was the point of all the CPS
stuff in Yaxpo, and what you call "control inversion" (which I'm pretty
sure is also CPS) in Felix.

The controlling system can control the I/O however it wants, but the
interface to the parser still looks like pull.
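
As a rough illustration of what I mean, here is a minimal OCaml sketch
of a "pull"-style reader written in CPS. It is not Yaxpo's or Felix's
actual code; the names step, Need_input, next_char,
read_until_semicolon and feed are all made up for this example. The
reader is written as if it pulled one character at a time, but when the
buffer runs dry it returns a continuation instead of blocking.

```ocaml
(* Hypothetical sketch: none of these names come from Yaxpo or Felix. *)

(* Result of running the parser on the input available so far. *)
type 'a step =
  | Done of 'a                          (* parsing finished *)
  | Need_input of (string -> 'a step)   (* suspended: resume with next chunk *)

(* Pull one character. If the current chunk is exhausted, suspend
   instead of blocking; the caller resumes us with a new chunk. *)
let rec next_char (buf : string) (pos : int)
    (k : char -> string -> int -> 'a step) : 'a step =
  if pos < String.length buf then k buf.[pos] buf (pos + 1)
  else Need_input (fun chunk -> next_char chunk 0 k)

(* Read characters up to ';' and return them as a string. The body
   reads like an ordinary pull loop over characters. *)
let read_until_semicolon () : string step =
  let acc = Buffer.create 16 in
  let rec loop buf pos =
    next_char buf pos (fun c buf pos ->
      if c = ';' then Done (Buffer.contents acc)
      else begin
        Buffer.add_char acc c;
        loop buf pos
      end)
  in
  loop "" 0

(* The controlling system decides when input arrives. Here chunks come
   from a list, standing in for data a dispatcher would hand over after
   'select' reports a readable socket. *)
let rec feed (st : 'a step) (chunks : string list) : 'a option =
  match st, chunks with
  | Done v, _ -> Some v
  | Need_input _, [] -> None            (* still waiting for more data *)
  | Need_input k, c :: rest -> feed (k c) rest

let () =
  match feed (read_until_semicolon ()) ["hel"; "lo;"] with
  | Some s -> print_endline s
  | None -> print_endline "need more input"
```

The parser itself never touches a socket, so whoever holds the
Need_input continuation is free to schedule the I/O however it likes.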

Do you consider 'select' and 'read' part of the "high level socket
interface" that should be bypassed? (I just want to clarify because
OCaml has even higher level interfaces)

> 	Absolutely! That is why Felix exists. To allow one to
> write 'pull' code, which does blocking reads of data, but have the
> code translated to 'push' code, where a thread is resumed by a
> dispatcher when the data is available.
> 
> 	For some applications like telephony, where the number
> of connections is rather large, event driven code is the ONLY option.
> Unix OS are typically incapable of handling millions of threads, a user
> space dispatcher has no trouble at all, even on a small Linux PC.
> The reason is: client level code 'knows' the encoding of messages
> and can sort out where to send it much faster than any generic
> OS facilities can: OS schedulers are designed to run arbitrary
> mixtures of programs, not millions of threads of the same application.

It would be wise not to eschew the OS scheduler entirely. There are a
number of limitations to this type of purely event-driven model under
heavy load. In particular, it assumes that a "thread" never blocks, but
in fact it may block in many cases that are not under the control of
your scheduler - page faults, garbage collection, and so forth. In such
cases you can squeeze some extra processing time out of the system by
having several OS threads.

If you haven't already, you should have a look at Matt Welsh's work on
SEDA (http://www.cs.berkeley.edu/~mdw/proj/sandstorm/). I'm not sure
whether Felix already supports this, but if it doesn't, it would be very
interesting to have a way to build custom event queue managers into its
scheduler.

-Mike
