Efficient I/O with threads
Date: -- (:)
From: Yaron Minsky <yminsky@g...>
Subject: Re: Efficient I/O with threads
An addendum.  One thing that was pointed out to me in some private
emails was that buffering could solve the problem on the reading side
as well.  That is true, as far as it goes; that's why I said that
I can't think of a _clean_ way of handling it.  One of the nice things
about ocaml IO channels is that they handle buffering, and it seems a
shame to have to reimplement buffering on top of them.
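
For concreteness, here is a minimal sketch of what reimplementing
buffering on the reading side could look like, for the wire format
described in the quoted message below.  Several details are assumptions
that the original description does not pin down: the tag bytes (0 for an
atom, 1 for a list), big-endian byte order for the 4-byte ints, and a
plain 4-byte count for lists.  The names reader, make_reader, of_channel
and read_sexp are hypothetical.  Only the refill touches the channel, so
the channel lock is taken once per chunk rather than once per field.

```ocaml
type sexp = Atom of string | List of sexp list

(* A private read buffer.  [refill] behaves like [input ic]: it fills
   part of a byte buffer and returns the number of bytes read, 0 at EOF. *)
type reader = {
  refill : Bytes.t -> int -> int -> int;
  buf : Bytes.t;
  mutable pos : int;  (* next unread byte in [buf] *)
  mutable len : int;  (* number of valid bytes in [buf] *)
}

let make_reader ?(size = 65536) refill =
  { refill; buf = Bytes.create size; pos = 0; len = 0 }

(* Each refill is a single call to [input], hence one lock round trip. *)
let of_channel ic = make_reader (input ic)

let fill r =
  r.pos <- 0;
  r.len <- r.refill r.buf 0 (Bytes.length r.buf);
  if r.len = 0 then raise End_of_file

let read_byte r =
  if r.pos >= r.len then fill r;
  let c = Bytes.get r.buf r.pos in
  r.pos <- r.pos + 1;
  Char.code c

(* 4-byte big-endian int; the byte order is an assumption. *)
let read_int32_be r =
  let b0 = read_byte r in
  let b1 = read_byte r in
  let b2 = read_byte r in
  let b3 = read_byte r in
  (b0 lsl 24) lor (b1 lsl 16) lor (b2 lsl 8) lor b3

(* Copy [n] bytes out of the buffer, refilling as often as needed. *)
let read_string r n =
  let s = Bytes.create n in
  let rec go off =
    if off < n then begin
      if r.pos >= r.len then fill r;
      let k = min (n - off) (r.len - r.pos) in
      Bytes.blit r.buf r.pos s off k;
      r.pos <- r.pos + k;
      go (off + k)
    end
  in
  go 0;
  Bytes.to_string s

let rec read_sexp r =
  match read_byte r with
  | 0 -> Atom (read_string r (read_int32_be r))  (* assumed atom tag *)
  | 1 ->                                         (* assumed list tag *)
    let n = read_int32_be r in
    let rec loop i acc =
      if i = n then List.rev acc else loop (i + 1) (read_sexp r :: acc)
    in
    List (loop 0 [])
  | t -> failwith (Printf.sprintf "bad tag byte %d" t)
```

Two caveats with this approach: the reader may pull in bytes that belong
to the next message, so every read from that channel has to go through
the same reader, and the reader itself is not protected by any lock, so
it should be owned by a single thread.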

Put another way, the problem with input/output channels appears to be
that the buffering is done on the wrong side of the lock.  You
shouldn't have to do any locking to do IO when the request can be
satisfied from the buffer.  The fact that IO channels always require
you to acquire the lock means that the performance is crappy unless
you bundle up writes by yourself.
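
Bundling up writes by hand might look roughly like the following sketch:
serialize the whole s-expression into a Buffer, then hand it to the
channel in one call, so the lock is taken once per message.  The tag
byte values, the big-endian byte order, and the 4-byte list count are
assumptions, and add_sexp, sexp_to_string and output_sexp are
hypothetical names.

```ocaml
type sexp = Atom of string | List of sexp list

(* Assumed tag bytes; the original message does not give their values. *)
let tag_atom = '\000'
let tag_list = '\001'

(* Append a 4-byte big-endian int (the byte order is an assumption). *)
let add_int32_be buf n =
  Buffer.add_char buf (Char.chr ((n lsr 24) land 0xff));
  Buffer.add_char buf (Char.chr ((n lsr 16) land 0xff));
  Buffer.add_char buf (Char.chr ((n lsr 8) land 0xff));
  Buffer.add_char buf (Char.chr (n land 0xff))

(* Serialize into an in-memory buffer; no channel is touched here. *)
let rec add_sexp buf = function
  | Atom s ->
    Buffer.add_char buf tag_atom;
    add_int32_be buf (String.length s);
    Buffer.add_string buf s
  | List l ->
    Buffer.add_char buf tag_list;
    add_int32_be buf (List.length l);
    List.iter (add_sexp buf) l

let sexp_to_string sexp =
  let buf = Buffer.create 256 in
  add_sexp buf sexp;
  Buffer.contents buf

(* One channel operation per message instead of one per field. *)
let output_sexp oc sexp =
  output_string oc (sexp_to_string sexp)
```

The cost is an extra copy of each message in memory before it reaches
the channel's own buffer.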

Fixing this is perhaps too deep of a change to drive into the OCaml
system at this point.  Is this a problem that is addressed by the I/O
channels provided by any other library such as extlib?

Yaron

On 5/24/05, Yaron Minsky <yminsky@gmail.com> wrote:
> We've been running into some interesting problems building highly
> efficient I/O routines in threaded code in ocaml, and I'm curious if
> anyone else has some thoughts on this.  The basic problem seems to be
> that locking and unlocking the IO channels takes a large fraction of
> the execution time.
> 
> A little bit of background first.  The data type we're outputting is
> basically a simple s-expression, with the following type:
> 
> type sexp = Atom of string | List of sexp list
> 
> We write out an s-expression by writing a tag-byte to indicate
> whether the s-expression is an atom or a list.  If the s-expression
> is an atom, we then write a 4-byte int, which is the length of the
> string, and then the string.  If the s-expression is a list, we write
> an atom which is the number of s-expressions that are contained, and
> then write those s-expressions.
> 
> It's very easy to write parsing and marshalling for this type of wire
> protocol, but that code turns out to be quite inefficient, because you
> end up making too many calls to the input and output functions, and
> each one of those calls requires releasing and acquiring locks.  I
> just can't think of a clean way of implementing a reader for this kind
> of protocol.  (A writer could be done by writing stuff to a buffer
> first, and then writing the whole buffer out to the socket at once.)
> 
> Any thoughts?
> Yaron
>