English version
Accueil     À propos     Téléchargement     Ressources     Contactez-nous    

Ce site est rarement mis à jour. Pour les informations les plus récentes, rendez-vous sur le nouveau site OCaml à l'adresse ocaml.org.

Browse thread
Help interfacing with C
[ Home ] [ Index: by date | by threads ]
[ Search: ]

[ Message by date: previous | next ] [ Message in thread: previous | next ] [ Thread: previous | next ]
Date: 2006-08-19 (00:44)
From: Nathaniel Gray <n8gray@g...>
Subject: Re: [Caml-list] Re: Help interfacing with C
On 8/18/06, Bardur Arantsson <spam@scientician.net> wrote:
> Nathaniel Gray wrote:
> > On 8/16/06, Bardur Arantsson <spam@scientician.net> wrote:
> >> Nathaniel Gray wrote:
> >> > Hi folks,
> >> [--snip--]
> >
> > Thanks, but this doesn't really help.  Perhaps I should say what I'm
> > trying to accomplish instead of trying to let the types speak for me.
> > The point is that when you select on a list of file descriptors
> > there's a very good chance that you don't *only* care that file
> > descriptor f is ready, you also have some ocaml data structure
> > associated with f that you want to use with it in some way.  For
> > example, you probably have a string to write or a buffer to read into.
> >
> > The current API is sort of a minimal translation of the select system
> > call to OCaml, which is a good starting point, but not great IMHO.  In
> > order to match up each file descriptor to its data you have to scan
> > the output fd list yet again, after it's already been done once at the
> > C level, and since you don't know how/if it's sorted you end up with
> > unsatisfying time complexity for the whole thing.  It's not critical,
> > since the lists are small, but it hardly brings a smile to your face.
> >
> > On the other hand, imagine a select2 that takes lists of (fd, 'a)
> > pairs and returns the same.  The results become incredibly easy to
> > handle, with no scanning necessary.  You probably just List.iter a
> > simple read/write function over the result and you're done.
> >
> Have you tried to benchmark the naive implementation(*) versus a "bare"
> Unix.select? I shouldn't think you a little bit extra of scanning the
> list which is O(n) in the number of _active_ file descriptors matters
> with all the O(MAX_FD) stuff that's going on in select(). I would expect
> most of the time to be spent doing the actual system call.
> (*) That is, an implementation which puts (fd, associated value) in a
> hashtable indexed by the file_descr and does the scanning after doing
> the Unix.select.

Let me be clear about this -- it's not really about performance.  I'm
looking at the entire sequence of operations and data structures
needed to make a select call.  Here's a typical example with the
current API:

(* You have a data structure with some file descriptors and auxilliary data *)
let foo = [(fd1, "hello"); (fd2, "world"); (fd3, "baz")] in
(* Now you need a list of file descriptors *)
let fds = List.map fst foo in
(* Feed them to select *)
let _, outs, _ = Unix.select [] fds [] (-1.0) in
(* Now do something with each active descriptor *)
List.iter (fun fd ->
   let data = List.assoc fd foo in
   do_something data) outs

Here's how it would work with my proposed select2:

(* You have a data structure with some file descriptors and auxilliary data *)
let foo = [(fd1, "hello"); (fd2, "world"); (fd3, "baz")] in
(* Feed it to select2 *)
let _, outs, _ = Unix.select2 [] foo [] (-1.0) in
(* Now do something with the data *)
List.iter (fun (fd, data) -> do_something data) outs

The whole thing is much nicer.  You don't need a separate list of file
descriptors that exists only to be fed to select.  You don't need to
do any work afterwards to find the data associated with a file
descriptor.  The fact that it's also faster (assuming it truly *is*
faster) is just the icing on the cake.  Now I'll grant that I've
chosen a convenient data representation for my purposes, but it's not
like I've cheated badly -- a list of (fd, data) or (fd, closure) pairs
would be a sensible thing to use in this kind of code.

Now of course this could all be done at the ocaml level, but wouldn't
it be nicer to do the elegant thing in the first place instead of
cleaning up the mess afterwards?  ;^)


>>>-- Nathaniel Gray -- Caltech Computer Science ------>
>>>-- Mojave Project -- http://mojave.cs.caltech.edu -->