Version française
Home     About     Download     Resources     Contact us    
Browse thread
[Caml-list] Re: Common IO structure
[ Home ] [ Index: by date | by threads ]
[ Search: ]

[ Message by date: previous | next ] [ Message in thread: previous | next ] [ Thread: previous | next ]
Date: -- (:)
From: Vladimir N. Silyaev <vsilyaev@m...>
Subject: Re: [Caml-list] Re: Common IO structure
On Fri, May 07, 2004 at 11:11:17PM +0100, Benjamin Geer wrote:

> >This signature looks like a good starting point. However I would rather
> >separate this code to three different pieces
> 
> That seems fine to me.  I just wanted to give you a very rough idea of 
> what I had in mind; I was pretty sure you'd see a better way to design 
> it. :)
> 
> >Please note,  that write and read inherently separated in signatures,
> >it allows simpler interface, supports read/write only streams
> >and better feet common model, where read and writes are separated.
> >However, module couldn't implement both read and write signatures, if
> >it's required.
> 
> The main thing I wanted to point out was that there needs to be a way to 
> read data into a buffer from a non-blocking socket into a buffer, and 
> then write the data from the *same buffer* into another non-blocking 
> socket.  Then compact the buffer (move any unwritten data to the 
> beginning of the buffer) and start again, like in this loop:
> 
> let copy_fd in_fd out_fd =
>   let b = AsyncBuffer.create () in
>     try
>       while (true) do
>         AsyncBuffer.from_fd b in_fd;
>         AsyncBuffer.flip b;
>         AsyncBuffer.to_fd b out_fd;
>         AsyncBuffer.compact b
>       done
>     with End_of_file -> ()
> 
> Can that still be done if the read and write signatures are separated?
Surely, just a little twist:
module Block = struct
  module type NbRead = sig
    type t
    val nb_read:  t -> string -> int -> int -> int
  end
  module type NbWrite = sig
    type t
    val nb_write:  t -> string -> int -> int -> int
  end
end

Note, that both read or write are using supplied buffers, so zero
copy is possible.
module NbCopy (Src:Block.NbRead) (Dst:Block.NbWrite) = struct
  let run s d = 
   let blen = 4096 in
   let buf = String.create blen in
   let rec copy off = 
     let  off  = if off = blen then 0 else off in
     let  rlen = blen - off in
     let  n = Src.nb_read s buf off rlen in
     let  n = Dst.nb_write d buf off n in
     copy (off+n)
   in
   try copy 0 with
   End_of_file -> ()
end

> 
> The other thing that's important is that character encoder/decoders 
> would need to be able to read characters from one buffer and write them 
> to another buffer in a different encoding.  An encoder/decoder would 
> need to gracefully handle the case where it reads from a buffer 
> containing incomplete characters.  That's another reason for the 
> 'compact' function: you could read 10 bytes from a socket into a buffer, 
> and those 10 bytes could contain 9 bytes worth of complete UTF-8 
> characters; the 10th byte would be the first byte of a multi-byte 
> character.  You'd pass the buffer to an encoder/decoder, which would 
> read 9 bytes and write them into another buffer in a different encoding 
> (say UTF-16), leaving the last byte.  You would then call 'compact' to 
> move that byte to the beginning of the buffer, and repeat.
> 
> Is there a way to fit this approach into what you've proposed for 
> encoder/decoders?
That's more complicated. Basically what was initially proposed is for
blocking IO, where read (get) guaranteed to return result or fail,
and fail is terminal, filter is not required to support restart procedure.
This works for any type of blocking IO, however it fails for non blocking
IO where restart is common technique. Problem with restart, that it places
unbounded restrictions of filter, it should be able to handle incomplete
inputs and support backtracking.

However non blocking IO usually used as an alternative for threads, in which
case it might be beneficial just change control type and add support 
for non blocking IO into the signature by using explicit continuation 
passing. There is an example for get and put signature with CPS:
 val get: t -> (symbol -> unit) -> unit
 val put: t -> (unit -> unit) -> symbol -> unit

Advantage of using CPS style, is that state of the "filter" captured
by the compiler in a closure at the time of function application. Disadvantage
of CPS style is rather unusual look of code and cost of closure 
construction. Later could be significantly reduced by short circuiting
filters (encoder/decoder) by providing filter with has block IO as input
and stream of symbols as output. 

---
Vladimir

-------------------
To unsubscribe, mail caml-list-request@inria.fr Archives: http://caml.inria.fr
Bug reports: http://caml.inria.fr/bin/caml-bugs FAQ: http://caml.inria.fr/FAQ/
Beginner's list: http://groups.yahoo.com/group/ocaml_beginners