[Caml-list] [ANN] The Missing Library
Date: 2004-04-29 (17:31)
From: james woodyatt <jhw@w...>
Subject: Re: [Caml-list] Re: Common IO structure
On 29 Apr 2004, at 08:31, Yamagata Yoriyuki wrote:
> Encoding could be stateful, so there would be no single representation
> of EOL. (*)  OK, this is a very unlikely case currently, but I think
> there is an interesting encoding for Unicode which is fully stateful.
> So, readlines() needs to be fully aware of the encoding.

This transcoding I/O channel under discussion is required to contain 
internal state for other reasons.  With non-blocking I/O, an underlying 
transport may present only those octets that are ready for reading, 
which may leave a codepoint incomplete at the end of the currently 
received octets.  Even without non-blocking I/O, a read can be 
interrupted by a system signal event and still return less than the 
number of octets requested.  It is not sufficient to defer signal 
processing until after the read completes: sometimes (but not always), 
a signal explicitly means to abort reading immediately.
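
To make the incomplete-codepoint case concrete, here is a minimal sketch in OCaml. It assumes UTF-8 purely for illustration, and the names `seq_len`, `state`, `create` and `feed` are hypothetical, not taken from any existing library. It only finds codepoint boundaries; it does not validate continuation octets or transcode anything.

```ocaml
(* Hypothetical sketch: names and structure are illustrative only. *)

(* Total length of a UTF-8 sequence, judged from its lead octet.
   0 marks an octet that cannot start a sequence. *)
let seq_len c =
  let b = Char.code c in
  if b < 0x80 then 1
  else if b land 0xE0 = 0xC0 then 2
  else if b land 0xF0 = 0xE0 then 3
  else if b land 0xF8 = 0xF0 then 4
  else 0

(* The channel state: octets of a codepoint that the previous read
   left incomplete. *)
type state = { mutable pending : string }

let create () = { pending = "" }

(* [feed st chunk] returns the longest prefix of [st.pending ^ chunk]
   that ends on a codepoint boundary and keeps the remainder in [st]
   for the next call. *)
let feed st chunk =
  let buf = st.pending ^ chunk in
  let n = String.length buf in
  let rec scan i =
    if i >= n then i
    else
      (* Octets that cannot start a sequence are passed through singly. *)
      let len = max 1 (seq_len buf.[i]) in
      if i + len > n then i else scan (i + len)
  in
  let cut = scan 0 in
  st.pending <- String.sub buf cut (n - cut);
  String.sub buf 0 cut
```

If a short read delivers "a" followed by only the first octet of a two-octet character, `feed` hands back "a" and holds the stray octet until the next read supplies the rest. A short read caused by a signal looks the same to this layer as one caused by non-blocking I/O.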

> My proposal is mainly for sharing common channel types among
> libraries, so that a user can pass a channel from one library to
> another without writing glue code.  Since parsing line endings, or
> loading the whole file into a string, mainly occurs at the endpoint
> of IO, I do not think standardizing them is necessary for this
> purpose.
> I do not think standardizing the endpoint API is important, because I
> think that in the end, we will use only one library as the endpoint of
> IO.

Most of us.  Some of us have other concerns that I don't see anyone 
else trying to address.  At some point, probably soon, I will be 
writing a wrapper around OpenSSL.  I need non-blocking I/O.  I need to 
parse XML documents of unbounded length, which means using a SAX-like 
parser (I have that now).  I need to be able to parse an arbitrary 
number of XML documents simultaneously, in potentially any of the 
legal Unicode transfer encodings.  And I need to be responsive to 
events in near real-time.

I have the "control inversion" nightmare from hell.  That's why I have 
forced myself to learn functional programming techniques.

An I/O library that I can use is simply not going to be something that 
can satisfy Richard's requirement that he be able to slurp a whole file 
into an application data structure with a single line of code.  So I'm 
writing my own.  Richard will be appalled by how it works.

So I'm watching this discussion with a certain bemused detachment: I 
wonder what new and improved API will be coming from this that I will 
still find inadequate for my tasks.

> (*) IIRC, the RFC specifies that the endianness of UTF-16 is swapped
> in the middle of the stream when a "BOM" 0xfffe appears.

This is quite true.  Happens all the time, too.

j h woodyatt
markets are only free to the people who own them.