[Caml-list] [ANN] The Missing Library
Date: -- (:)
From: Benjamin Geer <ben@s...>
Subject: Re: [Caml-list] Re: Common IO structure
John Goerzen wrote:
> I'm looking at java.io right now.  I count no less than 10 interfaces
> and 50 classes.  Let's say that I want to open up a file for read/write
> access and be able to seek around in it.  Looking at the class list, I
> don't know if I want BufferedInputStream, BufferedOutputStream,
> BufferedReader, BufferedWriter, CharArrayReader, CharArrayWriter,
> DataInputStream, DataOutputStream, File, FileDescriptor,
> FileInputStream, FileOutputStream, FileReader, FileWriter, InputStream,
> InputStreamReader, OutputStream, OutputStreamWriter, RandomAccessFile,
> Reader, or Writer.  Really, I literally *do not know how to open a
> simple file*.  I would not call that intuitive.

You actually have to *read* the documentation, not just glance at the 
class names. :)  That's to be expected with a powerful API.  Once you 
understand the key concepts governing the design of the API, it makes 
sense, and it becomes intuitive to select the classes you need.  I tried 
to point out these concepts in the message you replied to.

To read a file containing UTF-8 text, one line at a time:

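// Each wrapper adds one capability: bytes from the file, then decoding
// to characters, then buffering and line splitting.
try (BufferedReader in =
     new BufferedReader
     (new InputStreamReader
      (new FileInputStream(filename), "UTF-8")))
{
     String line;

     // readLine() returns null at end of input.
     while ((line = in.readLine()) != null)
     {
         System.out.println(line);
     }
}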

This illustrates the main design concept I was talking about. 
InputStream is an abstract class; different implementations either know 
how to get input from a particular source (like FileInputStream), or are 
meant to be used as wrappers around another InputStream to add 
functionality (like buffering).  All the classes whose names end in 
'Stream' deal with bytes only; the ones whose names end in 'Reader' or 
'Writer' deal with characters.  See?  It's easy once you know the pattern.
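
The byte/character split can be seen in a small, self-contained sketch. The class and method names here (StreamVersusReader, countBytes, countChars) are made up for illustration, and an in-memory ByteArrayInputStream stands in for a file so the example runs anywhere:

```java
import java.io.BufferedReader;
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.Reader;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class, not part of java.io.
class StreamVersusReader {
    // A *Stream class hands back raw bytes, one per read() call.
    static int countBytes(byte[] data) {
        try (InputStream in = new ByteArrayInputStream(data)) {
            int n = 0;
            while (in.read() != -1) {
                n++;
            }
            return n;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // A *Reader class hands back decoded characters; the wrappers add
    // decoding (InputStreamReader) and buffering (BufferedReader).
    static int countChars(byte[] data) {
        try (Reader in = new BufferedReader(
                new InputStreamReader(
                    new ByteArrayInputStream(data), StandardCharsets.UTF_8))) {
            int n = 0;
            while (in.read() != -1) {
                n++;
            }
            return n;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // "h\u00e9llo" is five characters, but U+00E9 takes two bytes in UTF-8.
        byte[] data = "h\u00e9llo".getBytes(StandardCharsets.UTF_8);
        System.out.println(countBytes(data) + " bytes, "
                           + countChars(data) + " chars");
    }
}
```

The same bytes give different counts depending on whether they are read through a Stream or through a Reader, which is why the two families of classes are kept apart.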

To open a file for read/write access and be able to seek around in it:

RandomAccessFile file = new RandomAccessFile(filename, "rw");

The methods in RandomAccessFile are pretty self-explanatory.
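
For instance, here is a minimal sketch of seeking around in a file opened "rw". The class name RandomAccessSketch, its methods, and the temporary file are assumptions made for the example; only the RandomAccessFile calls themselves (writeInt, readInt, seek, setLength) are the real API:

```java
import java.io.File;
import java.io.IOException;
import java.io.RandomAccessFile;
import java.io.UncheckedIOException;

// Hypothetical demo class, not part of java.io.
class RandomAccessSketch {
    // Writes two ints, seeks back to overwrite the first one,
    // then seeks back again and reads both.
    static int[] writeSeekRead(File path) throws IOException {
        try (RandomAccessFile file = new RandomAccessFile(path, "rw")) {
            file.setLength(0);   // start from an empty file
            file.writeInt(1);    // occupies bytes 0..3
            file.writeInt(2);    // occupies bytes 4..7
            file.seek(0);        // jump back to the start
            file.writeInt(42);   // overwrite the first int in place
            file.seek(0);
            return new int[] { file.readInt(), file.readInt() };
        }
    }

    // Runs the sketch against a throwaway temporary file.
    static int[] demo() {
        try {
            File tmp = File.createTempFile("raf-sketch", ".bin");
            try {
                return writeSeekRead(tmp);
            } finally {
                tmp.delete();
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        int[] values = demo();
        System.out.println(values[0] + " " + values[1]);  // prints "42 2"
    }
}
```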

Ben

>>OK, but then you can leave out readline(), readlines() and xreadlines(), 
>>because they don't make any sense unless you've already dealt with 
>>character encodings.
> 
> No, they can simply be implemented in terms of read().

A line is a chunk of text, not of bytes.  I don't think it makes sense 
to deal with text unless you know what encoding it's in.
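
A small sketch of why, in Java (the class LinesNeedEncoding and its methods are invented for this example): in UTF-16BE the character U+010A is encoded as the bytes 0x01 0x0A, so a byte-level splitter that looks for 0x0A cuts that character in half.

```java
import java.io.BufferedReader;
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.UncheckedIOException;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class, not part of any library.
class LinesNeedEncoding {
    // Naive approach: treat every 0x0A byte as a line break.
    static int splitOnNewlineByte(byte[] data) {
        int chunks = 1;
        for (byte b : data) {
            if (b == 0x0A) {
                chunks++;
            }
        }
        return chunks;
    }

    // Decode first, then let BufferedReader find the line breaks.
    static int countDecodedLines(byte[] data, Charset encoding) {
        try (BufferedReader in = new BufferedReader(
                new InputStreamReader(
                    new ByteArrayInputStream(data), encoding))) {
            int lines = 0;
            while (in.readLine() != null) {
                lines++;
            }
            return lines;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // Two lines of text: "\u010A" and "x".
        // UTF-16BE bytes: 01 0A | 00 0A | 00 78
        byte[] data = "\u010A\nx".getBytes(StandardCharsets.UTF_16BE);
        System.out.println(splitOnNewlineByte(data) + " byte chunks, "
            + countDecodedLines(data, StandardCharsets.UTF_16BE) + " lines");
    }
}
```

The byte-level split reports three chunks for two lines of text, and the first chunk is not even a whole character.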

>>Then, before you can divide text into lines, you also need to know which 
>>newline character(s) to use.  This needs to be configurable 
>>programmatically
> 
> That's pretty easy as a class variable

I was only pointing out that neither Python (as far as I can tell) nor 
Java does this.
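
As a rough sketch of what a programmatically chosen newline could look like, here is one way to do it with java.util.Scanner, a class not discussed in this thread. The class ConfigurableNewline and the method splitLines are hypothetical names; BufferedReader.readLine() itself has no such setting.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Scanner;
import java.util.regex.Pattern;

// Hypothetical demo class, not part of any library.
class ConfigurableNewline {
    // Splits text into lines using exactly the newline sequence given,
    // rather than a fixed built-in set of terminators.
    static List<String> splitLines(String text, String newline) {
        List<String> lines = new ArrayList<>();
        try (Scanner scanner = new Scanner(text)) {
            // Pattern.quote makes the separator literal, not a regex.
            scanner.useDelimiter(Pattern.quote(newline));
            while (scanner.hasNext()) {
                lines.add(scanner.next());
            }
        }
        return lines;
    }

    public static void main(String[] args) {
        // With "\r\n" as the newline, the bare "\n" stays inside a line.
        List<String> lines = splitLines("a\nb\r\nc", "\r\n");
        System.out.println(lines.size());  // prints 2
    }
}
```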

Ben
