English version
Accueil     À propos     Téléchargement     Ressources     Contactez-nous    

Ce site est rarement mis à jour. Pour les informations les plus récentes, rendez-vous sur le nouveau site OCaml à l'adresse ocaml.org.

Browse thread
[Caml-list] [ANN] The Missing Library
[ Home ] [ Index: by date | by threads ]
[ Search: ]

[ Message by date: previous | next ] [ Message in thread: previous | next ] [ Thread: previous | next ]
Date: 2004-05-01 (03:38)
From: skaller <skaller@u...>
Subject: Re: [Caml-list] [ANN] The Missing Library
On Sat, 2004-05-01 at 01:58, Brian Hurt wrote:

> Just for the record, you *can* do this just fine in Ocaml.  Ext-lib is 
> already doing this.  This isn't a limitation of the language, it's a 
> feature lack of the core library.

This isn't quite correct. Extlib enums are not the same
as iterators. However it is true that an important part
of the iterator concept is captured in Extlib.

basic C++ iterators have 3 functions: dereference, advance,
and compare-equal with strict invalidation conditions.

More advanced iterators support total order comparisons,
and allow copies to be held (as well as possibly reverse movement
and random access)

Extlib enums use a quite different concept. In particular
C++ iterators have very strict interpretation of validity
and copyability which Extlib does not specify.

Also enums do not support positional comparisons, or subranging
using a pair of iterator.

In addition C++ iterators support insertion, using an 
iterator to find a position in a container, and numerous
other properties all of which are directly related to their
origins conceptually as pointers into arrays.

> I don't think having lots of different types of iterators is all that 
> usefull.  Once you get beyond a simple linear walk through the data 
> structure, the nature of the data structure becomes important.

They're vital. The kinding of iterators as mentioned above
is fundamental. You may be right that, for example, bidirectional
or random iterators are less useful, but the distinctions
between forward, input, and output iterators are crucial,
and the lack of that distinction in Extlib effectively
breaks the library. I won't touch it until this is fixed.
Indeed, there is work in the C++ committee to FURTHER
classify iterators, particularly in respect of
mutability of the container and buffering of the
dereference operation.

Basically: a forward iterator is a position in a container
which lives in memory. Provided you don't modify the
container, iterators remain valid. Iteration does
not consume the container in any way. 

Input iterators can be copied but only one is valid
after advancing (the one you advanced). They are
used to operate on streams where advancing requires
a destructive operation.

In type theoretic terms, something like:
forward iterators work on inductive data types, 
input iterators on coinductive data types.

The difference is utterly fundamental.

Extlib tries to present a common interface for
input and forward enumerations. 

This is inconsistent. What needs changing may
well ONLY be documentation, I'm not sure.
Enums, like iterators, are intrinsically unsafe
to use. The conditions under which the 
interface will yield stated results must be
specified *precisely and pedantically*.

And it is NOT easy to do so. Witness STL.

To give an example: suppose you have two
stream handles and apply a double Extlib.Enum.iter2
on them. By specification, that causes a force

It seems clear, right? 

It isn't. What happens if its the SAME enumeration?

In C++ STL this is handled because there is
a STRICT interpretation of an algorithm like:

  while(p!=e) cout << *p++ << *q++;

Here ISO C rules make it clear that the next
two elements of the input sequence get
printed in an arbitrary order, then repeat.

Extlib iter makes no guarrantees at all.
It might well crash immediately because it
forces one of the handles completely first,
so the second one is already exhausted
because its the same iteration : this is
sure to be an issue if the enumeration
is a generator function.

Note in the C++ example it is trivial to
*enforce* an ordering:

  while(p!=e) { cout << *p++; cout  << *q++; }

Now we know the output of a single generator
will be strictly ordered.

Note that this assumes we have a handle!
If it isn't a handle kind of iterator,
then p++ immediately invalidates q,
and so q++ is undefined.

So specifying the semantics of the above
algorithm for an STL iterator is 
VERY HARD! It definitely requires a sophisticated
classification scheme for iterators.

There's no way around this: there is a compromise
between a complex set of iterator abstractions,
and simply accepting 'undefined behaviour' as
a specification for an algorithm.

Extlib can easily err on the side of 'undefined'.
But it must be pedantic about what IS defined
and what is not or there is no way to trust
it to integrate memory containers and streams.

Last I looked a symptom of the design fault
is found in the 'fast_count' function.
Whilst  that exists, the library is necessarily
flawed. Arguments about 'its useful' do not hold
any water: it has to be justified in terms
of a specified abstraction or thrown out.

At least part of the problem here is that
Ocaml is primarily a functional language,
and functional languages can't handle
stateful programming easily. Iterators are
intrinsically stateful: thats the whole point
of them.

John Skaller, mailto:skaller@users.sf.net
voice: 061-2-9660-0850, 
snail: PO BOX 401 Glebe NSW 2037 Australia
Checkout the Felix programming language http://felix.sf.net

To unsubscribe, mail caml-list-request@inria.fr Archives: http://caml.inria.fr
Bug reports: http://caml.inria.fr/bin/caml-bugs FAQ: http://caml.inria.fr/FAQ/
Beginner's list: http://groups.yahoo.com/group/ocaml_beginners