Re: Can someone explain?

From: skaller (skaller@maxtal.com.au)
Date: Tue Oct 05 1999 - 00:57:20 MET DST


Date: Tue, 05 Oct 1999 08:57:20 +1000
From: skaller <skaller@maxtal.com.au>
To: Pierre Weis <Pierre.Weis@inria.fr>
Subject: Re: Can someone explain?

Pierre Weis wrote:
>
> > Is there a way to access a value of a class instance?
> > If so, what is the syntax? If not, what is the purpose
> > of allowing 'val' bindings in class declarations in module
> > signatures?
>
> The only way to access a value of a class instance is via method
> invocation: you have to define a method that returns the value.

        Thanks, but you have not answered the real question here:
WHY are the values present in the interface when they are not accessible
via the interface?
 
> > Similarly, what is the purpose of allowing 'virtual'
> > methods in class types and class declarations in
> > module signatures?
>
> Virtual methods are methods that are declared but not implemented:
> sub-classes must define them.

        Again, I knew that, the real question is WHY this information
is in the _interface_??

> > I have a doubly linked list class, and concatenating
> > two lists takes 100 times longer in ocaml using my
> > code than pythons list concatenation function,
> > which is written in C.
>
> Wao! I would like to be able to run your benchmark, since I will be
> glad to try to use the Python's way of handling lists to speed up the
> Caml list package by a factor of 100!

        Sigh. I have just examined the Python implementation and
the reason is that Python uses arrays for lists. This makes
insertion, deletion, and concatenation O(n), but it uses
fast machine primatives (memcpy) and does only one allocation.
My doubly linked list is the same order for appending,
but it laboriously copies each node one by one, and has to do an
allocation
for each one, and has to search for the i'th location as well.
So it is much slower than the Python equivalent.

        I will change the representation to match Python, using an
array (or perhaps a doubly linked list of arrays), and report back.

> We worked hard on this point,
> and Caml lists (and more generally similar data structure
> manipulations) are supposed to be among the fastest known
> implementations available. So, could you please send a complete
> reproducible benchmark ?

        You would need the WHOLE interpreter :-)
I will make that available in the near future, asking for help
to speed up the implementation.

        I think it would be VERY useful to have an ocaml written
Python interpreter/compiler as fast as, or faster than, CPython.
There are a lot of Python users out there who could be introduced
to ocaml this way, and gain immediate benefits from a faster
implementation (particularly the compiler).

> As you suggested, we may add extra list manipulation modules in the
> standard library, such as mutable lists or doubly linked
> lists. Mutable lists are easy to implement and there is no doubt that
> many Caml programmers have already written such kind of packages and
> can contribute.

        Excellent! This would be an ideal situation.

        The other thing I would beg for is to fix the parser
and lexer to be re-entrant [purely functional]. At present, state
information
must be kept in global store :-(

        It would also be useful if the parser could support nested
parser calls as the lexer can: this is possible with the lexer,
because the lexbuf is always in a state to read the next lexeme,
however it is not possible with the parser because it may have read
a token ahead of the production being reduced.

> About your own program, I cannot comment on efficiency, since I do not
> catch the general idea of the design: I am a bit surprised by the
> melting of object oriented features and regular algebraic data type
> manipulation to implement lists. Could you please explain a bit the
> logic of your code (in particular the only comment
> (* STL style mutators *) is a kind of puzzle for me) ?

Sorry for brevity. STL is the C++ Standard Template Library,
it contains efficient codes for manipulating data structures
with generic algorithms, using iterators (i.e. pointer like things).

> > module DoublyLinkedList : BiDirectional =
> > struct (* doubly linked list type *)
> > type 't d_node =
> > {
> > mutable nxt: 't iterator;
> > mutable prv: 't iterator;
> > mutable data: 't
> > }

A doubly linked list has a node containing two pointers,

        nxt -- a pointer to the next node object
        prv -- a pointer to the previous node object
        data -- the data

        
> > and 't iterator =
> > Empty
> > | Node of 't d_node

        This is the type of a pointer to a node. In ocaml,
references are used for pointers, so the type of a pointer is
just that of a node, except that the pointer can advance
'off the end of the list' and so can refer to no node,
and thus the two cases 'Empty' for off the end, and Node
for pointing to an actual node.

> > let next x = match x with Empty -> Empty | Node n -> n.nxt
> > let deref x = match x with Empty -> None | Node n -> Some n.data

        next gets a pointer to the next node, given a pointer to some
node x. deref gets the data, given an iterator. in C++:

        x = ++x // next
        *x // deref

is the notation used, however the implementation is

        x->nxt
        x->data

> > class ['t] dlist =
> > object(self)
> > val mutable first': 't iterator = Empty

A class is used here to represent the whole list. It is an object
simply because that is the intended use: as a mutable object
with various mutators. In retrospect, this is probably a mistake.

-- 
John Skaller, mailto:skaller@maxtal.com.au
1/10 Toxteth Rd Glebe NSW 2037 Australia
homepage: http://www.maxtal.com.au/~skaller
downloads: http://www.triode.net.au/~skaller



This archive was generated by hypermail 2b29 : Sun Jan 02 2000 - 11:58:26 MET