Version française
Home     About     Download     Resources     Contact us    
Browse thread
Ropes and rope-like functional extensible vectors with O(1) prepend/append.
[ Home ] [ Index: by date | by threads ]
[ Search: ]

[ Message by date: previous | next ] [ Message in thread: previous | next ] [ Thread: previous | next ]
Date: -- (:)
From: Jon Harrop <jon@f...>
Subject: Re: [Caml-list] Ropes and rope-like functional extensible vectors with O(1) prepend/append.
On Monday 30 July 2007 12:51:29 Mauricio Fernandez wrote:
> The structure is persistent and all operations are non-destructive.
> Mutation is used only in the rebalancing operation and in set, but they
> affect an ephemeral forest of ropes/vects and a new string/array
> respectively, so the original structure is never modified and in principle
> all operations should be thread-safe.
>
> Ropes/vects being functional doesn't mean that they will perform well in a
> persistent setting however, see the clarification at
>
> http://eigenclass.org/repos/oropes/head/doc/Vect.html

Ok.

> > I have some suggestions:
> >
> > I'd like metadata in every node, so I can provide a constructor that
> > combines the metadata of child nodes and a cull function to accelerate
> > searches.
>
> If I understand it correctly, that scheme could in the limit turn some O(n)
> searches into O(log n)?), right?

Something like that, yes.

> Unlike Vec, Vect uses "compact" leaves (Leaf of 'a array) of bounded size
> (leaf_size, typically 16-64), which might not fit very well.

I can think of two solutions:

1. Linear search.

2. Store an array of metadata corresponding to a binary search of the array of 
elements in a leaf.

The latter sounds sexy but the former would probably suffice. :-)

> Vect would need to know how to combine the metadata, wouldn't it?

Users would have to provide a function to do that, yes.

> I was 
> thinking about something like
>   Leaf of ('meta -> 'meta -> 'meta) * 'meta * 'a array
> but I've realized that this wouldn't suffice. So, given that Vect.t is
> currently
>
>     type 'a t =
> 	Empty
>
>       | Concat of 'a t * int * 'a t * int * int
>       | Leaf of 'a array
>
> something like this maybe?
>
>   type ('a, 'meta) t =
>   	Empty of ('meta -> 'meta -> 'meta)
>
>       | Concat of ('meta -> 'meta -> 'meta) * 'meta *
>
>                   ('a, 'meta) t * int *
>                   ('a, 'meta) t * int *
> 		  int
>
>       | Leaf of ('meta -> 'meta -> meta) * 'meta array * 'a array
>
> 		(* maybe also  * 'meta  to cache the last computation? *)
>
> or even without the ('meta -> 'meta -> 'meta) part, forcing the user to
> pass the function on each modification?  Just thinking out loud.

I would use a functor to pass it the "combine" and "cull" functions, rather 
than putting them in the tree itself. This is analogous to the Set.Make 
functor accepting a comparison function.

> At any rate, it'd be better to provide it as a separate structure, any
> suggestions for the name?.

You could just call it Tree and try to make it as generic as possible.

You could then reimplement the Set module on top of your data structure by 
searching for the index of the given element and inserting it if it is new. 
Hmm, maybe a function that combines a search with an update (to avoid two 
traversals) would be a good idea...

> > The usual HOFs, like map.
>
> I just pushed a patch with filter and map. The former is trivially
> implemented with fold + append (thanks to the O(1) append). I was going to
> code map the same way but I ended up making one that returns an isomorphic
> vect and is faster (since there's no need to rebalance).

Yes. You could also use recursive subdivision to create a perfectly balanced 
result.

> So Vect currently has iter, iteri, rangeiter, fold, map and filter. I'm
> considering renaming fold to fold_left and providing fold_right too.

I'd definitely provide both folds, yes.

I'd also like specialized rewrite functions that avoid allocations when the 
outputs are the same as the inputs. So I'd make the set function bail via an 
exception when the element is left unchanged, returning the original Vect and 
doing no allocation in this case. I'd also like an id_map function that did a 
map but reused old branches whenever they were left unchanged.

-- 
Dr Jon D Harrop, Flying Frog Consultancy Ltd.
OCaml for Scientists
http://www.ffconsultancy.com/products/ocaml_for_scientists/?e