This site is updated infrequently. For up-to-date information, please visit the new OCaml website at ocaml.org.

[Caml-list] Set/Map with intervals and/or order-related operations
[ Home ] [ Index: by date | by threads ]
[ Search: ]

[ Message by date: previous | next ] [ Message in thread: previous | next ] [ Thread: previous | next ]
 Date: 2003-05-30 (12:00) From: Diego Olivier Fernandez Pons Subject: Re : [Caml-list] Set/Map with intervals and/or order-related operations
```    Bonjour,

> Perhaps I will develop my own since I need balancing schema, and a
> Map-like structure (not only a Set-like structure.) based on Diet.
> I only need queries for individual elements, not intervals. So I
> don't think a k-d tree is appropriate. Also I need the data
> structure purely functional, because the cost of copying is not
> negligible. So, even though several people points out the existence
> of C implementations, I will opt an ocaml one. The dom type of
> FaCiLe is essentially a list of intervals. Maybe I can use Baire
> implementations.

I am afraid I haven't yet understood your problem and I will need more

There are several points in your message I would like to comment :
- Tree-like data structure balancing
- Diet (discrete interval encoding scheme)

1. Balancing tree-like data structure

You should separate the design of the tree-like data structure and its
balancing.

First write down your unbalanced data structure and get your
algorithms right, only then choose one and implement one of the
multiple balancing schemes available

The trick is very simple (Stephen Adams) : encapsulate the balancing
information in << smart constructors >>.

At the beginning this constructors will be trivial

let makeTree = fun l v r -> N (l, v, r)

and your functions will look like this

let rec insert x = function
| E -> single x
| N (l, v, r) ->
match compare x v with
| n when n < 0 -> makeTree (insert x l) v r
| n when n > 0 -> makeTree l v (insert x r)
| _ -> failwith "already in the data structure"

Once everything goes right, you will add a balancing constructor and
replace your [makeTree] by the balancing constructor [makeBTree].

Sometimes you will find usefull not to recompute all the balancing
information at every node, then you just need to add a cache. This is
of course optional.

...
| E -> injection x
| N (l, v, r, _) -> (* the last int is a cache for the constructor *)
...

Your balanced function will look like this

let rec insert x = function
| E -> single x
| N (l, v, r, _) ->
match compare x v with
| n when n < 0 -> makeBTree (insert x l) v r
| n when n > 0 -> makeBTree l v (insert x r)
| _ -> failwith "already in the data structure"

Conclusion : You do not need to bother with balancing schemes from the
them very easely.

Baire contains a few examples :
- unbalancedSet.ml contains unbalanced trees and AVL-trees without
cache (does not change the data type) (0)
- avlSet.ml contains avl sets with cache (1)
- weighBalancedSet.ml contains WB sets with cache (2)
...

(1) and (2) are just a copy-paste of (0) with the described
modifications and (1) and (2) share all but 40 lines of code (namely
the 4 balancing constructors).

2. Discrete interval encoding tree

The idea of diet is to replace successive integers by an interval

{3} {5} {6} {7} {9} -> {3} [5-7] {9}

Then, a Diet is just a tree of intervals and singletons with some
join/separate instructions. Since I do not understand what you mean by
'map-like operations' I don't know if this is appropriate.

{3, 4} {5, 2} {6, 2} {7, 3} {9, 4} -> {3, 4} [5-6, 2] {7, 3} {9, 4}

This will work (I mean with a significant compression rate) only if
you can ensure that consecutive bindings will have very often the same
image.

Or are you binding sigletons and intervals to values ?

{3, 4} {[5-7], 2} {[6-10], 5} {9, 4}

find 5 myDataStructure -> [(5, 2); (5, 5)]

Are the intervals supposed to be disjoint ?

{3, 4} {[5-7], 2} {[8-10], 3}

find 5 myDataStructure -> (5, 2)

Or are you binding keys to multiple values

{3, {4}} {5, {2, 5}} {6, {2, 5}} {7, {2, 5}} {8, 5} {9, {4, 5}} {10,
{5}}

find 5 myDataStructure -> [(5, 2); (5, 5)]

Multidimensional data structure (k-d trees, range trees, cartesian
trees) may be what you are looking for. It depends highly on what you
mean by "set/map with intervals" and "order related operations".

Could you give a few examples of what you expect of such a data
structure ?

Diego Olivier

-------------------
To unsubscribe, mail caml-list-request@inria.fr Archives: http://caml.inria.fr
Bug reports: http://caml.inria.fr/bin/caml-bugs FAQ: http://caml.inria.fr/FAQ/
Beginner's list: http://groups.yahoo.com/group/ocaml_beginners

```