[Caml-list] Efficient and canonical set representation?
Date: 2003-11-06 (17:03)
From: Brian Hurt <bhurt@s...>
Subject: Re: [Caml-list] Efficient and canonical set representation?
On Thu, 6 Nov 2003, Harrison, John R wrote:

> Does anyone know a representation of finite sets over an orderable polymorphic type
> that's (1) efficient and (2) canonical? Even better would be a CAML or OCaml
> implementation. More precisely I'm looking for:
>   1. Log-time lookup and insertion, and linear-time union, intersection etc.
>   2. Equal sets are represented by the same object.

Two is the tricky one to implement.  Imagine a case where I have set A 
with its elements, and set B with all the elements less one of set A, but 
inserted in a different order.  B is a different object than A (the two 
sets are not equal).  Now you add that one last element from A, you want 
the insert routine to return A.  This means that the insert routine has to 
know that A exists, and has to compare the new B to A to determine that it 
should return A and not B.  It can be done but it's not trivial.
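
To make the point concrete, here is a minimal sketch of an insert routine
that "knows A exists".  It is only an illustration, not a recommended
implementation: sets are plain sorted int lists (so insertion is linear,
not log-time), and the names table, intern and add are hypothetical.  The
only library calls assumed are Hashtbl.create, Hashtbl.find_opt and
Hashtbl.add from the standard library.

```ocaml
(* Sketch only: sets as sorted, duplicate-free int lists, interned in a
   global table so that structurally equal sets share one physical object.
   [table], [intern] and [add] are hypothetical names. *)
let table : (int list, int list) Hashtbl.t = Hashtbl.create 64

(* Return the previously registered list structurally equal to [s],
   registering [s] itself if none exists yet. *)
let intern (s : int list) : int list =
  match Hashtbl.find_opt table s with
  | Some canonical -> canonical
  | None -> Hashtbl.add table s s; s

let empty : int list = intern []

(* Insert [x] keeping the list sorted and duplicate-free, then look the
   result up so that an already existing equal set is returned instead. *)
let add (x : int) (s : int list) : int list =
  let rec ins = function
    | [] -> [x]
    | y :: rest as l ->
      if x < y then x :: l
      else if x = y then l
      else y :: ins rest
  in
  intern (ins s)

let a = add 3 (add 2 (add 1 empty))
let b = add 1 (add 3 empty)   (* A's elements less one, different order *)
let b' = add 2 b              (* the insert hands back A itself *)
```

With this in place, b' == a holds (physical equality), while b and a stay
distinct objects.  The cost is the global table, which has to see every set
ever built and, as written, never lets any of them be collected.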

Games with structure definitions don't help, because OCaml will happily
allocate different structures with the same data (this is why 1. == 1. is
false).  With a balanced tree structure you can implement the naive
equality comparison in linear time (the series i/2^i converges, allowing
you to enumerate the elements in linear time).  If you need faster (average) 
compares, there are a number of shortcuts you can take.  For example, you 
can keep the number of elements currently in the set handy, and if the 
element counts don't match, obviously the sets can't be equal.  
Fancier, you can also keep a hash of all elements in the set: if the hashes 
aren't equal, you can guarantee the sets aren't equal.  Be careful with 
defining your hash function so the order elements were added isn't 
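
The two shortcuts above (cached element count, cached hash) might be
sketched as follows.  This is an assumption-laden illustration: the record
type hset and its fields are hypothetical, the underlying set is the
standard library's Set.Make instantiated at int, and XOR of Hashtbl.hash
values is just one way of combining element hashes that does not depend on
the order of insertion.

```ocaml
(* Sketch only: a set paired with a cached size and a cached hash.
   [hset], [size] and [hash] are hypothetical names. *)
module IntSet = Set.Make (struct type t = int let compare = compare end)

type hset = { elems : IntSet.t; size : int; hash : int }

let empty = { elems = IntSet.empty; size = 0; hash = 0 }

(* XOR is commutative and associative, so the cached hash depends only on
   which elements are present, not on the order they were added in. *)
let add x s =
  if IntSet.mem x s.elems then s
  else { elems = IntSet.add x s.elems;
         size = s.size + 1;
         hash = s.hash lxor Hashtbl.hash x }

(* Cheap checks first; the linear-time comparison only runs when both the
   sizes and the hashes agree. *)
let equal s1 s2 =
  s1.size = s2.size
  && s1.hash = s2.hash
  && IntSet.equal s1.elems s2.elems

let s1 = add 1 (add 2 (add 3 empty))
let s2 = add 3 (add 1 (add 2 empty))
```

Here s1 and s2 were built in different orders but carry the same size and
hash, so equal s1 s2 falls through to the full comparison and succeeds.
Matching hashes never prove equality on their own; they only let unequal
sets be rejected early.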

