
Set union/inter/diff efficiency
 Date: 2005-07-27 (16:04) From: james woodyatt Subject: Re: [Caml-list] Set union/inter/diff efficiency
```On 27 Jul 2005, at 02:12, Jon Harrop wrote:
>
> Does anyone have any ideas or references on how the union/inter/diff
> functions of the Set module could be optimised by accepting a sequence
> of sets rather than a pair at a time? For example, if A overlaps B
> overlaps C but A does not overlap C then it is probably quicker to
> compute the union "(A U C) U B" rather than "A U B U C".
>
> Better still, does anyone have a replacement Set module which
> implements this functionality?

No, but you could maybe make an extension more easily using my OCaml
NAE core foundation library.

Here is the pseudo-code for set union that I would try:

Make a heap of sets [Cf_heap.of_seq].
Map into a sequence of sets [Cf_heap.to_seq].
Map into a sequence of element sequences [Cf_seq.map, Cf_set.to_seq_incr].
While queue is not empty,
    Take an element sequence from the queue.
    Take an element from the head of the sequence.
    If there is no output yet, or the element is greater than
    current output, then
        Output the element
    If the element sequence tail is not empty, then
        Push the element sequence tail onto the queue
End while

You could do similar things for difference and intersection.

I'm not optimistic that this will actually improve performance.
Beating the implementation in the standard library is tricky and
harder than one might think.

--
j h woodyatt <jhw@wetware.com>
markets are only free to the people who own them.

```
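For readers who want to experiment, here is a minimal sketch of the union pseudocode in the message, written against the OCaml standard library only (it assumes OCaml 4.08 or later for `Int` and `Set.to_seq`). It does not use the Cf library mentioned above. The names `pending`, `heap`, `merge`, `push` and `union_all` are hypothetical helpers, and the "queue" in the pseudocode is read here as a priority queue ordered by the head element of each sequence, which is an assumption about the intent.

```ocaml
(* Sketch: k-way union of integer sets, merging their sorted element
   sequences through a priority queue keyed on each sequence's head. *)
module IntSet = Set.Make (Int)

(* A pending element sequence: its current head and its lazy tail. *)
type pending = { head : int; tail : int Seq.t }

(* Small leftist heap ordered by [head]; a hypothetical stand-in for a
   library heap. The first field of [Node] is the rank. *)
type heap = Leaf | Node of int * pending * heap * heap

let rank = function Leaf -> 0 | Node (r, _, _, _) -> r

let rec merge a b =
  match a, b with
  | Leaf, h | h, Leaf -> h
  | Node (_, x, al, ar), Node (_, y, _, _) ->
    if x.head <= y.head then
      let r = merge ar b in
      (* Keep the shorter spine on the right. *)
      if rank al >= rank r then Node (rank r + 1, x, al, r)
      else Node (rank al + 1, x, r, al)
    else merge b a

(* Push a sequence onto the heap; empty sequences are dropped. *)
let push seq h =
  match seq () with
  | Seq.Nil -> h
  | Seq.Cons (head, tail) -> merge (Node (1, { head; tail }, Leaf, Leaf)) h

(* Union of all sets, returned as an ascending list without duplicates. *)
let union_all (sets : IntSet.t list) : int list =
  let h = List.fold_left (fun h s -> push (IntSet.to_seq s) h) Leaf sets in
  let rec loop h last acc =
    match h with
    | Leaf -> List.rev acc
    | Node (_, { head; tail }, l, r) ->
      (* Remove the minimum, then push its tail back if non-empty. *)
      let h' = push tail (merge l r) in
      (match last with
       | Some x when head <= x -> loop h' last acc   (* already output *)
       | _ -> loop h' (Some head) (head :: acc))
  in
  loop h None []
```

For k sets with n elements in total, each element passes through the heap once, so the merge costs roughly O(n log k) comparisons. Whether that beats repeated `IntSet.union` in practice is exactly the open question raised in the message, so it is worth benchmarking before relying on it.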