Set union
Date: 2005-02-25 (21:50)
From: Radu Grigore <radugrigore@g...>
Subject: Re: [Caml-list] Complexity of Set.union

On Fri, 25 Feb 2005 19:47:45 +0000, Jon Harrop <jon@jdh30.plus.com> wrote:

> I ask this because the STL set_union is probably O(n+N) (inserting an already
> sorted range into a set is apparently linear) which is worse than the O((n+N)
> log(n+N)) which you've suggested for OCaml.

The complexity of set_union is indeed O(n+N), see [0]. It is basically a
merge of sorted _sequences_ [1].

I assume n is the size of the small set, N is the size of the large set,
and the heights are h=O(lg n), H=O(lg N). With this the complexity of
Set.union is more like O(n lg(n+N)), at least when all elements in one set
are smaller than the elements of the other set.

> I see. This could be improved in the unsymmetric case, by adding elements from
> the smaller set to the larger set. But the size of the set isn't stored so
> you'd have to make do with adding elements from the shallower set to the
> deeper set. I've no idea what the complexity of that would be...

That is how it works now. As Xavier said, the trickiest part is split.

> > Did you mean "of two equal height sets such that all elements of the
> > first set are smaller than all elements of the second set"?
>
> Yes, that's what I meant. :)

In that case the current Set.union simply adds elements repeatedly from the
set with small height to the set with big height.

> > That
> > could indeed run in constant time (just join the two sets with a
> > "Node" constructor), but I doubt the current implementation achieves
> > this because of the repeated splitting.

What splitting? I see none in this case.

> Having said that, wouldn't it take the Set.union function O(log n + log N)
> time to prove that the inputs are nonoverlapping, because it would have to
> traverse to the min/max elements of both sets?

I agree. Also, such a check looks ugly to me (for a standard library).

regards,
 radu
http://rgrig.blogspot.com/

[0] http://library.n0i.net/programming/c/cpiso/libalgorithms.html#lib.set.union
[1] http://rgrig.blogspot.com/2004/11/merginglists.html
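
To make the two cost models discussed above concrete, here is a minimal sketch in OCaml. It is not the standard library's Set.union nor the STL's set_union; the names merge_union and union_by_insertion are hypothetical, and sorted lists stand in for the sorted sequences that set_union merges. The first function does a single merge pass, O(n+N); the second adds the elements of one set to the other one at a time, which costs roughly O(n lg(n+N)) on a balanced tree.

```ocaml
(* Illustrative sketch only: neither function is taken from the OCaml
   standard library or from the STL. *)

module IntSet = Set.Make (struct
  type t = int
  let compare = compare
end)

(* Union of two strictly increasing lists by one merge pass.
   Every step consumes at least one element, so the cost is O(n+N). *)
let merge_union xs ys =
  let rec go acc xs ys =
    match xs, ys with
    | [], rest | rest, [] -> List.rev_append acc rest
    | x :: xs', y :: ys' ->
      if x < y then go (x :: acc) xs' ys
      else if y < x then go (y :: acc) xs ys'
      else go (x :: acc) xs' ys' (* equal: keep a single copy *)
  in
  go [] xs ys

(* Add every element of [small] to [large], one at a time.
   Each IntSet.add is O(lg(n+N)) on a balanced tree, so the whole
   fold is O(n lg(n+N)). *)
let union_by_insertion small large =
  IntSet.fold IntSet.add small large

let () =
  let small = IntSet.of_list [1; 2] in
  let large = IntSet.of_list [10; 20; 30; 40] in
  (* All elements of [small] are below those of [large], as in the
     case discussed in the message. *)
  assert (IntSet.equal (union_by_insertion small large)
            (IntSet.union small large));
  assert (merge_union [1; 2] [10; 20; 30; 40] = [1; 2; 10; 20; 30; 40])
```

When n is much smaller than N, repeated insertion does far less work than a full merge, which is why the asymmetric case matters in this thread.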