Browse thread
Balancing algorithm of Set/Map implementation
-
Yoriyuki Yamagata
- Julien Signoles
- Daniel_Bünzli
- Goswin von Brederlow
- Stanisław T. Findeisen
[
Home
]
[ Index:
by date
|
by threads
]
[ Message by date: previous | next ] [ Message in thread: previous | next ] [ Thread: previous | next ]
[ Message by date: previous | next ] [ Message in thread: previous | next ] [ Thread: previous | next ]
| Date: | -- (:) |
| From: | Goswin von Brederlow <goswin-v-b@w...> |
| Subject: | Re: [Caml-list] Balancing algorithm of Set/Map implementation |
Yoriyuki Yamagata <yoriyuki.y@gmail.com> writes:
> Hi, list.
>
> When I read the balancing function of stdlib's Set/Map several years ago, I
> thought I have understand how it works. But now, I read it again and I'm less
> confident now. Could someone answer my questions? Here is the snippet of the
> code.
>
> let bal l v r =
> let hl = match l with Empty -> 0 | Node(_,_,_,h) -> h in
> let hr = match r with Empty -> 0 | Node(_,_,_,h) -> h in
> if hl > hr + 2 then begin
> match l with
>
> Empty -> invalid_arg "Set.bal"
> | Node(ll, lv, lr, _) ->
> if height ll >= height lr then
> create ll lv (create lr v r)
> else begin
> match lr with
>
> Empty -> invalid_arg "Set.bal"
> | Node(lrl, lrv, lrr, _)->
> create (create ll lv lrl) lrv (create lrr v r)
> end
> end else if hr > hl + 2 then begin
>
> match r with
> Empty -> invalid_arg "Set.bal"
> | Node(rl, rv, rr, _) ->
> if height rr >= height rl then
> create (create l v rl) rv rr
> else begin
>
> match rl with
> Empty -> invalid_arg "Set.bal"
> | Node(rll, rlv, rlr, _) ->
> create (create l v rll) rlv (create rlr rv rr)
> end
>
> end else
> Node(l, v, r, (if hl >= hr then hl + 1 else hr + 1))
>
> I have two question.
>
> | Node(ll, lv, lr, _) ->
> if height ll >= height lr then
> create ll lv (create lr v r)
>
> else begin
>
> Is this code right? If r is Empty and lr and ll are huge trees,
> doesn't it create a massively unbalanced tree?
If r is empty then lr and ll can not be huge. Otherwise the tree was
massively unbalance beforehand. The balancing prevents this from
hapening.
> Another question is that why OCaml implementation allows
> a balancing factor up to *2*, which is usually allowed only up to 1?
Probably avoids having to do 2 balancings in a single operation. Or
weighs the number of balancing done on average against a slightly
unbalanced tree, i.e. turns out to be faster to be more unbalanced in
practice.
> Maybe my question is naive one, but I would appreciate if your could comment it.
>
> Regards,
MfG
Goswin