Balancing algorithm of Set/Map implementation

Yoriyuki Yamagata
 Julien Signoles
 Daniel_BÃ¼nzli
 Goswin von Brederlow
 StanisÅ‚aw T. Findeisen
[
Home
]
[ Index:
by date

by threads
]
[ Message by date: previous  next ] [ Message in thread: previous  next ] [ Thread: previous  next ]
[ Message by date: previous  next ] [ Message in thread: previous  next ] [ Thread: previous  next ]
Date:   (:) 
From:  Goswin von Brederlow <goswinvb@w...> 
Subject:  Re: [Camllist] Balancing algorithm of Set/Map implementation 
Yoriyuki Yamagata <yoriyuki.y@gmail.com> writes: > Hi, list. > > When I read the balancing function of stdlib's Set/Map several years ago, I > thought I have understand how it works. But now, I read it again and I'm less > confident now. Could someone answer my questions? Here is the snippet of the > code. > > let bal l v r = > let hl = match l with Empty > 0  Node(_,_,_,h) > h in > let hr = match r with Empty > 0  Node(_,_,_,h) > h in > if hl > hr + 2 then begin > match l with > > Empty > invalid_arg "Set.bal" >  Node(ll, lv, lr, _) > > if height ll >= height lr then > create ll lv (create lr v r) > else begin > match lr with > > Empty > invalid_arg "Set.bal" >  Node(lrl, lrv, lrr, _)> > create (create ll lv lrl) lrv (create lrr v r) > end > end else if hr > hl + 2 then begin > > match r with > Empty > invalid_arg "Set.bal" >  Node(rl, rv, rr, _) > > if height rr >= height rl then > create (create l v rl) rv rr > else begin > > match rl with > Empty > invalid_arg "Set.bal" >  Node(rll, rlv, rlr, _) > > create (create l v rll) rlv (create rlr rv rr) > end > > end else > Node(l, v, r, (if hl >= hr then hl + 1 else hr + 1)) > > I have two question. > >  Node(ll, lv, lr, _) > > if height ll >= height lr then > create ll lv (create lr v r) > > else begin > > Is this code right? If r is Empty and lr and ll are huge trees, > doesn't it create a massively unbalanced tree? If r is empty then lr and ll can not be huge. Otherwise the tree was massively unbalance beforehand. The balancing prevents this from hapening. > Another question is that why OCaml implementation allows > a balancing factor up to *2*, which is usually allowed only up to 1? Probably avoids having to do 2 balancings in a single operation. Or weighs the number of balancing done on average against a slightly unbalanced tree, i.e. turns out to be faster to be more unbalanced in practice. > Maybe my question is naive one, but I would appreciate if your could comment it. > > Regards, MfG Goswin