Version franaise
Home About Download Resources Contact us
Browse thread
The Implicit Accumulator: a design pattern using optional arguments
[ Home ] [ Index: by date | by threads ]
[ Search: ]

[ Message by date: previous | next ] [ Message in thread: previous | next ] [ Thread: previous | next ]
Date: -- (:)
From: Quc_Peyrot <chojin@l...>
Subject: Re: [Caml-list] Book about functional design patterns

On Jun 27, 2007, at 10:35 PM, Brian Hurt wrote:

> Quc Peyrot wrote:
>
>>
>> On Jun 27, 2007, at 9:48 PM, Brian Hurt wrote:
>>
>>> In Ocaml, allocations are relatively cheap- a cost similiar to  
>>> that  of allocating on the stack.  Which is why when you tell  
>>> long-time  Ocaml programmers that you want to avoid an allocation  
>>> cost by  allocating on the stack, they tend to go "um, why?"
>>>
>>> Mutable data structures have their cost as well.  When you assign  
>>> a  pointer into an object old enough to be in the major heap,  
>>> Ocaml  kicks off a minor collection.  For small N, this can often  
>>> make O (log N) purely functional structures faster than their O 
>>> (1)  imperative counterparts.
>>>
>>> No to mention the correctness advantages, plus other advantages.
>>
>>
>> If I have a tree/map datastructure and I add an element to it, my   
>> understanding it that, when building the new tree, all the node up  
>> to  the root are going to be replaced. Is my understanding correct?
>
> No.  Only those elements that change need to be reallocated- the  
> rest can be shared between the old and new tree.  So, assuming the  
> tree is a balanced tree, only the log N or so nodes between the  
> root and the changed node will need to be reallocated, of the N  
> nodes in the tree- so you're sharing N - log N nodes between the  
> two trees.

That's exactly what I meant with "all the nodes up to the root" (you  
need to change a leaf, for that you need to change its parent, but  
for that you ... and so on up to the root).

>> Now let's say I want to build a tree with millions elements, and  
>> I'm  only interested in the final result, i.e. I don't need to be  
>> able to  rollback to a previous state or fancy stuff like that (I  
>> therefore  never keep a reference to the root of the old tree,  
>> each time I add a  new element).
>> Is there a way of writing the building code (with a fold-type of   
>> construct or something like that) such that the compiler will   
>> understand it, and will generate a code equivalent to the  
>> imperative  code (i.e. we don't allocate new nodes up to the root  
>> each time we  insert an element)?
>>
> With a million element tree, you're only reallocating 20-30 nodes  
> (assuming it's balanced), and sharing the other 999,970-999,980  
> nodes between the two trees.  Allocating each new node is only  
> going to be a small number of clock cycles, like 5-10 clocks.  So  
> the total allocation cost is only 100-200 clocks or so.

I don't understand that part. Each time you add a node, log2(n) node  
needs to be re-allocated. Hence the total number of reallocations is:
Sum(ceil(log2(i)), i = 1..N) with N = 1000000, which is significant.
Where is my mistake in these maths?

-- 
Best Regards,
Quc