Re: string vs vect + GC question

Xavier Leroy (Xavier.Leroy@inria.fr)
Mon, 19 Feb 1996 11:06:31 +0100 (MET)

From: Xavier Leroy <Xavier.Leroy@inria.fr>
Message-Id: <199602191006.LAA04914@pauillac.inria.fr>
Subject: Re: string vs vect + GC question
To: Jocelyn.Serot@lasmea.univ-bpclermont.fr (Jocelyn Serot)
Date: Mon, 19 Feb 1996 11:06:31 +0100 (MET)
In-Reply-To: <199602161428.PAA10575@concorde.inria.fr> from "Jocelyn Serot" at Feb 16, 96 03:27:09 pm

Concerning your performance comparison between arrays and strings:

> let s = make_string 65536 `a`;;
> let f n =
> for i = 0 to n do for j = 0 to n do s.[i*256+j] <- s.[i*256+j] done done;;
> let v = make_matrix 256 256 `a`;;
> let g n =
> for i = 0 to n do for j = 0 to n do v.(i).(j) <- v.(i).(j) done done;;

> COMMENT: For 2D arrays, strings win the game. This is quite surprising
> since Caml Light is not supposed to be very good in doing the
> explicit arithmetic involved in the index computation. Vects seem
> to be penalized by the double indirection.

The main reason why arrays of arrays are less efficient is that each
access performs two bound checks, instead of one for the string-based
version. With array bound checking turned off (option -O fast), both
versions run at the same speed.

Also, if you're really doing matrices of bytes, the string-based
encoding is 4 times more compact (8 times on an Alpha) than the
array-based representation, which improves greatly cache behavior and
I/O speed.

As for your memory management question:

> I have a basic question regarding the behaviour of the Caml-Light Garbage
> collector: when an object becomes hidden, because of a its name is rebound at
> the top level, is it automatically reclaimed by the gc ?...
> For example:
> #let m = make_vect 1024 0;;
> m : int vect = [|0; 0; 0; 0; 0; ... |]
> #let m = 0;;
> m : int = 0 (* <- Will the space allocated for the vector be reclaimed ?*)

No, it will not be reclaimed. That's because all toplevel definitions
are entered in a global table of values, and always referenced through
that table. Rebinding an identifier does not guarantee that its
previous value is no longer needed, e.g.:

let m = make_vect 1024 0;;
let f x = m.(x);;
let m = 0;;

"f" still accesses the previous value of m in the global table.

Note that this behavior is specific to the toplevel: locally-bound
structures will always become garbage when leaving the scope of their
definition.

- Xavier Leroy