Version française
Home     About     Download     Resources     Contact us    
Browse thread
RE: reference initialization
[ Home ] [ Index: by date | by threads ]
[ Search: ]

[ Message by date: previous | next ] [ Message in thread: previous | next ] [ Thread: previous | next ]
Date: -- (:)
From: Pierre Weis <Pierre.Weis@i...>
Subject: Re: reference initialization
I think you're right to consider proper initialization of vectors as a
desirable property.

I also agree with you that initializing a vector with a dummy value is
not a good practice and not an acceptable solution. However, the
Java's way of testing further accesses to elements of vectors does not
seem to be more appealing, since it does not ensure proper
initialization of vectors, and thus spurious exceptions, due to access
into improperly initialized vectors, can occur at anytime after the
creation of the faulty vector.

The Caml way to handle this kind of problems is to provide automatic
means to ensure maximum safety, in this case we must warranty proper
initialization of vectors. I introduced Array.init to address this
issue; however, as we already know, Array.init has some limitations;
in the following, I propose other initialization functions to overcome these
limitations, while still providing safety of vector initializations.

1) Array.init
-------------

Array.init is simple to use and convenient for simple initializations
(when the value to assign to location i can be expressed as a function
of i). Safety is provided by properties automatically ensured by
Array.init, such as:

If v is defined as ``let v = Array.init f n'' then

  1) v is completely initialized:
     for all i, v.(i) has been assigned exactly once during the
     initialization of v 
  2) elements of v are values computed by f:
     for all i, v.(i) = f (i)

  In addition, if v is defined with Array.init there is no need to
  perform runtime tests when accessing items inside vector v: v is
  properly initialized.

2) New initialize function
--------------------------

When you need more complex initializations, then you need a smarter
version of Array.init (still you want to keep the same good properties
as the simple version).

For instance, a simple generalization of Array.init, that allows
access to the locations that are already initialized into the array,
could be written as:

  exception Uninitialized_access of int;;

  let initialize f n =
   if n = 0 then [||] else
   let get v i j =
    if j < i && j >= 0 then v.(j) else raise (Uninitialized_access j) in
   let v = Array.make n (f (get [||] 0) 0) in
   for i = 1 to n - 1 do
    v.(i) <- f (get v i) i
   done;
   v;;

Using initialize, we can easily define a vector containing the
fibonacci numbers:

  let init_fib g = function
    | 0 | 1 -> 1
    | n -> g (n - 1) + g (n - 2);;

  let fib10 = initialize init_fib 10;;
  val fib10 : int array = [|1; 1; 2; 3; 5; 8; 13; 21; 34; 55|]

Initialize provides the following properties:

If v is defined as ``let v = Array.initialize f n'' then

  1) v is completely initialized:
     for all i, v.(i) has been assigned exactly once during the
     initialization of v
  2) not yet initialized locations in v are never accessed during the
     initialization process:
     for all i, for all j, if j > i then the location j is not accessed during
     the initialization of v.(i)
  3) elements of v are values computed by f:
     for all i, v.(i) = f (fun i -> (Array.sub v 0 (i - 1)).(i)) (i)

  Once more there is no need to dynamically check accesses to a vector
  initialized unsing the function initialize.

Evidently, if you want to initialize the other way round, you may
define the corresponding function, say initialize_down:

  let initialize_down f n =
   if n = 0 then [||] else
   let get v i j =
    if j > i && j < n then v.(j) else raise (Uninitialized_access j) in
   let v = Array.make n (f (get [||] 0) 0) in
   for i = n - 1 to 1 do
    v.(i) <- f (get v i) i
   done;
   v;;

3) New random_initialize function for ``random'' initialization
----------------------------------------------------------------

Now, you can still argue that you have even more complex
initialisation schemes: then you need to define a more complex
initialization function. For instance:

[random_initialize f i0 n] returns a vector [v] of length [n],
initialized using the function [f], starting from index [i0]. The
function [f] takes two arguments, an access function [g : int -> 'a]
that gives access to already initialized elements of the vector [v],
and an index [i]; given those arguments [f] returns a pair [next, a],
where [a] is the value to be stored at index [i] of vector [v], and
[next] the index of the next element of [v] to initialize. [f] should
raise the predefined exception [Exit] to mean that index [i] is out of
range, presumably at the end of initialization.

The following properties hold:

If v is defined as ``let v = random_initialize f i0 n'' then

  1) v is completely initialized:
     for all i, v.(i) has been assigned exactly once during the
     initialization of v
  2) not yet initialized locations in v are never accessed during the
     initialization process:
     for all i, for all j, if v.(j) has not yet been assigned then the
     location j is not accessed during the initialization of v.(i)
  3) all elements of v are values computed by calls to f.

Implementation

  (* Already_initialized is raised when there is an attempt to
     initialize an array location more than once. *)
  exception Already_initialized of int;;

  (* Partial_initialization is raised when there is a location that is
     not initialized at the end of initilization. *)
  exception Partial_initialization of int;;

  let random_initialize f i0 n =
   if n = 0 then [||] else
   let initialized = Array.make n 0 in
   let set v i a =
    if initialized.(i) = 0 then (initialized.(i) <- 1; v.(i) <- a)
    else raise (Already_initialized i) in
   let get v j =
    if j >= 0 && j < n && initialized.(j) = 1 then v.(j)
    else raise (Uninitialized_access j) in
   let nexti0, a0 = f (get [||]) i0 in
   let v = Array.make n a0 in
   set v 0 a0;
   let rec init i =
    let nexti, ai = f (get v) i in
    set v i ai;
    init nexti in
   try init nexti0 with
   | Exit ->
      for i = 0 to n - 1 do
       if initialized.(i) = 0 then raise (Partial_initialization i)
      done;
      Array.init n (fun i -> v.(i));;

Example: we can define a suitable initialization function for your
example of combinatorial numbers,

  let init_chooses n get i =
   let next i j = if j <> i then j else n + 1 in
   let ai i = get (i - 1) * (n - i + 1) / i in
   if i = 0 then (next 0 n, 1) else
   if i = n then (next n 1, 1) else
   if i > n then raise Exit else 
   if i <= n / 2 then (next i (n - i), ai i)
   else (next i (n - i + 1), get (n - i));;

  let chooses n = random_initialize (init_chooses n) 0 (n + 1);;

  let v4 = chooses 4;;
  val v4 : int array = [|1; 4; 6; 4; 1|]

4) Further work
---------------

The general vector initialization function should probably take as
argument f a function that uses 2 functions get and set to handle the
initialized vector. In any case, the properties 1, 2, and 3 should
still hold.

5) Conclusion
-------------

I'm not sure we need a complex additional machinery to solve a problem
that can be elegantly solved with some library functions (admittedly
at the price of some linear additional cost at initialization time);
however, following Xavier, I would also be glad to read articles on
dependant type systems and their applications to vector initialisation
(if any); also, I would be delighted to try such an experimental type
system, if it is tractable in practice (I mean, we don't want a type
system that obliges the programmer to rewrite his pattern matchings in
a strange and awful maneer, nor a type system that emits error
messages that are incredibly more difficult to understand than those
of usual ML typecheckers, nor a type system that is so slow that
modern machines are turned into good old Apple IIs, don't we ?).

Best regards,

-- 
Pierre Weis