Re: reference initialization

From: Pierre Weis (Pierre.Weis@inria.fr)
Date: Mon May 15 2000 - 23:33:17 MET DST

  • Next message: Patrick M Doane: "Re: RE: reference initialization"

    I think you're right to consider proper initialization of vectors as a
    desirable property.

    I also agree with you that initializing a vector with a dummy value is
    not a good practice and not an acceptable solution. However, the
    Java's way of testing further accesses to elements of vectors does not
    seem to be more appealing, since it does not ensure proper
    initialization of vectors, and thus spurious exceptions, due to access
    into improperly initialized vectors, can occur at anytime after the
    creation of the faulty vector.

    The Caml way to handle this kind of problems is to provide automatic
    means to ensure maximum safety, in this case we must warranty proper
    initialization of vectors. I introduced Array.init to address this
    issue; however, as we already know, Array.init has some limitations;
    in the following, I propose other initialization functions to overcome these
    limitations, while still providing safety of vector initializations.

    1) Array.init
    -------------

    Array.init is simple to use and convenient for simple initializations
    (when the value to assign to location i can be expressed as a function
    of i). Safety is provided by properties automatically ensured by
    Array.init, such as:

    If v is defined as ``let v = Array.init f n'' then

      1) v is completely initialized:
         for all i, v.(i) has been assigned exactly once during the
         initialization of v
      2) elements of v are values computed by f:
         for all i, v.(i) = f (i)

      In addition, if v is defined with Array.init there is no need to
      perform runtime tests when accessing items inside vector v: v is
      properly initialized.

    2) New initialize function
    --------------------------

    When you need more complex initializations, then you need a smarter
    version of Array.init (still you want to keep the same good properties
    as the simple version).

    For instance, a simple generalization of Array.init, that allows
    access to the locations that are already initialized into the array,
    could be written as:

      exception Uninitialized_access of int;;

      let initialize f n =
       if n = 0 then [||] else
       let get v i j =
        if j < i && j >= 0 then v.(j) else raise (Uninitialized_access j) in
       let v = Array.make n (f (get [||] 0) 0) in
       for i = 1 to n - 1 do
        v.(i) <- f (get v i) i
       done;
       v;;

    Using initialize, we can easily define a vector containing the
    fibonacci numbers:

      let init_fib g = function
        | 0 | 1 -> 1
        | n -> g (n - 1) + g (n - 2);;

      let fib10 = initialize init_fib 10;;
      val fib10 : int array = [|1; 1; 2; 3; 5; 8; 13; 21; 34; 55|]

    Initialize provides the following properties:

    If v is defined as ``let v = Array.initialize f n'' then

      1) v is completely initialized:
         for all i, v.(i) has been assigned exactly once during the
         initialization of v
      2) not yet initialized locations in v are never accessed during the
         initialization process:
         for all i, for all j, if j > i then the location j is not accessed during
         the initialization of v.(i)
      3) elements of v are values computed by f:
         for all i, v.(i) = f (fun i -> (Array.sub v 0 (i - 1)).(i)) (i)

      Once more there is no need to dynamically check accesses to a vector
      initialized unsing the function initialize.

    Evidently, if you want to initialize the other way round, you may
    define the corresponding function, say initialize_down:

      let initialize_down f n =
       if n = 0 then [||] else
       let get v i j =
        if j > i && j < n then v.(j) else raise (Uninitialized_access j) in
       let v = Array.make n (f (get [||] 0) 0) in
       for i = n - 1 to 1 do
        v.(i) <- f (get v i) i
       done;
       v;;

    3) New random_initialize function for ``random'' initialization
    ----------------------------------------------------------------

    Now, you can still argue that you have even more complex
    initialisation schemes: then you need to define a more complex
    initialization function. For instance:

    [random_initialize f i0 n] returns a vector [v] of length [n],
    initialized using the function [f], starting from index [i0]. The
    function [f] takes two arguments, an access function [g : int -> 'a]
    that gives access to already initialized elements of the vector [v],
    and an index [i]; given those arguments [f] returns a pair [next, a],
    where [a] is the value to be stored at index [i] of vector [v], and
    [next] the index of the next element of [v] to initialize. [f] should
    raise the predefined exception [Exit] to mean that index [i] is out of
    range, presumably at the end of initialization.

    The following properties hold:

    If v is defined as ``let v = random_initialize f i0 n'' then

      1) v is completely initialized:
         for all i, v.(i) has been assigned exactly once during the
         initialization of v
      2) not yet initialized locations in v are never accessed during the
         initialization process:
         for all i, for all j, if v.(j) has not yet been assigned then the
         location j is not accessed during the initialization of v.(i)
      3) all elements of v are values computed by calls to f.

    Implementation

      (* Already_initialized is raised when there is an attempt to
         initialize an array location more than once. *)
      exception Already_initialized of int;;

      (* Partial_initialization is raised when there is a location that is
         not initialized at the end of initilization. *)
      exception Partial_initialization of int;;

      let random_initialize f i0 n =
       if n = 0 then [||] else
       let initialized = Array.make n 0 in
       let set v i a =
        if initialized.(i) = 0 then (initialized.(i) <- 1; v.(i) <- a)
        else raise (Already_initialized i) in
       let get v j =
        if j >= 0 && j < n && initialized.(j) = 1 then v.(j)
        else raise (Uninitialized_access j) in
       let nexti0, a0 = f (get [||]) i0 in
       let v = Array.make n a0 in
       set v 0 a0;
       let rec init i =
        let nexti, ai = f (get v) i in
        set v i ai;
        init nexti in
       try init nexti0 with
       | Exit ->
          for i = 0 to n - 1 do
           if initialized.(i) = 0 then raise (Partial_initialization i)
          done;
          Array.init n (fun i -> v.(i));;

    Example: we can define a suitable initialization function for your
    example of combinatorial numbers,

      let init_chooses n get i =
       let next i j = if j <> i then j else n + 1 in
       let ai i = get (i - 1) * (n - i + 1) / i in
       if i = 0 then (next 0 n, 1) else
       if i = n then (next n 1, 1) else
       if i > n then raise Exit else
       if i <= n / 2 then (next i (n - i), ai i)
       else (next i (n - i + 1), get (n - i));;

      let chooses n = random_initialize (init_chooses n) 0 (n + 1);;

      let v4 = chooses 4;;
      val v4 : int array = [|1; 4; 6; 4; 1|]

    4) Further work
    ---------------

    The general vector initialization function should probably take as
    argument f a function that uses 2 functions get and set to handle the
    initialized vector. In any case, the properties 1, 2, and 3 should
    still hold.

    5) Conclusion
    -------------

    I'm not sure we need a complex additional machinery to solve a problem
    that can be elegantly solved with some library functions (admittedly
    at the price of some linear additional cost at initialization time);
    however, following Xavier, I would also be glad to read articles on
    dependant type systems and their applications to vector initialisation
    (if any); also, I would be delighted to try such an experimental type
    system, if it is tractable in practice (I mean, we don't want a type
    system that obliges the programmer to rewrite his pattern matchings in
    a strange and awful maneer, nor a type system that emits error
    messages that are incredibly more difficult to understand than those
    of usual ML typecheckers, nor a type system that is so slow that
    modern machines are turned into good old Apple IIs, don't we ?).

    Best regards,

    -- 
    Pierre Weis
    



    This archive was generated by hypermail 2b29 : Mon May 15 2000 - 23:43:40 MET DST