Version française
Home     About     Download     Resources     Contact us    
Browse thread
How to do this properly with OCaml?
[ Home ] [ Index: by date | by threads ]
[ Search: ]

[ Message by date: previous | next ] [ Message in thread: previous | next ] [ Thread: previous | next ]
Date: -- (:)
From: Thomas Fischbacher <Thomas.Fischbacher@P...>
Subject: How to do this properly with OCaml?

I repeatedly stumble across problems where I would like to use a 
programming style which seems quite out of line with how one is supposed 
to use OCaml (to say it otherwise, it looks outright wrong and 
horrendously ugly), but which nevertheless might be just appropriate for 
the situation in question. So, I would like to hear the opinion of the 
OCaml community about this.

Imagine constructing a binary heap of tuples. It is somewhat neat to be 
able to just use an array to store the entries of the heap which is 
pre-allocated to some fixed size and replaced by a larger array whenever 
more space is needed. The only thing known about the heap's entries at 
compile time is that they are bound to be tuples, hence boxed objects,
but nothing else is known about their size or even type.

As one does not have a prototype of such a tuple at the time the array is 
created, it seems to me as if the best thing one could do in OCaml would 
be:

(1) Make an array of <tuple> option and initially fill it with None.

(2) Make an optional array of tuples which is None until the first entry 
is made.

One drawback of approach (2) is that when we "remove" entries from the 
heap, the underlying array will keep unnecessary references to values 
which by chance might be quite big, and which we might want to be 
reclaimed by GC, so that's not too beautiful for a general-purpose 
application.

The major drawback of (1), however, is that there is a further level of 
indirection for array entries, which means some unnecessary consing, as 
well as more work for the machine to get at a given entry.

If I just define a function

let whatever () = Obj.magic (1,2);

then 

let a = Array.make 10 (whatever());;

would give me more or less just what I want. An array of boxed things that 
I then would like to use as in:

# let () = a.(2) <- (1,2,3,4) in a.(2);;
- : int * int * int * int = (1, 2, 3, 4)
# let () = a.(3) <- (5,8,2,1) in a.(2);;
- : int * int * int * int = (1, 2, 3, 4)
# a.(3);;
- : int * int * int * int = (5, 8, 2, 1)

Mind you - in this case, I will only assign int*int*int*int tuples to that 
array, or the result of the evaluation of whatever() when I want to kill 
an entry. Plus, I guarantee never to look at any entry that is set to 
whatever().

In some other situation, I might be inclined to use the same code, but 
with string*bool instead of int*int*int*int. But again, I will promise 
never to put anything else but string*bool into the underlying array, and 
never look at entries which are not set properly. (Which, of course, 
includes never printing such an array at the toplevel unless it is fully 
occupied.)

Please don't tell me that "well, OCaml is not Lisp, after all". 
This I already know. But is there no proper way to get around that 
(technically speaking) unnecessary extra level of indirection that is 
forced upon me due to static type checking?

The very same problem appears in a different guise when one tries to write 
a function that flattens 'a array array -> 'a array. The best I could 
come up with here is:

let join_arrays aa =
  let len_total = Array.fold_left (fun sf a -> sf+Array.length a) 0 aa 
  and nr_arrays = Array.length aa
  in
  if len_total=0
  then
    if nr_arrays = 0 then [||] else aa.(0)
      (* ^ Take a close look! *)
  else
    (* Here, we face the problem that we have to get some value
       of the proper type to init the joined array with.
     *)
    let first_entry =
      (let rec seek pos =
	let array_here = aa.(pos) in
	if Array.length array_here = 0
	then seek (pos+1)
	else array_here.(0)
	    (* This is guaranteed to terminate, as len_total>0! *)
      in seek 0)
    in
    let result = Array.make len_total first_entry in
    let transfer_to_result arr offset =
      let nr = Array.length arr in
      for i = 0 to nr-1 do result.(i+offset) <- arr.(i) done
    in
    let rec populate nr_array_now offset =
      if nr_array_now = nr_arrays
      then result
      else 
	let array_now = aa.(nr_array_now) in
	let () = transfer_to_result array_now offset in
	populate (1+nr_array_now) (offset+Array.length array_now)
    in
    populate 0 0
;;

Of course, it is pretty awkward having to scan for the first element in 
the first nonempty array in an array of arrays, especially as that element 
really does not play a that special role.

Could anyone please tell me how to do all this in a more appropriate way?

-- 
regards,               tf@cip.physik.uni-muenchen.de              (o_
 Thomas Fischbacher -  http://www.cip.physik.uni-muenchen.de/~tf  //\
(lambda (n) ((lambda (p q r) (p p q r)) (lambda (g x y)           V_/_
(if (= x 0) y (g g (- x 1) (* x y)))) n 1))                  (Debian GNU)