Bigarrays and temporar C pointers
From: John Prevost <j.prevost@g...>
Subject: Re: [Caml-list] Bigarrays and temporar C pointers
Well, assuming you really need to work in this strange way, I have a
couple of thoughts on how to do it.  Note that working this way is
going to be rather unsound in any case, but at least you will get
exceptions instead of core dumps or worse.  The heart of the matter is
that you should *not* allow the user to manipulate the data array
directly.

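open Bigarray  (* for Array1, kind and c_layout *)

(* _map_ptr / _unmap_ptr: the external C mapping functions, assumed
   to be defined elsewhere *)
module Scary_map_thingy_1 =
 (struct
    exception Scary_map_unmapped
    exception Scary_map_conflict
    type ('a, 'b) t = ('a, 'b, c_layout) Array1.t option ref
    let dim a = match !a with None -> raise Scary_map_unmapped
                            | Some a' -> Array1.dim a'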
    (* replicate other functionality from Array1 below *)

    let already_mapped = ref false
    let map k f =
      if !already_mapped then raise Scary_map_conflict else
      let a = ref (Some (_map_ptr k)) in
        try
          already_mapped := true;
          f a;
          _unmap_ptr k;
          a := None;
          already_mapped := false
        with exn -> begin
          _unmap_ptr k;
          a := None;
          already_mapped := false;
          raise exn
        end
  end : sig
    type ('a, 'b) t
    exception Scary_map_unmapped (* It escaped scope *)
    exception Scary_map_conflict (* Tried to map inside map *)
    val dim : ('a, 'b) t -> int
    (* rest of replicated API *)
    val map : ('a, 'b) kind -> (('a, 'b) t -> unit) -> unit
  end)

So the approach here is to wrap the value in such a way that a handle
which escapes the scope becomes useless rather than dangerous.  This
is not really any better than what you have now: you still have to
warn the user *not* to let it escape scope, since it won't work.  But
the benefit is that it is guaranteed to fail with an exception if the
user tries it.
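
To see the "guaranteed to fail" behaviour without any C code, here is
a minimal sketch.  The names fake_map_ptr, fake_unmap_ptr and Guarded
are made up for illustration; the fake mapping just allocates a fresh
Bigarray.  It also assumes OCaml 4.08 or later for Fun.protect, which
replaces the hand-written try/with cleanup.

```ocaml
open Bigarray

(* Hypothetical stand-ins for the C stubs: the "mapping" is a fresh
   4-element Bigarray, and we count how many mappings are live. *)
let live_mappings = ref 0

let fake_map_ptr kind : ('a, 'b, c_layout) Array1.t =
  incr live_mappings;
  Array1.create kind c_layout 4

let fake_unmap_ptr () = decr live_mappings

module Guarded : sig
  type ('a, 'b) t
  exception Unmapped
  val dim : ('a, 'b) t -> int
  val map : ('a, 'b) kind -> (('a, 'b) t -> unit) -> unit
end = struct
  type ('a, 'b) t = ('a, 'b, c_layout) Array1.t option ref
  exception Unmapped

  let dim h =
    match !h with
    | None -> raise Unmapped
    | Some a -> Array1.dim a

  let map kind f =
    let h = ref (Some (fake_map_ptr kind)) in
    (* Invalidate the handle and unmap whether [f] returns or raises. *)
    Fun.protect
      ~finally:(fun () -> h := None; fake_unmap_ptr ())
      (fun () -> f h)
end

(* Deliberately leak the handle out of the callback. *)
let leaked = ref None
let dim_inside = ref 0

let () =
  Guarded.map int32 (fun h ->
    dim_inside := Guarded.dim h;
    leaked := Some h)

(* Using the leaked handle afterwards raises instead of touching
   unmapped memory. *)
let escaped_fails =
  match !leaked with
  | None -> false
  | Some h ->
    (try ignore (Guarded.dim h); false with Guarded.Unmapped -> true)
```

Inside the callback the handle works normally; once map returns, the
leaked handle only ever raises Guarded.Unmapped.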

The second approach is to prevent the user from accessing the data directly:

module Scary_map_thingy_2 =
 (struct
    exception Scary_map_conflict
    let already_mapped = ref false
    let map k f =
      if !already_mapped then raise Scary_map_conflict else
      try
        let a = _map_ptr k in begin
          already_mapped := true;
          (* valid indices are 0 .. dim - 1 *)
          for i = 0 to Array1.dim a - 1 do
            f i a.{i}
          done;
          _unmap_ptr k;
          already_mapped := false
        end
      with exn -> begin
        _unmap_ptr k;
        already_mapped := false;
        raise exn
      end
  end : sig
    exception Scary_map_conflict (* Tried to map inside map *)
    val map : ('a, 'b) kind -> (int -> 'a -> unit) -> unit
  end)

In this second case, the approach is to prevent the caller from ever
getting a handle on the actual data array.  Instead, the map is made,
the caller is handed every (index, value) pair from the array in turn,
and then the map is unmade.  This is much more restrictive, but also
much safer.
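
A runnable sketch of the same idea, again with made-up stand-ins
(fake_map_ptr, fake_unmap_ptr, iteri and Conflict are illustrative
names, not part of any real API) and assuming OCaml 4.08+ for
Fun.protect:

```ocaml
open Bigarray

(* Hypothetical stand-in for the C mapping call: a small array
   holding 10, 20, 30. *)
let fake_map_ptr () : (int, int_elt, c_layout) Array1.t =
  Array1.of_array int c_layout [| 10; 20; 30 |]

let fake_unmap_ptr () = ()

exception Conflict

let already_mapped = ref false

(* Hand each (index, value) pair to [f]; the array itself never
   leaves this function. *)
let iteri f =
  if !already_mapped then raise Conflict;
  let a = fake_map_ptr () in
  already_mapped := true;
  Fun.protect
    ~finally:(fun () -> fake_unmap_ptr (); already_mapped := false)
    (fun () ->
      for i = 0 to Array1.dim a - 1 do
        f i a.{i}
      done)

(* Collect the pairs the callback sees, in order. *)
let seen = ref []
let () = iteri (fun i v -> seen := (i, v) :: !seen)
let pairs = List.rev !seen
```

The caller only ever sees plain ints, so there is nothing it could
hold on to after the unmap.  A nested call to iteri from inside the
callback raises Conflict, and the flag is still reset on the way out.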

Finally, this kind of approach might be best if mapping and unmapping
is not particularly expensive, and you trust the user to act better:

module Scary_map_thing =
 (struct
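    (* "some", "specific" and some_specific_kind are placeholders for
       the one concrete element type and kind you map as *)
    let data = (ref None : (some, specific, c_layout) Array1.t option ref)
    let hold_count = ref 0
    let hold () =
      begin
        incr hold_count;
        match !data with
          | Some _ -> ()
          | None -> data := Some (_map_ptr some_specific_kind)
      end
    let unhold () =
      begin
        decr hold_count;
        match !hold_count with
          | 0 -> (_unmap_ptr some_specific_kind; data := None)
          | _ -> ()
      end
    let work f =
      begin
        hold ();
        try
          let result = f () in
          unhold ();
          result
        with exn -> (unhold (); raise exn)
      end
    (* only call this between hold and unhold *)
    let get_data () =
      match !data with Some a -> a | None -> failwith "not mapped"
    let dim () = work (fun () -> Array1.dim (get_data ()))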
    (* rest of modified Array1 calls here *)
  end : sig
    val work : (unit -> 'a) -> 'a
    val dim : unit -> int
    (* rest of modified Array1 calls *)
  end)

In this last approach, instead of wrapping that array up in a data
structure, we wrap it up in a module.  The module either has a
currently mapped copy of the data, or it doesn't.  If you call
Scary_map_thing.dim (), you get the dimensions of the data, no matter
what.  If the data was unmapped when you called dim, it is mapped, the
value is read, and then it is unmapped again.  If you have a *lot* of work to
do and wish to avoid mapping and unmapping constantly, you can wrap
your function up in work like this:

  let myfunc () =
    Scary_map_thing.work (fun () ->
      for i = 0 to Scary_map_thing.dim () - 1 do
        Scary_map_thing.set i (Scary_map_thing.get i + 1)
      done)

This maps it once, uses it a lot, then unmaps it at the end.
This version also doesn't throw up if you call a function that tries
to map while mapped--it just increments a counter.
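
The counting behaviour can be checked with a small self-contained
sketch.  Everything here is illustrative: fake_map_ptr, fake_unmap_ptr,
backing and Counted are made-up names, the "mapping" is an ordinary
Bigarray, get and set stand for the "rest of modified Array1 calls",
and Fun.protect assumes OCaml 4.08+.

```ocaml
open Bigarray

(* Hypothetical stand-ins for the C calls.  [backing] plays the role
   of the memory behind the pointer; [map_calls] counts how often the
   mapping is actually established. *)
let backing = Array1.of_array int c_layout [| 1; 2; 3 |]
let map_calls = ref 0

let fake_map_ptr () : (int, int_elt, c_layout) Array1.t =
  incr map_calls;
  backing

let fake_unmap_ptr () = ()

module Counted : sig
  val work : (unit -> 'a) -> 'a
  val dim : unit -> int
  val get : int -> int
  val set : int -> int -> unit
end = struct
  let data = ref None
  let hold_count = ref 0

  (* Map only when going from 0 to 1 holders. *)
  let hold () =
    if !hold_count = 0 then data := Some (fake_map_ptr ());
    incr hold_count

  (* Unmap only when the last holder lets go. *)
  let unhold () =
    decr hold_count;
    if !hold_count = 0 then begin
      fake_unmap_ptr ();
      data := None
    end

  let work f =
    hold ();
    Fun.protect ~finally:unhold f

  (* Only called from inside [work], so the data is always mapped. *)
  let current () =
    match !data with Some a -> a | None -> assert false

  let dim () = work (fun () -> Array1.dim (current ()))
  let get i = work (fun () -> (current ()).{i})
  let set i v = work (fun () -> (current ()).{i} <- v)
end

(* One outer [work]: every nested dim/get/set reuses the mapping. *)
let () =
  Counted.work (fun () ->
    for i = 0 to Counted.dim () - 1 do
      Counted.set i (Counted.get i + 1)
    done)
```

After the loop map_calls is 1, even though dim, get and set each went
through work themselves.  A later call made outside of work, such as
Counted.get 0, maps and unmaps on its own.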

This third solution may be the best one overall--especially because
there are a number of ways it can be improved.  For example:

  * If you need to be able to map as multiple different kinds, then
    you can provide more useful state in the code, so that you track
    what kind it is currently mapped as, and adjust the mapping as
    needed in order to work safely.  In this case, Scary_map_thing.dim
    would have the type ('a, 'b) kind -> int, and likewise all of the
    modified Array1 calls would take kinds instead of unit or actual
    arrays.

  * If you want to avoid mapping and unmapping in a more general way, and
    use threads, you could start up a worker thread in the module that
    keeps track of the last time the data was used, and unmaps it if
    it hasn't been used in a certain amount of time.

Finally, please note that none of the skeletal solutions I describe
above are thread-safe.  If more than one thread can be working at a
time (like with the unmap-on-timeout extension), you need to be more
careful about modifying the internal state.  Note that I think
solution 3 is the only one that can cleanly handle threads at all,
since it's the only one that can handle multiple people wanting to
work with the data all at once.

Hope these ideas were useful,

John.