Version française
Home     About     Download     Resources     Contact us    
Browse thread
modifying hash tables
[ Home ] [ Index: by date | by threads ]
[ Search: ]

[ Message by date: previous | next ] [ Message in thread: previous | next ] [ Thread: previous | next ]
Date: -- (:)
From: David Allsopp <dra-news@m...>
Subject: RE: [Caml-list] modifying hash tables
> I want to modify the _current_ datum in Hashtable.iter, e.g., increment
> a count. Is this code valid?
> Hashtable.iter ht ~f:(fun ~key ~data ->
>   Hashtable.replace ht ~key ~data:(data+1))
> (it appears to work, but I was told that "it is not guaranteed to work",
> i.e., the documentation does not promise that).
> If the above code is wrong, what is TRT?

You're referring to module "Hashtable" - where's this from?

The potential problem is that you're changing the hash table while it's
being "read" - it's a matter of "definition" on what that sort of use means.
For example,

let t = Hashtbl.create 32
in
  Hashtbl.add t 1 "a"; Hashtbl.add t 2 "b"; Hashtbl.add t 3 "c";
  (* As it happens, these will print in order... *)
  Hashtbl.iter (fun k v -> Printf.printf "%d: %s\n" k v) t;
  (* Now let's mess around... *)
  let f k v =
    Printf.printf "%d: %s\n" k v;
    if k = 2
    then Hashtbl.remove t 3
  in
    Hashtbl.iter f t;;

prints three lines resulting from the initial iter but only two lines
resulting from the second iter yet the documentation for Hashtbl.iter states
"Hashtbl.iter f tbl applies f to all bindings in table tbl" - the definition
of "all bindings" is left somewhat fuzzy (probably on purpose!). Just to
make things weirder - if you re-run the code above but change the 32 to 3 in
the first line then all three bindings are printed twice because of the
different ordering of the elements in the hash table! You can get some even
"stranger" (by which I mean unpredictable) results if you come up with a
function that adds elements and so causes the table to be resized mid-way
through the iter...

	In your code, you're only replacing the data of existing bindings so
you're likely to be fine using iter (at least in the current implementation)
as the keys won't change - it's probably worthy of a big safety comment in
your code, though. "Safer" solutions include ExtLib's Hashtbl.ExtHashtbl.map
function or the Map module. Both of these have performance hits that may be
of concern - maps are on average slower than hash tables in O'Caml and
Hashtbl.map copies the table each time (though it's not clear to me why it
bothers doing that - it wouldn't take much to alter it to do an in-place
map).

	The "problem" as such is applying functional idioms such as fold and
iter to an inherently imperative data structure...


David