Version française
Home     About     Download     Resources     Contact us    
Browse thread
Alternative generic hash function
[ Home ] [ Index: by date | by threads ]
[ Search: ]

[ Message by date: previous | next ] [ Message in thread: previous | next ] [ Thread: previous | next ]
Date: -- (:)
From: Francois Pottier <Francois.Pottier@i...>
Subject: Re: Alternative generic hash function

Hello,

Manuel Fähndrich wrote:

> I'd rather have a generic hash function (Hashtbl.hash_pure) that never
> looks into mutable structures

Currently, mutable and immutable structures cannot be distinguished at
runtime. This would require maintaining additional type information,
and would probably introduce a lot of overhead.

> I frequently use hashing with equality being ==. Whenever I do this, I
> need to actually add some unique identifier to the data structure just
> for the hashing operation. Having an alternate hash function
> (Hashtbl.hash_pure) would make such id's (and their managment)
> unnecessary.

This unique identifier trick is a common one, and, unfortunately, it
appears to be necessary. How would you expect this alternate hash
function to be implemented? Clearly, it will also have to add unique
tags to every value. (A value's address cannot be used as a hash code,
since it may change during garbage collection. Physical addresses can
only be compared for (dis)equality, but their numeric value is
meaningless.) Better let the programmer do this on a case-by-case basis
than add this overhead to every value creation.

Basile Starynkevitch wrote:

> Unfortunately I cannot make an hashtable using hash_symbol as hashing
> function

You can, but you have to have to choose a fixed (monomorphic) value
for the 'a parameter, e.g.

  module M = Hashtbl.Make(struct
        type t = int symbol_t
        let equal x1 x2 = x1.sym_name = x2.sym_name
        let hash = hash_symbol
      end);;

'a cannot remain unspecified, otherwise you would be allowed to create
a table with heterogeneous keys.

In fact, in your particular example, I don't see why you would want
to use values of type symbol_t as keys: obviously, you are only using
the symbol's name as a key. Why not use a hash table whose keys are
symbol names, and store the sym_val information someplace else?

-- 
François Pottier
Francois.Pottier@inria.fr
http://pauillac.inria.fr/~fpottier/