Version française
Home     About     Download     Resources     Contact us    
Browse thread
Hashtbls with physical equality?
[ Home ] [ Index: by date | by threads ]
[ Search: ]

[ Message by date: previous | next ] [ Message in thread: previous | next ] [ Thread: previous | next ]
Date: -- (:)
From: Marcin 'Qrczak' Kowalczyk <qrczak@k...>
Subject: Re: [Caml-list] Hashtbls with physical equality?
Wheeler Ruml <ruml@parc.com> writes:

> OK, fair enough.  I guess there isn't any static "address-like" identity to
> objects in OCaml, so I really do have to examine at least a few bits of the
> object to compute a hash value.

Indeed.

Serialization uses an internal hash table hashed by addresses. It can
do this because it's implemented in C and performs no GC during
serialization, so address don't change. It would be impossible in pure
OCaml.

For my language Kogut I borrowed the idea of stable names from Glasgow
Haskell, strengthening its guarantees a bit (GHC does not guarantee
uniqueness of stable names, because Haskell is lazy and does not have
a well-defined concept of object identity which does not change during
evaluation).

Here they are called object ids. An object id is a hashable value
representing the identity of an arbitrary object, such that a given
object each time yields the same object id.

An object id doesn't keep the original object alive, can't be used
to access it, and is not kept alive by the object either. If an
object id is garbage collected, the new object id of the same object
will not necessarily have the same hash value.

This means that an object id must be kept alive during the time when
the corresponding object is used in a hash table, and thus it cannot
be used to obtain a universal "object -> hash" function (which does
not use extra memory for the duration of the life of the object,
otherwise it's possible with the help of weak references).

Fortunately the design of my hash tables makes such universal hashing
function unnecessary. If you want to use a different equality
predicate, you don't specify the predicate with a corresponding hash
function. Instead you specify a function which transforms keys into
something which is comparable and hashable using standard generic
functions for equality and hashing. Transformed keys are stored inside
the hash table and used for lookup.

So you can make a case-insensitive dictionary of strings by supplying
a transformation function which folds case, or a dictionary based on
object identity by supplying the ObjectId function.

The implementation of object id uses an internal hash table based on
physical addresses, which is rehashed on each GC (actually two tables,
one per generation).

-- 
   __("<         Marcin Kowalczyk
   \__/       qrczak@knm.org.pl
    ^^     http://qrnik.knm.org.pl/~qrczak/