| Date: | -- (:) |
| From: | Dawid Toton <d0@w...> |
| Subject: | Portable hash |
I'm looking for a way to calculate hashes of values of various types. It has to:

1. be a proper function (i.e. x = y => hash x = hash y),
2. not change with the platform or compiler version,
3. ideally have practically no collisions.

For some time I did not need to move data across machines, and what I have been (wrongly) using so far is:

    let hash x = Digest.string (Marshal.to_string x [])

Recently I realized that this is incorrect: Marshal.to_string is not a pure function of the value, so it doesn't satisfy my first requirement. Indeed, in some very rare cases I got different results from the same value. Only the third point is satisfied very well.

I have to change my function before I run into problems, and I'm considering:

1) Small hashes:

    let hash x = Hashtbl.hash_param large_int large_int x

Does anybody know what the properties of Hashtbl.hash_param are? I mean: is the implementation stable? Can I get a good distribution when all the data is examined (large parameters)? Unfortunately it returns an int of platform-dependent length (and perhaps even the less significant bits of the result are platform-dependent?). How hard would it be to tailor it to, say, always work with 31 bits?

2) Serializing values with Sexp:

    let hash to_sexp x = Digest.string (string_of_sexp (to_sexp x))

This way I get stability, because I can just keep the implementation of Sexp untouched. But the performance is going to be even worse than with my original solution.

Has anybody already solved this (or a similar) problem?

Dawid Toton
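
To illustrate the problem described above, here is a minimal sketch using only the standard library. It shows one way Marshal.to_string can give different bytes for structurally equal values: with the default flags it records physical sharing inside the value. The sketch also shows a small hash of fixed width derived from a digest of a hand-written canonical encoding. The names hash_no_sharing, small_hash_of_string, point, encode_point and hash_point are hypothetical and not from the original message, and sharing may not be the only cause of the differences reported there.

```ocaml
(* The hash from the message: MD5 of the Marshal output, default flags. *)
let hash_marshal x = Digest.string (Marshal.to_string x [])

(* Hypothetical variant: sharing detection disabled. Note that
   Marshal does not terminate on cyclic values with this flag. *)
let hash_no_sharing x =
  Digest.string (Marshal.to_string x [Marshal.No_sharing])

(* Two structurally equal pairs that differ only in physical sharing. *)
let s = String.make 3 'x'
let shared = (s, s)
let unshared = (String.make 3 'x', String.make 3 'x')

(* Hypothetical small hash: fold the first four digest bytes into an int
   and keep the low 30 bits, which fit a native int on both 32-bit and
   64-bit platforms. *)
let small_hash_of_string str =
  let d = Digest.string str in
  let b i = Char.code d.[i] in
  ((b 0 lsl 24) lor (b 1 lsl 16) lor (b 2 lsl 8) lor b 3) land 0x3FFFFFFF

(* Hypothetical example type with a hand-written canonical encoding:
   the name is length-prefixed, each coordinate is followed by ','. *)
type point = { name : string; coords : int list }

let encode_point p =
  let buf = Buffer.create 64 in
  Buffer.add_string buf (string_of_int (String.length p.name));
  Buffer.add_char buf ':';
  Buffer.add_string buf p.name;
  List.iter
    (fun c ->
      Buffer.add_string buf (string_of_int c);
      Buffer.add_char buf ',')
    p.coords;
  Buffer.contents buf

let hash_point p = small_hash_of_string (encode_point p)

let () =
  Printf.printf "structurally equal:      %b\n" (shared = unshared);
  Printf.printf "same hash (default):     %b\n"
    (hash_marshal shared = hash_marshal unshared);
  Printf.printf "same hash (No_sharing):  %b\n"
    (hash_no_sharing shared = hash_no_sharing unshared);
  Printf.printf "small hash of a point:   %d\n"
    (hash_point { name = "origin"; coords = [0; 0] })
```

The canonical encoding plays the same role as the Sexp route in option 2: the hash depends only on bytes that the program itself defines, not on the Marshal format or on the width of the native int. The price is one encoder per type and a 30-bit result, which will collide far more often than a full digest.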