You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
Original bug ID: 7537 Reporter:@stedolan Status: resolved (set by @xavierleroy on 2017-10-15T14:35:04Z) Resolution: not a bug Priority: normal Severity: minor Category: runtime system and C interface Monitored by:@gasche@ygrek
Bug description
Hashtbl.hash hashes the contents of closures, including their code pointers:
# Hashtbl.hash (fun x -> x);;
- : int = 710785908
# Hashtbl.hash (fun x -> x);;
- : int = 168456855
Pervasives.compare does not throw an exception on function values which are physically equal:
# let id x = x;;
val id : 'a -> 'a = <fun>
# compare id id;;
- : int = 0
This makes hashtables act strangely:
# let h = Hashtbl.create 20;;
val h : ('_a, '_b) Hashtbl.t = <abstr>
# Hashtbl.add h id 42;;
- : unit = ()
# Hashtbl.find h id;;
- : int = 42
# Hashtbl.find h (fun x -> x);;
Exception: Not_found.
# Hashtbl.add h (fun x -> x) 10;;
- : unit = ()
# Hashtbl.find h id;;
- : int = 42
# Hashtbl.find h (fun x -> x);;
Exception: Not_found.
In particular, the result of Hashtbl.find will depend on whether the optimiser combines the different 'fun x -> x' values. The bytecode toplevel, shown here, does not.
Futher, Hashtbl.find can sometimes raise Invalid_argument, again at the mercy of the optimiser:
# let add a b = a + b;;
val add : int -> int -> int = <fun>
# let f1 = add 1;;
val f1 : int -> int = <fun>
# let f2 = add 1;;
val f2 : int -> int = <fun>
# Hashtbl.hash f1;;
- : int = 881405119
# Hashtbl.hash f2;;
- : int = 881405119
# Hashtbl.add h f1 10;;
- : unit = ()
# Hashtbl.find h f2;;
Exception: Invalid_argument "compare: functional value".
The text was updated successfully, but these errors were encountered:
I think that Hashtbl.hash should return the same hash for all functions, without inspecting them (so as to ignore those parts of a complex data structure which are closures when computing a hash), and that Pervasives.compare should always raise when asked to compare functions, even if the functions are physically equal.
In the example above, this would cause all calls to Hashtbl.find to raise Invalid_argument.
(This issue was reported by Nicolas Ojeda Bar. Also, this comment should be part of the issue, but I don't know how to edit issues)
I don't have the same conclusion. I agree that compare should always fail, but it is perfectly ok for me for hash to behave that way. The only sensible equality on functions is physical equality, and hash f is equal to hash g when f == g.
Having hash behaving the way you suggest might turn linear code into quadratic one.
The correct way to do what you suggest is
module M = struct
type t = int -> int
let equal = (==)
let hash = Hashtbl.hash
end
module T = Hashtbl.Make(M)
Pervasives.compare should always raise when asked to compare functions, even if the functions are physically equal
This is the "NaN comparison problem" all over again. The functions may be deep inside a data structure. With your requirements it becomes impossible to ever shortcut comparisons ("if x == y then compare x y = 0"). We've already sacrificed shortcutting "=" tests in the name of NaNs; we're not going to sacrifice Pervasives.compare tests in the name of function closures.
The sensible thing to do is the custom hash table shown in @chambart's reply.
I move to resolve/no change this PR as I don't see anything we could do here.
Original bug ID: 7537
Reporter: @stedolan
Status: resolved (set by @xavierleroy on 2017-10-15T14:35:04Z)
Resolution: not a bug
Priority: normal
Severity: minor
Category: runtime system and C interface
Monitored by: @gasche @ygrek
Bug description
Hashtbl.hash hashes the contents of closures, including their code pointers:
Pervasives.compare does not throw an exception on function values which are physically equal:
This makes hashtables act strangely:
In particular, the result of Hashtbl.find will depend on whether the optimiser combines the different 'fun x -> x' values. The bytecode toplevel, shown here, does not.
Futher, Hashtbl.find can sometimes raise Invalid_argument, again at the mercy of the optimiser:
The text was updated successfully, but these errors were encountered: