New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Custom blocks are finalized more than once if stored in OCaml's weak sets (probably related to Weak.get_copy) #7279
Comments
Comment author: bobot The fact that Weak.Make uses Weak.get_copy is certainly the culprit. For the fact that 0Caml 4.03.0+beta1 didn't have the bug but 4.03.0 does, is not really determining since 4.03.0+beta1 add a big bug during weak hashset resizing 8204f5d. |
Comment author: @alainfrisch Copying custom blocks sounds rather scary indeed. What about changing Weak.get_copy implementation (and documentation) so that it returns the original value if it is a custom block. Is there a better solution? |
Comment author: sawfish If Weak.get_copy just returned the original value if it is a custom block would make it rather useless in my opinion, as this makes the value live. I am working on some OCaml bindings for an SMT solver and use a weak hashtable to lookup the solver's native terms which are stored in a custom block. This implements a restricted hash-consing that avoids having multiple alive custom blocks around that refer to the same native term. My experiments suggest that Weak.get_copy works correctly (also when storing custom blocks) in OCaml versions prior to 4.02.0. So I still hope that there is a bug in handling the weak copies that could be fixed. |
Comment author: @alainfrisch
I think it is just that until recently, custom blocks with finalizers where normally always allocated directly in the major heap, and (to be confirmed), get_copy would instead allocate them in the minor heap, and their finalizer would not be run. Could that be the case? Of course, this is not a proper solution (and it's actually very unsafe).
It makes the values that fall in the same slot live for a bit longer. Whether it leads to increased memory use or not depends on the allocation profile of your application (and the size of the weak table, and the hashing function, etc), so unless you have tried it, I would be cautious before saying it renders the Weak table "useless".
This property would not be lost, of course. |
Comment author: bobot
By reading the code, it seems that only custom block allocated with |
Comment author: @alainfrisch
Ok, do we all agree that this behavior was already wrong? get_copy can happily create a fresh copy of a custom block, which the client code can keep around while the original custom block is finalized perhaps releasing associated native resources. But the client code can still use the copy later and try to access those resources. get_copy on custom block is really too risky: it breaks the invariant that custom blocks are only allocated explicitly by the user (or through the unmarshaling function using a function provided by the user). |
Comment author: @alainfrisch POC: open Bigarray open Bigarray.Array1 let () = let a = create float64 c_layout 10 in Gc.compact (); set a 0 42.; let w = Weak.create 1 in Weak.set w 0 (Some a); let b = match Weak.get_copy w 0 with | None -> assert false | Some b -> b in Printf.printf "a.(0) = %f\n" (get a 0); Printf.printf "b.(0) = %f\n" (get b 0); Gc.compact (); assert(Weak.get_copy w 0 = None); let c = create float64 c_layout 10 in Printf.printf "b.(0) = %f\n" (get b 0); set c 0 123.; Printf.printf "b.(0) = %f\n" (get b 0); prints on my machine: a.(0) = 42.000000 b.(0) = 42.000000 b.(0) = 0.000000 b.(0) = 123.000000 The array malloced for [c] reuses the space for [a] (the OCaml value has been collected, thus freeing the underlying data). This can easily lead to segfaults. |
Comment author: bobot
I agree. I'm going to create a merge request for removing that. But should we deprecate sawfish: The problem appears already before 4.02.0, take your gcbug.ml and replace
which is a more efficient version of the code. It fails with 4.01.0 |
Comment author: @alainfrisch I'd argue to keep get_copy but change its implementation and doc so that it returns the original value when it is a custom block. This would only affect people doing weak tables on custom blocks, which should be rather rare (and more or less already broken anyway). As for using get instead of get_copy in Weak.Make, this is another question. As I reported on caml-list, LexiFi now has a local variant of Weak.Make that uses get instead of get_copy; we use it for our core contract algebra system which is performance critical and this seemed to give good results. There can of course be cases where this would increase the memory usage too much; this depends on so many factors that it is difficult to give a general recommendation. For us it worked well. |
Comment author: sawfish @Frisch: The term "useless" is too harsh, of course. After spending some more thoughts, I agree with Alain. Two observations:
These two observations make me believe that Weak.get_copy could be avoided entirely without impacting performance. But as there is no strong evidence for that claim (yet), I also suggest to keep Weak.get_copy and to modify it such that it returns the actual value in case that value is a custom block. |
Comment author: @alainfrisch Thanks @sawfish. This also confirms that one should perhaps let users choose whether get or get_copy should be used in Weak.Make. |
Comment author: bobot #710 for "don't copy custom block". Alain your POC is integrated in the testsuite. |
Comment author: bobot I created #7303 for discussing a full (utopic?) solution. |
Comment author: @alainfrisch Bumped target version. This is too late for 4.04 (and it's not a regression). |
Comment author: @xavierleroy |
Comment author: @damiendoligez Fixed by #710 |
Original bug ID: 7279
Reporter: sawfish
Status: resolved (set by @damiendoligez on 2017-02-23T15:56:30Z)
Resolution: fixed
Priority: normal
Severity: major
Version: 4.03.0
Target version: 4.05.0 +dev/beta1/beta2/beta3/rc1
Fixed in version: 4.05.0 +dev/beta1/beta2/beta3/rc1
Category: standard library
Monitored by: braibant @gasche
Bug description
If custom blocks are stored in weak sets obtained by Weak.Make, their finalizers are sometimes called more than once.
There has been a discussion on the OCaml mailing list regarding this issue:
https://sympa.inria.fr/sympa/arc/caml-list/2016-06/msg00070.html
As a bottom line, the assumption was that the behavior may be related to shallow copies that are created by Weak.get_copy during lookup operations.
The bug is reproducible since OCaml 4.02.0 but does not occur in earlier versions. As an exception, the first beta release of OCaml 4.03.0 does not exhibit the bug, whereas it reproduces in the final 4.03.0 release.
Steps to reproduce
A github repository with a minimal example that reproduces the error is available at https://github.com/martin-neuhaeusser/ocaml_bug/
The text was updated successfully, but these errors were encountered: