New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
add caml_weak_get(), caml_weak_set() and caml_weak_check() to caml/*.h #7173
Comments
Comment author: @mshinwell If 4.03 has changed the memory layout of the weak blocks, this should be documented somewhere. Is it? |
Comment author: bobot I think it is not. Since weak.h is not in the public include and nthing is said about the layout of weak blocks in the documentation I though nobody used weak pointer from C. In 4.03 weak.h is still not public but define the following constant:
This interface is too basic, and really not for the public, but I think we don't want either to give access to In 4.03 a bug (it is in the changes) when two weak arrays contains the same value have been fixed. It make To sum-up, I think weak have never been given a real C API. Still something should be added to the changes at least for politeness about the change in layout and access. Since now accessing weak pointers (not due to ephemerons but due to the fix for weak) is a little more complicated, people should not try to access them themselves. So the question, should we try to give a C interface for weak/ephemerons this late for 4.03 since people are using weak from C? Or could we wait for 4.04? In anycase I'm going to try to make a PR for a tentative C API. |
Comment author: @mshinwell I would suggest that for 4.03 we just document the change. |
Comment author: @stedolan I would prefer to avoid documenting the memory layout for weak pointers, since it's going to change again in multicore. A C API to access weak pointers sounds like a good idea, though. |
Comment author: bobot Oh, really? I understood there was nothing problematic with weak pointers for multicore. Though I added for the fix for weak pointers a new phase |
Comment author: goswin I'm not sure how everybody else has handled dynamic C structures with closures in harmony with the Gc but Weak.t is the only way I see to make that work so memory isn't leaked. So some form of interface is needed. A non-allocating one would be preferable. Should be save to assume nobody stores "Some nullptr" in a Weak.t. While you change Weak.t it would be real nice if the Weak.t still contained the value while the finaliser for that value runs. After all the code is allowed to store the value somewhere to make it alive again. Makes no sense to me that the Weak.t is (temporarily?) empty when the object then remains alive in the end. |
Comment author: @mshinwell I don't think this needs to be classified as urgent, although it needs addressing. |
Comment author: bobot
If it was the case, it is not the case anymore in 4.03, the addition of the clean phase makes the semantic of weak pointer more straightforward. |
Comment author: goswin I marked it as urgent because 4.03 will break code where people have figured out how to access a Weak.t like me. Just like they figured out how to access a 'a option or a reference. |
Comment author: bobot
I disagree, it is not the same thing. In 4.02 ocaml manual chapter "19.3 Representation of OCaml data types", the layouts of structure and algebraic datatypes are described. It is not the case for Weak.t. |
Comment author: @gasche I think that's a rather fine line -- I mean I think that both bobot and goswin are right in this case. Whether or not it was reasonable to exploit Weak.t's representation in the first place (bobot is right that it was not), the idea of proposing a better interface at the same time that the representation changes is good. François, would you be able to write a library implementing this interface without too much work? If so, then it becomes a question of whether we have the "change budget" to have it in 4.03. My gut feeling is that we haven't (it's too late), but couldn't we try to distribute the library as a separate library during the 4.03 release? Is it easy for people to rely on C libraries distributed through ocamlfind? If so, I think it's worth trying to do it. |
Comment author: bobot Gasche, it is a nice idea to target the library for 4.04 and when it is merged in trunk propose a C library through opam that implement the API. I started, the first part is to rename |
Comment author: @gasche My idea was to propose the interface as an external library that would be available for <4.03 versions (legacy interface), =4.03 version (new interface), and >4.03 (shim library if this becomes available upstream, or nothing at all if .h include paths work this way). But you know better about this stuff. |
Comment author: @jhjourdan I really don't understand why this would be impossible to put this library in the 4.03 release. This is really no more than a few dozen LOC, all the technical meat is already written for caml_weak_set/get... |
Comment author: @gasche Well, you should just ask Damien? Also, if either you or François could actually write this dozen of lines, it would be a good basis for further discussion. (If you wait for a decision it may be too late.) |
Comment author: @jhjourdan Cf. #509 I did export only simple getters and setters. We could also export copy getters, creation function, etc... |
Comment author: bobot The longest is to create tests for this API. Here a first try:
I also perhaps found a bug concerning the get_data of an ephemeron but I have not yet created a reproduction test. |
Comment author: @jhjourdan Your attempt is cleaner than mine. Perhaps a few comments:
|
Comment author: bobot
If you can write tests for my attempt in testsuite, at the same time I will take into account your comments, write changelog, ... And then the discussion for 4.03 or 4.04 could happen. |
Comment author: @jhjourdan I'm working on the tests. Concerning the API: I am not really happy with the fact that the functions return Val_unit if the cell is empty. What if the cell does contain Val_unit?
|
Comment author: bobot
|
Comment author: @jhjourdan
Done: bobot#3 I'm working on factorizing the code between C and OCaml functions. |
Comment author: bobot
We should not do the same work in parallel :). I said I'm doing it. I though you said you were working on the tests. |
Comment author: @jhjourdan
Sure. It's just that some functions are missing (e.g., ephemeron_(un)set_data), and others are clearly wrong (e.g., ephemeron_set_key), so that was not easy to make my tests work ;), so I prefered working on the code. |
Comment author: goswin Returning Val_unit == Val_int(0) if unset is clearly not going to work since users can easily set that. I can easily see someone using 'type t = Foo | Bar | Baz of double | Bla of int and t Weak.t'. And then you fail after Weak.set (Some Foo). What was wrong with the idea of returning a nullptr in the C interface if unset? What is this about the key having to be a registered root? value res; caml_weak_array_get(weak, index, &res); should work, or at least CAMLlocal(res); caml_weak_array_get(weak, index, &res); |
Comment author: bobot
I want the API to return valid values, otherwise you can't put it without care in a value registered with CAMLparam* or CAMLlocal* (see the warning of section 19.7.2 "Applying OCaml closures from C")
The documentation of the module Weak states that you can't put inside not allocated values. I will add failwith and asserts for that case in weak.c (I think they are already some) because
|
Comment author: @jhjourdan
That's it : CAMLlocal registers a local root. |
Comment author: @jhjourdan I have done the tests (basically copying ephetest, with one additional test to improve coverage): |
Comment author: @jhjourdan
Damien said me it seems fine. |
Comment author: bobot GPR at #515 |
Comment author: goswin
I think it's enough if it is reworded. "registered root" sounds like you have to caml_regsiert_root() the value. Just state which functions may/will allocate or a blanked all of them may allocate and then the usual rules for living in harmony with the Gc apply.
|
Comment author: bobot
Good idea, I will add this information.
|
Comment author: @jhjourdan
Except that guessing whether something is going to be allocation dynamically or pre-allocated by the compiler requires knowing the implementation details of the compilers optimizations. |
Comment author: bobot
The question was not between allocated and preallocated values, I think. Even if it is indeed true that preallocated values will be forever alive ("needed"), it is not the same problem than with integers. The number of such value is statically fixed so no problem of unbounded growth. |
Comment author: goswin You might have a point for weak hashset and weak hashtbl, BUT
|
Comment author: @jhjourdan
I think this is not a good idea. You want to be able to use Weak arrays and Ephemeron without worrying about the actual in-memory representation of keys and data. As for the behavior: either we consider non-allocated values as always alive or we consider them as always dead. I'd rather take a conservative choice by making them dead (i.e., writing an int in a weak cell immediately empties the cell -> we do not get out of memory), but since this is not the historical behavior, let's stay as we are. As goswin said, using int as a key for an ephemeron is silly, and constant constructors are in bounded numbers, so they represent the same danger as statically allocated values. |
Comment author: bobot
What about using Z.t of Zarith as key? Is it silly?
I agree that it is intellectually more satisfying to be able to use with any types a polymorphic types. But I don't understand how somebody could use Weak arrays and Ephemeron without worrying about the actual in-memory representation of the keys when it could mean complete opposite semantics. (for data I agree we should accept any value)
In which way is it? The documentation forbid it, and I think they are/were in some place of the runtime asserts that check that the value is allocated. Moreover I'm not sure if in all the algorithms the case not Block have been though. If one force me to choose between alive or dead, I will choose dead because it is indeed conservative, easier to implement (you don't even have to put the value in the key, since you can remove it immediately). In fact it means turning an actual exception into a silent bug. |
Comment author: @jhjourdan
Right, but are you sure that nobody exploits this behavior ?
Oh really? |
Comment author: goswin You can't just delete it. That would break expectatzions. Consider a function that both caches computations and uses them for a final answere: let do_with x y z = Now the code suddenly has to worry that x might be removed from w despite clearly being still alive. Conservative is to assume non-allocated things are always alive. You don't know when they go out of scope. |
Comment author: bobot I don't understand your example; Expectations are not broken because that work the same for allocated values:
Nothing tells you that your are not in the |
Comment author: goswin sorry, bad example. I wanted to write: let do_with x y z = This will always work if z is allocated but with the above suggestion will suddenly fail if z is not allocated while z is clearly still alive. |
Comment author: bobot This example is sound. But I think there is similar example in the other way around that break expectations. If we take the suggestion that not-allocated values are always alive: let do_with x y fresh = This will always work if z is allocated but with the above suggestion will suddenly fail if z is not allocated while z is clearly dead. That's why I don't like to define the aliveness of non-allocated values. |
Comment author: goswin I found it surprising that z is actually dead there. All that code is still inside the scope of "let z = ...". The Gc actually frees z before it goes out of scope because it isn't used again. So one can make a case either way. But given that the dead case requires the use of an explicit Gc.major call to consistently fail could we consider the alive case the more realistic one and make Weak.t behave that way? Keep a note that non-allocated values are bad because the aliveness can't be determined and that they are never removed from a Weak.t. |
Comment author: goswin While we are at it. What is the behaviour of Weak.t with C pointers? Or generally values allocates outside the ocaml heap? Does the removal of totaly naked pointers change things? |
Comment author: @mshinwell @bobot Could you reply to goswin about his question (which is from March...)? |
Comment author: bobot
I think that we are not going to agree on the behavior for non-allocated values, but since that was not the problem stated initially, I think it can be closed. |
Comment author: @mshinwell Superceded by #515 |
Original bug ID: 7173
Reporter: goswin
Status: closed (set by @mshinwell on 2016-12-08T13:31:18Z)
Resolution: duplicate
Priority: normal
Severity: feature
Version: 4.02.3
Category: runtime system and C interface
Monitored by: @gasche bobot @hcarty
Bug description
At the moment it is impossible to access a Weak.t from a C stub in a proper manner because the memory layout of Weak.t is undocumented and there are not accessor functions being exported. Current behaviour even departs from what is documented in the wild.
Complicating things it seems that with 4.03 the memory layout of Weak.t changes, moving all items down one index to make room for the ephemeral field or something. Every C stub using a Weak.t will have to be fixed. Now would be the right time to provide accessor functions to people can access Weak.t properly in the future.
The text was updated successfully, but these errors were encountered: