New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Memory leak in ocaml runtime with backtraces+threads #7220
Comments
Comment author: @gasche I'm tentatively setting the target for 4.03; we're very late in the release cycle, so if the bugfix turns out to be invasive we may have to delay it for the next release, but I hope there is a simple and safe solution. |
Comment author: bobot
The problem seems more complicated. |
Comment author: robhoes Thanks, I was grepping for I suppose that the |
Comment author: @gasche I looked at the issue a bit as well. I don't know anything about threads, but my understanding is that the code in |
Comment author: @gasche Here is my understanding of what is happening right now. caml_thread_{enter,leave}_blocking_section follow a bracketing discipline where
st_stubs.c also has a "master runtime lock" that protects the global I will call "thread-specific backtrace buffer" the value of the Upon creating a new thread, the thread-specific backtrace buffer is The memory leak happens as the following events happen in order
The memory leak happens between steps (3) and (4): a new backtrace (Possible ideas on how to fix the leak in a separate message.) |
Comment author: @gasche Possible fixes:
|
Comment author: @gasche I just uploaded a patch implementing approach (5), that fixes the leak on the provided reproduction code on my machine. Interestingly, this allows to remove some lines that would save curr_thread->stack_low at the very same point of code for the exact same reason -- save the full thread-local state would have avoided the present bug: https://github.com/ocaml/ocaml/blob/5401ce8/otherlibs/systhreads/st_stubs.c#L462-L466 |
Comment author: @gasche I think that the proposed patch fixes the issue as reported, but I wonder whether there aren't similar bug in other parts of the codebase:
For (1) I think that no change is required: when you unregister a thread from the C code, you must already be inside the blocking section, which means that enter_blocking_section should have been called since the last time OCaml code was running, so the thread-local data should be up-to-date with respect to the global runtime state. For (2) I also think that no change is required. The thread_kill is the one that does cleanup the thread-local backtrace buffer; if the thread being killed is the current thread, it first activates another thread |
Comment author: bobot @gasche as usual well investigated and well explained! Remarks on the patch:
FWIW I think it is small enough for 4.03. |
Comment author: braibant Looking at https://github.com/ocaml/ocaml/blob/5401ce8/otherlibs/systhreads/st_stubs.c#L362 don't you need to perform the cleaning of the other threads there too? That is, why isn't line 372 a call to caml_thread_remove_info? |
Comment author: @gasche bobot: oops, indeed this is a leftover, I'll remove this line, thanks! braibant: my understanding is that this function is only called after fork(), only in the newly forked (child) process. This means that there is no need (and it would be wrong) to reclaim and cleanup here: the threads may still be used in the parent process, and will be cleaned up there. |
Comment author: robhoes Fantastic, thanks for the quick investigation and patch, gasche! This seems to work for me on my example program. This issue turned out to be quite a big deal for XenServer. We are currently using OCaml 4.02.2, and your patch applies there as well. I am currently testing it to see if our memory leaks go away. |
Comment author: @damiendoligez OK for 4.03. |
Comment author: robhoes My testing of today suggests that this patch fixes the leak in the xenopsd daemon of XenServer as well. Thanks again! |
Comment author: @gasche I committed the fix in 4.03: (bobot: in the comment I renamed "stack-local variables" into "runtime state", and accordingly caml_thread_{save,restore}thread_state are now caml_thread{save,restore}_runtime_state) Thinking more about braibant's remark, I'm not sure I understand myself why freeing the thread structure at the beginning of the forked child's life suffices: if the backtrace buffers are copied at fork time, then we should free the thread structure in depth (also free the backtrace buffers reachable from them) instead of shallowly as is done currently. However, I have tried to produce a reproduction case for a leak and I was not able to: the code at the end of this comment will leak without the uploaded patch, but does not leak anymore with the proposed patch. I suspect that the reason why just a shallow free of the thread structure suffices for the child may be the copy-on-write implementation of fork on my system (and, I would assume, most fork-supporting systems): the thread-specific backtrace buffers would only actually be copied if they were modified in either the parent or the child's copy of the memory. This is impossible in the child, as they are discarded at child-birth time; in theory we could have a race where the parent process mutates its own copies of the backtrace buffer during the fork, turning the child's buffers into unreclaimed copy, but I don't know how to reproduce that. Here is the code I tried (assuming that the child's copy of the parent's thread-local backtrace buffers would not be freed) and which does not leak on my system. If someone knows how to find a system where fork() performs a deep copy instead of copy-on-write, it would be interesting to try there. let create_backtrace () = try raise Not_found with Not_found -> () let _ = |
Comment author: junsli Just curious: What does the "st" mean in the interface file? It mentioned "st interface." poSix Threads? Standard Threads? |
Comment author: @gasche I would guess "system threads", by opposition with the green threads implementation where the scheduling is implemented by the OCaml runtime. |
Original bug ID: 7220
Reporter: robhoes
Assigned to: @gasche
Status: closed (set by @xavierleroy on 2017-09-24T15:32:22Z)
Resolution: fixed
Priority: normal
Severity: minor
Platform: amd64
OS: Linux, Mac OS X
Version: 4.02.3
Target version: 4.03.0+dev / +beta1
Fixed in version: 4.03.0+dev / +beta1
Category: runtime system and C interface
Related to: #4702 #5188
Monitored by: lindig braibant @hcarty @dbuenzli
Bug description
An OCaml program with backtrace recording enabled leaks memory each time a new thread is created and completes.
Steps to reproduce
The following example program, compiled with the
-g
option, will very quickly eat up all your memory:let _ =
Printexc.record_backtrace true;
let rec loop () =
let t = Thread.create (fun () ->
try
raise Not_found
with Not_found ->
()
) () in
Thread.join t;
loop ()
in
loop ()
Additional information
I’ve tried this on various compilers (4.02.3, 4.01.0 and 3.12.1), on Linux and Mac, native and bytecode, and see the problem on all of them.
Valgrind reports that the allocation of
caml_backtrace_buffer
in the functioncaml_stash_backtrace
in asmrun/backtrace_prim.c is the culprit (https://github.com/ocaml/ocaml/blob/trunk/asmrun/backtrace_prim.c#L98):if (caml_backtrace_buffer == NULL) {
Assert(caml_backtrace_pos == 0);
caml_backtrace_buffer = malloc(BACKTRACE_BUFFER_SIZE
* sizeof(backtrace_slot));
if (caml_backtrace_buffer == NULL) return;
}
The
caml_backtrace_buffer
variable is a global variable initialised to NULL in byterun/backtrace.c. However, it seems to be local to a thread (how does it become thread local?), is set to NULL whenever a thread is created (where?), and a new malloc happens when an exception is raised. It is not freed when the thread completes.File attachments
The text was updated successfully, but these errors were encountered: