Race condition in caml_get_raw_backtrace #6554
Comments
Comment author: @diml The code changed in 4.02 but the bug is still there: the rest of the array contains NULL pointers instead of garbage.
Comment author: @mshinwell Fix committed to the 4.02 branch, rev. 15210; and trunk, rev. 15211.
Comment author: @lefessan I think you were a bit fast to commit the fix. Would it be possible to give us a little time to review the fix before committing it? In particular, I don't understand why you are not saving the backtrace in a "malloced" space, so that it won't change during the "caml_alloc".
Comment author: @mshinwell Well, two people have carefully reviewed the fix, and we used a test case to prove with reasonable certainty that it is fixed... I don't understand your comment about the malloced space. The backtrace is saved on the stack across the allocation.
Comment author: @lefessan Sorry, I read your comment and Jeremy's in the wrong order (i.e. I understood that your commit didn't fix the problem), and I looked only at the last sentence of your bug report, not at the code itself. Since you are saving the data on the stack, I assume that this function is not called from the top of the stack when a Stack_overflow is raised, but only after the stack has already been unwound, just to be sure?
Comment author: @mshinwell I believe we only end up in this function if the user asks for the backtrace; in particular, this isn't the function that stashes the backtrace. As such I think we should be ok if stack space is tight when the exception actually occurs.
Original bug ID: 6554
Reporter: @diml
Assigned to: @mshinwell
Status: closed (set by @xavierleroy on 2016-12-07T10:34:42Z)
Resolution: fixed
Priority: normal
Severity: major
Version: 4.02.0
Target version: 4.02.1+dev
Fixed in version: 4.02.1+dev
Category: runtime system and C interface
Monitored by: @gasche @yakobowski
Bug description
We were getting random segfaults in one of our systems. After some investigation, it turns out they are due to a race condition in caml_get_raw_backtrace:
    res = caml_alloc(caml_backtrace_pos, 0);
    if(caml_backtrace_buffer != NULL) {
      intnat i;
      for(i = 0; i < caml_backtrace_pos; i++)
        Field(res, i) = Val_Codet(caml_backtrace_buffer[i]);
    }
caml_alloc might run a minor collection. The minor collection might run finalisers, which might raise and catch exceptions, modifying the current backtrace. If [caml_backtrace_pos] ends up smaller because of this, the loop fills fewer slots than were allocated and the end of [res] is garbage.
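The race and the "save on the stack across the allocation" approach can be illustrated with a small self-contained toy model. This is only a sketch, not the committed patch: every `toy_` name is hypothetical, plain `int` arrays stand in for OCaml values, and `toy_alloc` simply pretends that a finaliser always replaces the recorded backtrace with a shorter one. It zero-fills the result, which mirrors the NULL-pointer behaviour reported for 4.02.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define TOY_BUFFER_SIZE 8

/* Hypothetical stand-ins for the runtime's global backtrace state. */
static int toy_backtrace_buffer[TOY_BUFFER_SIZE];
static size_t toy_backtrace_pos;

/* Pretend an exception was raised and a 4-entry backtrace was recorded. */
static void toy_record_backtrace(void)
{
    for (size_t i = 0; i < 4; i++)
        toy_backtrace_buffer[i] = (int)(i + 1);
    toy_backtrace_pos = 4;
}

/* Hypothetical allocator: zero-fills the result and, like a minor collection
   running a finaliser that raises, replaces the global backtrace with a
   shorter one (a single entry, -1). */
static void toy_alloc(int *out, size_t n)
{
    memset(out, 0, n * sizeof *out);
    toy_backtrace_buffer[0] = -1;
    toy_backtrace_pos = 1;
}

static size_t toy_count_zero(const int *a, size_t n)
{
    size_t zeros = 0;
    for (size_t i = 0; i < n; i++)
        if (a[i] == 0)
            zeros++;
    return zeros;
}

/* Racy pattern from the report: the size is read before the allocation,
   but the loop re-reads the globals after it. */
size_t toy_unfilled_after_racy(void)
{
    int res[TOY_BUFFER_SIZE];
    toy_record_backtrace();
    size_t n = toy_backtrace_pos;
    toy_alloc(res, n);
    for (size_t i = 0; i < toy_backtrace_pos; i++)
        res[i] = toy_backtrace_buffer[i];
    return toy_count_zero(res, n);
}

/* Snapshot pattern: copy the backtrace to the stack first, then allocate,
   then fill the result from the private copy only. */
size_t toy_unfilled_after_snapshot(void)
{
    int res[TOY_BUFFER_SIZE];
    int saved[TOY_BUFFER_SIZE];
    toy_record_backtrace();
    size_t n = toy_backtrace_pos;
    assert(n <= TOY_BUFFER_SIZE);
    memcpy(saved, toy_backtrace_buffer, n * sizeof saved[0]);
    toy_alloc(res, n);
    memcpy(res, saved, n * sizeof saved[0]);
    return toy_count_zero(res, n);
}
```

In this model the racy version leaves three of the four slots unfilled, and the one slot it does fill comes from the finaliser's backtrace rather than the original one. The snapshot version fills every slot from the backtrace that was current before the allocation.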
We'll push a fix today to at least avoid the segfault. This is still not completely satisfactory, as it shows again that when you get a backtrace, you might get a completely random one.
Additional information
Here is a program that reproduces the bug, to be compiled with 'ocamlopt -g -inline 0':
    let () = Printexc.record_backtrace true
    let finaliser _ = try raise Exit with _ -> ()
    let create () =
      let x = ref () in
      Gc.finalise finaliser x;
      x
    let f () = raise Exit
    let () =
      let minor_size = (Gc.get ()).Gc.minor_heap_size in
      while true do
        Gc.minor ();
        try
          ignore (create () : unit ref);
          f ()
        with _ ->
          for i = 1 to minor_size / 2 - 1 do
            ignore (ref ())
          done;
          ignore (Printexc.get_backtrace () : string)
      done