New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
OCaml + frame-pointer on + clang (and perhaps GCC) #7417
Comments
Comment author: cullmann The lines quoted above got introduced for |
Comment author: @mshinwell I think you do need 16-byte alignment here. The problem is probably that Clang has generated some instruction that faults if an operand isn't 16-byte aligned. |
Comment author: cullmann Perhaps other places modified with that #ifdef needs review, too. Given problems occur only for some instruction generations in the C compiler that is a hard to track thingy. |
Comment author: @gasche
You don't need frame pointers for usable perf profiling, you can use the DWARF information produced by (ocamlopt -g) with the --call-graph=dwarf option: https://ocaml.org/learn/tutorials/performance_and_profiling.html#UsingperfonLinux (Using perf on binaries without -g will give you per-function costs, but not aggregated by call stacks.) |
Comment author: cullmann
We tried that, like for C/C++ the resulting stack frame chains are much more imprecise and more often broken for large things. |
Comment author: @mshinwell You also need a fairly recent version of perf (and kernel maybe?) for DWARF/perf profiling. |
Comment author: cullmann
|
Comment author: @mshinwell I will investigate making a patch for this issue, and see if I can spot any other cases where the alignment is wrong. |
Comment author: cullmann I would propose a patch like: --- a/src/asmrun/amd64.S
#endif
#endif I inspected the other #ifdef'd locations, they look not wrong to me. |
Comment author: cullmann Created small pull request for that change: |
Comment author: cullmann I can now confirm that all company internal regression tests work with that again (e.g. clang + -with-frame-pointers does again the same as without the option) |
Comment author: @xavierleroy Patch merged on trunk, commit 3ad607f. OK for closing? or should it also go on 4.04 release branch? |
Comment author: cullmann If there is some 4.04.x patch release, I think it would make sense to have it in the branch for that. |
Comment author: @gasche I understand Damien's initial triaging as an indication of support for a 4.0x fix, so I'll also cherry-pick in 4.04. Thanks to Xavier for the triaging and being careful about that. |
Original bug ID: 7417
Reporter: cullmann
Assigned to: @gasche
Status: resolved (set by @gasche on 2016-11-27T19:16:36Z)
Resolution: fixed
Priority: normal
Severity: minor
Platform: Linux 64-bit
OS: openSUSE
OS Version: 42.x
Version: 4.04.0
Target version: 4.04.1+dev
Fixed in version: 4.04.1+dev
Category: back end (clambda to assembly)
Tags: patch
Monitored by: @gasche @jmeber
Bug description
We tried to use OCaml together with the $with_frame_pointers option to have usable perf profiling. As we use clang and not gcc, we were so free to activate that config in configure, too:
if test "$with_frame_pointers" = "true"; then
case "$target,$cc" in
x86_64--linux,gcc*|x86_64--linux,clang*)
nativecccompopts="$nativecccompopts -g -fno-omit-frame-pointer"
bytecccompopts="$bytecccompopts -g -fno-omit-frame-pointer"
nativecclinkopts="$nativecclinkopts -g"
echo "#define WITH_FRAME_POINTERS" >> m.h
;;
*) err "Unsupported architecture with frame pointers";;
esac
fi
Unfortunately, that leads to random segfaults e.g. during unit test execution.
(mostly inside gc routines)
As clang normally behaves just like gcc (at least for the C code the ocaml stuff contains), this looks strange. After looking bit at the code to search potential issues that could lead to this, I found in amd64.S that lines:
LBL(caml_allocN):
pushq %rax; CFI_ADJUST(8) /* save desired size /
subq %rax, %r15
CMP_VAR(caml_young_limit, %r15)
jb LBL(103)
addq $8, %rsp; CFI_ADJUST (-8) / drop desired size /
ret
LBL(103):
CFI_ADJUST(8)
RECORD_STACK_FRAME(8)
#ifdef WITH_FRAME_POINTERS
/ Do we need 16-byte alignment here ? /
ENTER_FUNCTION
#endif
call LBL(caml_call_gc)
#ifdef WITH_FRAME_POINTERS
LEAVE_FUNCTION
#endif
popq %rax; CFI_ADJUST(-8) / recover desired size */
jmp LBL(caml_allocN)
CFI_ENDPROC
=> I would say you need 16-byte alignment here, given older bugs like
https://caml.inria.fr/mantis/print_bug_page.php?bug_id=6038
=> can it be that this ifdef breaks that? This would mean that with frame pointers is broken for GCC, too, which makes that assumption, but perhaps clang more often triggers that.
Steps to reproduce
See above, patch configure and compile with clang + frame pointers (we use clang 3.9.0)
The text was updated successfully, but these errors were encountered: