OCAML native code profiler for x86+gcc

From: Aleksey Nogin (nogin@cs.cornell.edu)
Date: Fri Jul 17 1998 - 02:42:29 MET DST

Date: Thu, 16 Jul 1998 20:42:29 -0400
From: Aleksey Nogin <nogin@cs.cornell.edu>
To: caml-list@inria.fr
Subject: OCAML native code profiler for x86+gcc


I wrote a small patch that makes possible to produce the profiled code
using ocamlopt. Of course even without any patch you can always pass
"-ccopt -p" option to ocamlopt and get some profiling information, but
with "-ccopt -p" you would only get the "flat" profile, while with my
patch you will also get the full call graph, including the information
    - how many times each function was called
    - which functions were called from some particular function and how
many times
    - how much time was spent in each function
    - how much time was spent inside each function including the
function itself, its children, grandchildren, grand-grandchildren, etc.;
how this time is distributed among the children functions
    - which functions called some particular function

I wrote my patch to mimic the behavior of "gcc -pg" - the patch ocamlopt
works in exactly the same way as the usual ocamlopt unless it is given
the "-p" option. When given the "-p" option, the patched ocamlopt:
    - inserts calls to the counter function, mcount, at the beginning of
each function
    - passes the "-pg" function to gcc
    - links the code against libasmprun library, which is the profiled
version of the usual libasmrun - it is compiled with "-pg" option and it
also has i386.p.S (which is the profiled "by hands" version of i386.S)
compiled in.

I tested my patch only on RedHat Linux 5.1 system, but it will probably
work fine on any other x86 + gcc system and I believe that it should not
be too hard to port it for use with other C compilers (you will probably
need to change "-pg" to "-p" and "mcount" to something else) or even to
some other hardware architecture.

Of course there is NO WARRANTY - it works for me, but you are using it
at your own risk.

To use the profiler:
0) Get my patch from
http://www.cs.cornell.edu/nogin/patches/ocaml-1.07-profile.patch.txt and
recompile ocaml
1) Recompile your code with "-p" option (you do not have to recompile
.mli files, only .ml files).
2) Run your code. It should generate the "gmon.out" file.
3) Run "gprof <program executable>".

Please note:
    - The profiled binary will run much slower and the timing
information would not be completely accurate.
    - You would not get much information for function that were compiled
without "-p" option (e.g. standard library)
    - If you want to get very accurate call graph (at expense of less
accurate timing information), use "-p -compact -inline 0".
    - If some function is called more than 2^32 times, the profiling
information becomes wrong. This happens especially fast when you compile
you code with "-compact" option and everybody starts calling caml_alloc
functions (caml_alloc, caml_alloc2, caml_alloc3 and caml_alloc4).

Home Page: http://www.cs.cornell.edu/nogin/
E-Mail: nogin@cs.cornell.edu (office), ayn2@cornell.edu (home)
Office: Upson 5162, tel: 1-607-255-7421

This archive was generated by hypermail 2b29 : Sun Jan 02 2000 - 11:58:15 MET