time profiling and nested function inlining
- Quôc_Peyrot
Date: 2006-12-06 (03:43)
From: Quôc_Peyrot <chojin@l...>
Subject: time profiling and nested function inlining

Hello,

I have two questions. Sorry if they have already been asked, but I searched through the archives and couldn't find the answers:

- I tried to do some time profiling (Mac OS X, ppc G4) but for some reason it doesn't seem to work. I compiled with OCamlMakefile using the command line "make profiling-native-code". When I execute the program, it does generate gmon.out, but when I run gprof the only thing I get is:

                                      called/total       parents
    index  %time    self descendents  called+self    name           index
                                      called/total       children

                    0.00        0.00       1/1           ___fixunsdfdi [476]
    [21]     0.0    0.00        0.00       1         ___stub_getrealaddr [21]
    -----------------------------------------------

and

      %   cumulative   self              self     total
     time   seconds   seconds    calls  ms/call  ms/call  name
      0.0       0.00     0.00        1     0.00     0.00  ___stub_getrealaddr [21]

Thus, I'm wondering whether or not time profiling is supported on PPC G4. If it is, can someone give me some clues to debug this issue? If it isn't, I would appreciate it if someone could suggest alternative solutions (apart from using an Intel computer ;) )

- I was looking at the asm output to get familiar with efficient coding style, and I tried the following example:

    let rec log2_acc value acc =
      if value = 0 then acc
      else log2_acc (value lsr 1) (acc + 1)

    let log2 value = log2_acc value 0

which compiles to (using "ocamlopt -inline 100 -unsafe -S"):

    _camlTest_regular__log2_acc_57:
    L101:
            cmpwi   r3, 1
            bne     L100
            mr      r3, r4
            blr
    L100:
            srwi    r5, r3, 1
            ori     r3, r5, 1
            addi    r4, r4, 2
            b       L101
            .globl  _camlTest_regular__log2_60
            .text
            .align  2
    _camlTest_regular__log2_60:
    L102:
            li      r4, 1
            b       _camlTest_regular__log2_acc_57

Although log2_acc could have been inlined (which might not be beneficial in this case anyway), it looks quite ok.
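As a reading aid for the assembly above: ocamlopt represents an int n as the machine word 2n + 1, which would explain why the listing compares against 1, loads 1, and adds 2 where the source says 0, 0, and + 1. The sketch below checks that correspondence on ordinary OCaml ints. The helper "tagged" and the name "log2_nested" are made up for this illustration and are not part of the original post.

```ocaml
(* Top-level version, as in the message. *)
let rec log2_acc value acc =
  if value = 0 then acc else log2_acc (value lsr 1) (acc + 1)

let log2 value = log2_acc value 0

(* Nested version: same logic, with the helper defined locally. *)
let log2_nested value =
  let rec go value acc =
    if value = 0 then acc else go (value lsr 1) (acc + 1)
  in
  go value 0

(* Hypothetical helper: the machine word that stands for the OCaml
   int [n] in native code, i.e. 2n + 1 (lowest bit is the tag). *)
let tagged n = (n lsl 1) lor 1

let () =
  (* "cmpwi r3, 1" tests value = 0; "li r4, 1" loads acc = 0. *)
  assert (tagged 0 = 1);
  (* "addi r4, r4, 2" is acc + 1 carried out on tagged words. *)
  assert (tagged 5 + 2 = tagged 6);
  (* "srwi r5, r3, 1" then "ori r3, r5, 1" is value lsr 1:
     shift the whole word, then put the tag bit back. *)
  assert ((tagged 12 lsr 1) lor 1 = tagged (12 lsr 1));
  (* Both source variants compute the same result. *)
  assert (log2 8 = log2_nested 8);
  Printf.printf "log2 8 = %d\n" (log2 8)
```

Note that this function counts the number of shifts needed to reach zero, so log2 8 evaluates to 4 rather than 3.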

But when I tried with a nested function:

    let log2 value =
      let rec log2_acc value acc =
        if value = 0 then acc
        else log2_acc (value lsr 1) (acc + 1)
      in
      log2_acc value 0

I got the following output:

    _camlTest_nested__2:
            .long   _caml_curry2
            .long   5
            .long   _camlTest_nested__log2_acc_59
            .globl  _camlTest_nested__log2_acc_59
            .text
            .align  2
    _camlTest_nested__log2_acc_59:
    L101:
            cmpwi   r3, 1
            bne     L100
            mr      r3, r4
            blr
    L100:
            srwi    r5, r3, 1
            ori     r3, r5, 1
            addi    r4, r4, 2
            b       L101
            .globl  _camlTest_nested__log2_57
            .text
            .align  2
    _camlTest_nested__log2_57:
    L102:
            addis   r4, 0, ha16(_camlTest_nested__2)
            addi    r4, r4, lo16(_camlTest_nested__2)
            li      r4, 1
            b       _camlTest_nested__log2_acc_59

I'm wondering what these computations before the call are (frame?) and why the compiler couldn't get rid of them. Not that I am utterly concerned by these small extra computations... I'm just curious.

Thanks for the help/explanations,

-- 
Best Regards,
Quôc