Function inlining and functor
Date:   (:) 
From:  Quôc_Peyrot <chojin@l...> 
Subject:  Function inlining and functor 
Hello,

I have a program with a generic function which takes a function as a parameter and calls it heavily. Something along the lines of:

    let toto f =
      (* call f a couple of million times *)

I was trying to see whether or not I could force the inlining of "f" when f is a small function. For the sake of simplicity, let's imagine we have:

    let toto f =
      let a = ref 0 in
      for i = 0 to 10 do
        a := !a + f i
      done;
      !a

    let f a = a * a

    let _ = print_endline (string_of_int (toto f))

Of course, we can see that f is not inlined in the inner loop (PPC):

    L106:
        lwz   r4, 0(r1)
        lwz   r17, 0(r4)
        mtctr r17          -> prepare the call
    L108:
        bctrl              -> call it

I tried to use a functor, hoping that it would help the compiler to inline the function:

    module type A =
    sig
      val f: int -> int
    end

    module Make (F:A) =
    struct
      let toto () =
        let a = ref 0 in
        for i = 0 to 10 do
          a := !a + F.f i
        done;
        !a
    end

    let f x = x * x

    module Mod = Make (struct let f = f end)

    let _ = print_endline (string_of_int (Mod.toto ()))

but it doesn't seem to help at all. I can still see the call in my inner loop:

    L109:
        lwz   r5, 0(r1)
        lwz   r19, 8(r5)
        lwz   r4, 0(r19)
        lwz   r17, 0(r4)
        mtctr r17
    L114:
        bctrl

I was in fact hoping to get the same results as in C++ using metaprogramming/templates:

    #include <iostream>

    using namespace std;

    template<class F>
    class Mod
    {
    public:
      int toto()
      {
        int res = 0;
        for (int i = 0; i <= 10; ++i)
          res += F::f(i);
        return res;
      }
    };

    class Foo
    {
    public:
      static int f(int i) { return i * i; }
    };

    int main(int argc, char**argv)
    {
      Mod<Foo> mod;
      cout << mod.toto() << endl;
      return 0;
    }

which gives this nice inlining:

    L15:
        mullw r0,r2,r2
        addi  r2,r2,1
        add   r4,r4,r0
        bdnz  L15
        addis r2,r31,ha16(L__ZSt4cout$non_lazy_ptr-"L00000000002$pb")
        lwz   r3,lo16(L__ZSt4cout$non_lazy_ptr-"L00000000002$pb")(r2)

Am I out of luck in getting performance similar to C++?

Thanks,

Best Regards,
Quôc
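For comparison, here is a minimal sketch of a hand-specialized variant of the loop, in which f is a known top-level function called directly rather than passed in as an argument. The name toto_specialized is hypothetical and not part of the original program. Whether ocamlopt actually inlines the call in this form is an assumption that depends on the compiler version and the -inline setting, so it would need to be checked in the generated assembly.

```ocaml
(* Small function whose definition is visible at the call site. *)
let f a = a * a

(* Hypothetical hand-specialized variant of toto: f is called directly
   instead of being received as a closure parameter. *)
let toto_specialized () =
  let a = ref 0 in
  for i = 0 to 10 do
    a := !a + f i
  done;
  !a

let () = print_endline (string_of_int (toto_specialized ()))
```

This computes the same sum of squares as the two versions above, at the cost of losing the genericity that passing f as a parameter provides.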