Re: [Caml-list] How to write a CUDA kernel in ocaml?
From: Mattias Engdegård <mattias@v...>
Subject: Re: [Caml-list] How to write a CUDA kernel in ocaml?
>And trampolines to eliminate tail calls that cannot be eliminated using goto. 
>However, trampolines are ~10x slower than TCO in the code gen.
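
For reference, the trampoline technique mentioned in the quote can be sketched roughly as follows. This is only an illustration, not the output of any particular code generator: the names bounce, step_fn, sum_step, trampoline and sum_to_trampolined are all made up here. Each "tail call" returns a description of the next call to a driver loop instead of calling directly, which keeps the C stack flat but costs an indirect call and a struct copy per step.

```c
#include <stddef.h>

/* A "bounce" describes the next call to make. All names are hypothetical. */
typedef struct bounce bounce;
typedef bounce (*step_fn)(long n, long acc);

struct bounce {
    step_fn next;  /* NULL means the computation has finished */
    long n;        /* first argument for the next step */
    long acc;      /* second argument, or the final result when next is NULL */
};

/* One step of summing 1..n. Instead of tail-calling itself,
   it returns a bounce that tells the driver what to call next. */
static bounce sum_step(long n, long acc)
{
    if (n == 0) {
        bounce done = { NULL, 0, acc };
        return done;
    }
    bounce more = { sum_step, n - 1, acc + n };
    return more;
}

/* The trampoline: a loop that keeps calling the next step
   until a step reports that it is done. */
static long trampoline(bounce b)
{
    while (b.next != NULL)
        b = b.next(b.n, b.acc);
    return b.acc;
}

long sum_to_trampolined(long n)
{
    bounce start = { sum_step, n, 0 };
    return trampoline(start);
}
```

Calling sum_to_trampolined(n) never nests more than one call deep, whatever n is and whatever the compiler's optimisation level.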

With some care, gcc's sibcall mechanism can be exploited, for example
by giving all generated C functions one standard signature and
taking care never to pass pointers to variables in the caller's stack
frame. This should give fairly good performance (better than
trampolines, anyway) at the cost of portability (though gcc itself is
highly portable). It would give full TCO, even across compilation
units. It should work well with a Cheney-on-the-MTA-style GC, too.
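
A minimal sketch of the uniform-signature idea, assuming a made-up convention in which every generated function takes three machine words and returns one. The type name value and the functions is_even, is_odd, sum_to and parity_of are hypothetical. Because caller and callee have identical signatures and no address of a local variable escapes, gcc is free to compile each tail call as a jump when sibling-call optimisation is on (-foptimize-sibling-calls, which is enabled at -O2).

```c
#include <assert.h>

/* Hypothetical uniform representation: every generated function
   takes three words and returns one, whether it needs them or not. */
typedef long value;

value is_odd(value n, value a1, value a2);

/* Mutual recursion through tail calls. Both functions share the
   standard signature, so the callee can reuse the caller's frame. */
value is_even(value n, value a1, value a2)
{
    if (n == 0)
        return 1;
    return is_odd(n - 1, a1, a2);   /* tail position: sibcall candidate */
}

value is_odd(value n, value a1, value a2)
{
    if (n == 0)
        return 0;
    return is_even(n - 1, a1, a2);  /* tail position: sibcall candidate */
}

/* Self tail call with an accumulator; the third word is unused padding
   that exists only to keep the signature standard. */
value sum_to(value n, value acc, value unused)
{
    if (n == 0)
        return acc;
    return sum_to(n - 1, acc + n, unused);
}

/* Entry point from ordinary C code. Note that nothing here passes the
   address of a local variable to the generated functions. */
value parity_of(long n)
{
    assert(n >= 0);
    return is_even(n, 0, 0);
}
```

Without optimisation these are ordinary calls and the stack grows with n, so whether deep recursion runs in constant stack space depends on the compiler flags; that is the portability cost mentioned above.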

How suitable it is depends on the reason why compilation to C is done in
the first place. It might be one of:

1) portability to odd platforms with semi-decent performance (i.e.,
   better than interpreted bytecode)
2) a simple target for maintaining bootstrapping capability for the
   compiler (but bytecode works well for this too)
3) simpler (?) interfacing to libraries in C, etc.
4) flat-out maximum performance by exploiting the optimisations that
   modern C compilers are capable of

Of course, these days we have LLVM, which has a lot going for it.