OCaml image blending performance
| Date: | -- (:) |
| From: | Mauricio Fernandez <mfp@a...> |
| Subject: | Re: [Caml-list] OCaml image blending performance |
On Thu, Feb 07, 2008 at 11:01:54AM +0000, Richard Jones wrote:
> On Wed, Feb 06, 2008 at 11:34:02PM +0000, Jon Harrop wrote:
> > In this case, most of the speed loss can be regained by simply
> > aggressively inlining everything, which is exactly what you (ab)used
> > C macros for in the C code.
>
> I don't understand this. In 'blend2.ml' (which I was responsible for)
> C macros are used to inline all the OCaml functions the same as in the
> C version. Yet it's still 70% slower than the C version.
>
> My suspicion was that it was to do with his use of a string as a byte
> array.

I get a 30% speedup by unrolling the BLEND loop and performing some
additional CSE (just getting rid of that extra (j_+3), actually).

I think the major culprit is 31-bit integer arithmetic: there are lots of
"orl $1, ...", "sarl $1, ...", decl and incl in the generated code.

If Bigarray.Array1 had unsafe_get and unsafe_set, operating on int32 values
could be faster (ocamlopt is often smart enough to do without boxed int32s).
Currently, bigarray bound checking is performed even with -unsafe, however.

--
Mauricio Fernandez - http://eigenclass.org
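
To illustrate where the extra "orl $1", "sarl $1", decl and incl instructions can come from, here is a rough sketch that models OCaml's tagged integer representation in OCaml itself. An int n is stored as the machine word 2n+1, which is why ints are 31 bits wide on a 32-bit platform. The function names (tag, untag, tagged_add, tagged_sub, tagged_mul) are made up for this illustration and are not part of any library; this is only a model of the arithmetic, not the code ocamlopt actually emits for the blend loop.

```ocaml
(* Model of the tagged representation: the OCaml int n is held
   in a machine word as 2n+1 (lowest bit always set). *)

(* Tagging: shift left by one, then set the low bit
   (the "orl $1" pattern). *)
let tag n = (n lsl 1) lor 1

(* Untagging: arithmetic shift right by one
   (the "sarl $1" pattern). *)
let untag r = r asr 1

(* (2x+1) + (2y+1) = 2(x+y) + 2, so one must be subtracted
   to restore the tag (a "decl"-style correction). *)
let tagged_add a b = a + b - 1

(* (2x+1) - (2y+1) = 2(x-y), so one must be added back
   (an "incl"-style correction). *)
let tagged_sub a b = a - b + 1

(* Multiplication: clear the tag of one operand (giving 2x),
   untag the other (giving y), multiply, then restore the tag:
   2x * y + 1 = 2(xy) + 1. *)
let tagged_mul a b = (a - 1) * (b asr 1) + 1
```

Each arithmetic operation on tagged values carries one or two fix-up instructions like these, and in a tight per-byte loop those fix-ups are a noticeable share of the work. Unboxed int32 arithmetic needs none of them, which is the point of the Bigarray remark above.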