[Caml-list] Bigarray map & set/get (long)
Date: -- (:)
From: Chris Hecker <checker@d...>
Subject: Re: [Caml-list] Bigarray map & set/get (long)

>In the applications for which Bigarray was initially intended, the
>Caml code that manipulates directly the bigarrays isn't
>time-critical: the time-critical computations are done by external
>libraries such as BLAS, Lapack, etc.  Your matrix multiplication code
>is a good example: if you care about its performances, then you need
>to make it a lot more sophisticated so that it will be cache-friendly
>(e.g. blocking); better use an existing, well-tuned C or Fortran
>implementation than try to do your own in Caml.

The problem with this is that sometimes you don't want the discontinuity 
and inconvenience of calling to C.  Obviously, it'd be nice if we could do 
everything in ocaml from a simplicity and consistency standpoint, assuming 
it's not an infinite amount of work to get there.

There is an important middle ground between "not caring about speed" and 
"needing the highest end BLAS performance", and since CPUs are making bad 
code fast faster than they're making good code fast, the middle ground is 
moving higher up the importance ladder, and getting easier to attain.

When I looked at it a few months ago, there actually only seemed to be a few 
things needed to make bigarrays as efficient as C float * arrays 
for most operations.  I don't have my list handy, but when I get around to 
optimizing my game I hope to implement some of these in the 
compiler.  Off the top of my head, I think bounds checking made a 
measurable difference, as did the indirection in the way the bigarray 
header structures are stored on the heap (even when they're going through 
the optimized path in the compiler).  It would also be easier to write a 
lapack-style modularized matrix library if there were the concept of taking 
a "pointer" into a 1D bigarray that was lower level than the currently 
existing slice and subarray stuff (so that you can pass a pointer to a 
submatrix + a stride around).
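
To make the "pointer + stride" idea concrete, here is a rough sketch of how it could be emulated in user code today.  Everything in it (the view record, of_dims, sub, get, set, gemm_acc) is a made-up name for illustration, not part of the Bigarray library and not the compiler change described above.  It assumes row-major storage in a flat Array1 and uses Array1.unsafe_get / Array1.unsafe_set to skip the per-element bounds check once the window itself has been validated.

```ocaml
(* Hypothetical "pointer + stride" view over a flat 1D bigarray.
   None of these names exist in the Bigarray library. *)
open Bigarray

type vec = (float, float64_elt, c_layout) Array1.t

(* A submatrix view: shared storage, the offset of its top-left element,
   and the row stride (distance between consecutive rows in the parent). *)
type view = {
  data : vec;
  off : int;
  stride : int;
  rows : int;
  cols : int;
}

(* Allocate a zero-filled rows x cols matrix, stored row-major. *)
let of_dims rows cols =
  let data = Array1.create float64 c_layout (rows * cols) in
  Array1.fill data 0.0;
  { data; off = 0; stride = cols; rows; cols }

(* Take a rows x cols window whose top-left corner is (r, c) in [v].
   Nothing is copied; the window is bounds-checked once, here. *)
let sub v r c rows cols =
  if r < 0 || c < 0 || rows < 0 || cols < 0
     || r + rows > v.rows || c + cols > v.cols
  then invalid_arg "sub";
  { v with off = v.off + r * v.stride + c; rows; cols }

(* Unchecked element access: callers must keep (i, j) inside the view. *)
let get v i j = Array1.unsafe_get v.data (v.off + i * v.stride + j)
let set v i j x = Array1.unsafe_set v.data (v.off + i * v.stride + j) x

(* c <- c + a * b, as a naive triple loop that works on any views. *)
let gemm_acc a b c =
  if a.cols <> b.rows || c.rows <> a.rows || c.cols <> b.cols
  then invalid_arg "gemm_acc";
  for i = 0 to a.rows - 1 do
    for j = 0 to b.cols - 1 do
      let s = ref (get c i j) in
      for k = 0 to a.cols - 1 do
        s := !s +. get a i k *. get b k j
      done;
      set c i j !s
    done
  done
```

With something like this, a blocked multiply can hand sub-blocks to gemm_acc without copying, e.g. gemm_acc (sub a 0 0 2 2) (sub b 0 0 2 2) (sub c 0 0 2 2).  It still goes through an OCaml record and the bigarray header on every access, so it only illustrates the interface, not the performance a compiler-level version might get.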

