Version française
Home     About     Download     Resources     Contact us    
Browse thread
How to read different ints from a Bigarray?
[ Home ] [ Index: by date | by threads ]
[ Search: ]

[ Message by date: previous | next ] [ Message in thread: previous | next ] [ Thread: previous | next ]
Date: -- (:)
From: Gerd Stolpmann <gerd@g...>
Subject: Re: [Caml-list] How to read different ints from a Bigarray?

Am Mittwoch, den 28.10.2009, 14:54 +0100 schrieb Goswin von Brederlow:
> Hi,
> 
> I'm working on binding s for linux libaio library (asynchron IO) with
> a sharp eye on efficiency. That means no copying must be done on the
> data, which in turn means I can not use string as buffer type.
> 
> The best type for this seems to be a (int, int8_unsigned_elt,
> c_layout) Bigarray.Array1.t. So far so good.
> 
> Now I define helper functions:
> 
> let get_uint8 buf off = buf.{off}
> let set_uint8 buf off x = buf.{off} <- x
> 
> But I want more:
> 
> get/set_int8 - do I use Obj.magic to "convert" to int8_signed_elt?
> 
> And endian correcting access for larger ints:
> 
> get/set_big_uint16
> get/set_big_int16
> get/set_little_uint16
> get/set_little_int16
> get/set_big_uint24
> ...
> get/set_little_int56
> get/set_big_int64
> get/set_little_int64
> 
> What is the best way there? For uintXX I can get_uint8 each byte and
> shift and add them together. But that feels inefficient as each access
> will range check and the shifting generates a lot of code while cpus
> can usualy endian correct an int more elegantly.
> 
> Is it worth the overhead of calling a C function to write optimized
> stubs for this?
> 
> And last:
> 
> get/set_string, blit_from/to_string
> 
> Do I create a string where needed and then loop over every char
> calling s.(i) <- char_of_int buf.{off+i}? Or better a C function using
> memcpy?
> 
> What do you think?

A C call is too expensive for a single int (and ocamlopt). The runtime
needs to fix the stack and make it look C-compatible before it can do
the call. Maybe it's ok for an int64.

Can you ensure that you only access the int's at word boundaries? If so,
it would be an option to wrap the same malloc'ed block of memory with
several bigarrays, e.g. you use an (int, int8_unsigned_elt, c_layout)
Bigarray.Array1.t when you access on byte level, but an (int32,
int32_unsigned_elt, c_layout) Bigarray.Array1.t when you access on int32
level, but both bigarrays would point to the same block and share data.
This is trivial to do from C, just create several wrappers for the same
memory.

The nice thing about bigarrays is that the compiler can emit assembly
instructions for accessing them. Much faster than picking bytes and
reconstructing the int's on the caml side. However, if you cannot ensure
aligned int's the latter is probably unavoidable.

Btw, I would be interested in your aio bindings if you do them as open
source project.

Gerd
-- 
------------------------------------------------------------
Gerd Stolpmann, Bad Nauheimer Str.3, 64289 Darmstadt,Germany 
gerd@gerd-stolpmann.de          http://www.gerd-stolpmann.de
Phone: +49-6151-153855                  Fax: +49-6151-997714
------------------------------------------------------------