How to read different ints from a Bigarray?
From: Goswin von Brederlow <goswin-v-b@w...>
Subject: Re: [Caml-list] How to read different ints from a Bigarray?
Gerd Stolpmann <gerd@gerd-stolpmann.de> writes:

> On Wednesday, 28.10.2009, at 14:54 +0100, Goswin von Brederlow wrote:
>> Hi,
>> 
>> I'm working on bindings for the Linux libaio library (asynchronous I/O) with
>> a sharp eye on efficiency. That means no copying must be done on the
>> data, which in turn means I cannot use string as the buffer type.
>> 
>> The best type for this seems to be a (int, int8_unsigned_elt,
>> c_layout) Bigarray.Array1.t. So far so good.
>> 
>> Now I define helper functions:
>> 
>> let get_uint8 buf off = buf.{off}
>> let set_uint8 buf off x = buf.{off} <- x
>> 
>> But I want more:
>> 
>> get/set_int8 - do I use Obj.magic to "convert" to int8_signed_elt?
>> 
>> And endian correcting access for larger ints:
>> 
>> get/set_big_uint16
>> get/set_big_int16
>> get/set_little_uint16
>> get/set_little_int16
>> get/set_big_uint24
>> ...
>> get/set_little_int56
>> get/set_big_int64
>> get/set_little_int64
>> 
>> What is the best way there? For uintXX I can get_uint8 each byte and
>> shift and add them together. But that feels inefficient as each access
>> will range check and the shifting generates a lot of code while CPUs
>> can usually endian correct an int more elegantly.
>> 
>> Is it worth the overhead of calling a C function to write optimized
>> stubs for this?
>> 
>> And last:
>> 
>> get/set_string, blit_from/to_string
>> 
>> Do I create a string where needed and then loop over every char
>> calling s.[i] <- char_of_int buf.{off+i}? Or better a C function using
>> memcpy?
>> 
>> What do you think?
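
For reference, the pure-OCaml byte-wise variant described in the question could look roughly like the sketch below. It is only an illustration: the type alias `buf`, the signed variants and `blit_to_bytes` are names made up here, and `Bytes` is assumed as the destination type (on current compilers strings are immutable). Signed values are obtained by sign extension of the unsigned read, so no Obj.magic is needed.

```ocaml
open Bigarray

(* Hypothetical alias for the buffer type named in the question. *)
type buf = (int, int8_unsigned_elt, c_layout) Array1.t

(* Unsigned byte access, as in the original helpers. *)
let get_uint8 (buf : buf) off = buf.{off}
let set_uint8 (buf : buf) off x = buf.{off} <- x

(* Signed byte: sign-extend the unsigned value. *)
let get_int8 (buf : buf) off =
  let x = buf.{off} in
  if x >= 0x80 then x - 0x100 else x

(* Store only the low 8 bits, so -1 is written as 0xff. *)
let set_int8 (buf : buf) off x = buf.{off} <- x land 0xff

(* 16-bit reads assembled from two bytes; each access is range checked. *)
let get_big_uint16 (buf : buf) off =
  (buf.{off} lsl 8) lor buf.{off + 1}

let get_little_uint16 (buf : buf) off =
  buf.{off} lor (buf.{off + 1} lsl 8)

let get_big_int16 (buf : buf) off =
  let x = get_big_uint16 buf off in
  if x >= 0x8000 then x - 0x10000 else x

(* Copy [len] bytes from the bigarray into [dst], one char at a time. *)
let blit_to_bytes (buf : buf) off dst dst_off len =
  for i = 0 to len - 1 do
    Bytes.set dst (dst_off + i) (Char.chr buf.{off + i})
  done
```

Wider integers follow the same pattern with more shifts. Whether this is fast enough compared to a C stub would have to be measured.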
>
> A C call is too expensive for a single int (and ocamlopt). The runtime
> needs to fix the stack and make it look C-compatible before it can do
> the call. Maybe it's ok for an int64.
>
> Can you ensure that you only access the ints at word boundaries? If so,
> it would be an option to wrap the same malloc'ed block of memory with
> several bigarrays, e.g. you use an (int, int8_unsigned_elt, c_layout)
> Bigarray.Array1.t when you access on byte level, but an (int32,
> int32_elt, c_layout) Bigarray.Array1.t when you access on int32
> level, but both bigarrays would point to the same block and share data.
> This is trivial to do from C, just create several wrappers for the same
> memory.

I actually need 512 byte aligned (better page aligned) data so that is
definitely a possibility if only aligned access is required.
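
With an aligned base address, a byte offset only maps onto a wider view if the offset itself is a multiple of the element size. A small sketch of that mapping follows; the function name `index_of_byte_offset` is hypothetical, and `width` is assumed to be a power of two (2, 4 or 8).

```ocaml
(* Translate a byte offset into an index of a view whose elements are
   [width] bytes wide. Assumes [width] is a power of two and that the
   underlying block itself is suitably aligned. *)
let index_of_byte_offset ~width off =
  if off land (width - 1) <> 0 then
    invalid_arg "index_of_byte_offset: unaligned offset"
  else
    off / width
```

For example, byte offset 512 corresponds to index 128 of an int32 view, while offset 513 is rejected.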

> The nice thing about bigarrays is that the compiler can emit assembly
> instructions for accessing them. Much faster than picking bytes and
> reconstructing the ints on the caml side. However, if you cannot ensure
> aligned ints the latter is probably unavoidable.

So a.{i} <- x is not a C call. That is good to know.
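
As far as I understand it, this only holds when the element kind and layout are known at compile time; with a fully polymorphic bigarray type the compiler has to fall back to a generic access routine. A minimal sketch with an explicit type annotation (the function `sum_bytes` is just an example name):

```ocaml
open Bigarray

(* The annotation fixes kind and layout, which gives ocamlopt the
   information it needs to compile buf.{i} as a direct memory access. *)
let sum_bytes (buf : (int, int8_unsigned_elt, c_layout) Array1.t) =
  let total = ref 0 in
  for i = 0 to Array1.dim buf - 1 do
    total := !total + buf.{i}
  done;
  !total
```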

That leaves only the problem of endian conversion. I guess I could
live with reading the int and shifting the bytes around for the rare
cases where the endianness of CPU and data differ. I might not even
bother providing that since I don't need it at all.
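
Reading the int and shifting the bytes around could be sketched as follows for the 32-bit case. The names `swap32`, `be32_of_native` and `get_big_int32` are made up for this example; `Sys.big_endian` is assumed to be available to detect the host byte order.

```ocaml
open Bigarray

(* Reverse the four bytes of an int32. *)
let swap32 (x : int32) : int32 =
  let open Int32 in
  let b0 = logand x 0xffl
  and b1 = logand (shift_right_logical x 8) 0xffl
  and b2 = logand (shift_right_logical x 16) 0xffl
  and b3 = logand (shift_right_logical x 24) 0xffl in
  logor (shift_left b0 24)
    (logor (shift_left b1 16) (logor (shift_left b2 8) b3))

(* Interpret a natively loaded value as big-endian data: swap only
   when the host is little-endian. *)
let be32_of_native x = if Sys.big_endian then x else swap32 x

(* Aligned big-endian read from an int32 view of the buffer. *)
let get_big_int32 (a : (int32, int32_elt, c_layout) Array1.t) i =
  be32_of_native a.{i}
```

On a host whose byte order matches the data the conversion is the identity, so the swap only costs something in the rare mismatched case.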

> Btw, I would be interested in your aio bindings if you do them as open
> source project.

See other mail. There is also a libfuse-ocaml that uses libaio-ocaml
(although that source is already in git instead of svn) if you want to
see some more extensive use than the test.ml.

> Gerd

Regards
        Goswin