How to read different ints from a Bigarray?
Date: 2009-10-28 (19:05)
From: Goswin von Brederlow <goswin-v-b@w...>
Subject: Re: [Caml-list] How to read different ints from a Bigarray?
Xavier Leroy <Xavier.Leroy@inria.fr> writes:

> Goswin von Brederlow wrote:
>> I'm working on bindings for the Linux libaio library (asynchronous I/O)
>> with a sharp eye on efficiency. That means no copying must be done on the
>> data, which in turn means I cannot use string as the buffer type.
>> The best type for this seems to be a (int, int8_unsigned_elt,
>> c_layout) Bigarray.Array1.t. So far so good.
> That's a reasonable choice.

Actually, signed seems better. It is easier to get an int and mask out the
lower 8 bits to get unsigned than to sign-extend. Or?
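A minimal sketch of that trade-off, assuming a signed byte buffer. The names sbuf and sign_extend8 are made up for illustration; get_int8 and get_uint8 follow the helper names used in this thread.

```ocaml
open Bigarray

(* Hypothetical alias for a signed byte buffer. *)
type sbuf = (int, int8_signed_elt, c_layout) Array1.t

(* Reading a signed element already yields an int in -128..127. *)
let get_int8 (buf : sbuf) off = buf.{off}

(* Masking the low 8 bits gives the 0..255 interpretation. *)
let get_uint8 (buf : sbuf) off = buf.{off} land 0xFF

(* Going the other way, from an unsigned value to a signed one,
   needs a sign extension: shift the byte to the top of the int,
   then shift back arithmetically. *)
let sign_extend8 x =
  let shift = Sys.int_size - 8 in
  (x lsl shift) asr shift
```

With the signed kind, the unsigned read costs a single land, whereas the unsigned kind would need the two shifts of sign_extend8 for every signed read.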

>> Now I define helper functions:
>> let get_uint8 buf off = buf.{off}
>> let set_uint8 buf off x = buf.{off} <- x
>> But I want more:
>> get/set_int8 - do I use Obj.magic to "convert" to int8_signed_elt?
> Not at all.  If you ask OCaml's typechecker to infer the type of
> get_uint8, you'll see that it returns a plain OCaml "int" (in the
> 0...255 range). Likewise, the "x" parameter to "set_uint8" has type
> "int" (of which only the 8 low bits are used).

The point was to make get_int8 return an int in the -128..127
range and get_uint8 in the 0..255 range. That both are int doesn't

> Repeat after me: "Obj.magic is not part of the OCaml language".

Somebody else suggested creating an (int, int8_unsigned_elt,
c_layout) Bigarray.Array1.t and (int, int8_signed_elt,
c_layout) Bigarray.Array1.t and (int, int16_unsigned_elt,
c_layout) Bigarray.Array1.t and (int, int16_signed_elt,
c_layout) Bigarray.Array1.t and ... that all point to the same block
of bits. As evil as Obj.magic, I guess, but it might work nicely.

>> And endian correcting access for larger ints:
>> get/set_big_uint16
>> get/set_big_int16
>> get/set_little_uint16
>> get/set_little_int16
>> get/set_big_uint24
>> ...
>> get/set_little_int56
>> get/set_big_int64
>> get/set_little_int64
> The "56" functions look like a bit of overkill to me :-)
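For reference, the shift-based version of a few of the accessors listed above might look roughly like this. It is only a sketch: the ubuf alias is hypothetical, it assumes an unsigned byte buffer, and every byte access is still bounds-checked.

```ocaml
open Bigarray

(* Hypothetical alias for an unsigned byte buffer. *)
type ubuf = (int, int8_unsigned_elt, c_layout) Array1.t

(* Big endian: most significant byte first. *)
let get_big_uint16 (buf : ubuf) off =
  (buf.{off} lsl 8) lor buf.{off + 1}

(* Little endian: least significant byte first. *)
let get_little_uint16 (buf : ubuf) off =
  buf.{off} lor (buf.{off + 1} lsl 8)

let set_big_uint16 (buf : ubuf) off x =
  buf.{off} <- (x lsr 8) land 0xFF;
  buf.{off + 1} <- x land 0xFF

(* The signed variant reinterprets bit 15 as the sign bit. *)
let get_big_int16 (buf : ubuf) off =
  let x = get_big_uint16 buf off in
  if x land 0x8000 <> 0 then x - 0x10000 else x
```

The wider variants follow the same pattern with more bytes, which is where the amount of generated code starts to add up.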

For one part I am storing keys in there consisting of

struct Key {
  uint64_t type:8; // enum { TYPE1, TYPE2, TYPE3, ... };
  uint64_t inode:56;
  uint64_t data;
};

That gives a nice 16 bytes for a key but requires splitting the first
uint64_t into 8 and 56 bits. I could provide only get_int64 and split
that in OCaml but what the hell. A function more or less doesn't kill

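Splitting in OCaml after a single 64-bit read could be sketched as follows. Note the assumption: C leaves bit-field layout to the implementation, so this sketch simply assumes the 8-bit type sits in the least significant byte of the first word and the inode in the upper 56 bits. The function names are made up for the example.

```ocaml
(* Assumption: type = low 8 bits, inode = high 56 bits of the word.
   Check this against the actual C compiler and ABI before relying on it. *)

let key_type (w : int64) : int =
  Int64.to_int (Int64.logand w 0xFFL)

(* A logical shift keeps the 56-bit inode non-negative. *)
let key_inode (w : int64) : int64 =
  Int64.shift_right_logical w 8

(* Inverse operation: pack type and inode back into one word. *)
let make_key_word ~(typ : int) ~(inode : int64) : int64 =
  Int64.logor (Int64.shift_left inode 8) (Int64.of_int (typ land 0xFF))
```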
>> What is the best way there? For uintXX I can get_uint8 each byte and
>> shift and add them together. But that feels inefficient as each access
>> will range check
> Not necessarily.  OCaml 3.11 introduced unchecked accesses to
> bigarrays, so you can range-check yourself once, then perform
> unchecked accesses.  Use with caution...
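A sketch of that pattern, using Bigarray.Array1.unsafe_get after one explicit range check. The ubuf alias and the function name are illustrative only, and the result assumes a 64-bit OCaml where int is wide enough to hold an unsigned 32-bit value.

```ocaml
open Bigarray

(* Hypothetical alias for an unsigned byte buffer. *)
type ubuf = (int, int8_unsigned_elt, c_layout) Array1.t

let get_big_uint32 (buf : ubuf) off =
  (* One range check covering all four bytes. *)
  if off < 0 || off > Array1.dim buf - 4 then
    invalid_arg "get_big_uint32: offset out of bounds";
  (* The reads below are unchecked; they are only safe because of
     the test above. *)
  let b i = Array1.unsafe_get buf (off + i) in
  (b 0 lsl 24) lor (b 1 lsl 16) lor (b 2 lsl 8) lor b 3
```

Keeping the check and the unchecked reads in the same small function makes it easier to audit that every unsafe access is covered.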

I'm always very cautious of such things. In the existing code I already
needed some unsafe_string that I really didn't like. Need to add
phantom types to get rid of them some day.

>> and the shifting generates a lot of code while CPUs
>> can usually endian-correct an int more elegantly.
>> Is it worth the overhead of calling a C function to write optimized
>> stubs for this?
> The only way to know is to benchmark both approaches :-(  My guess is
> that for 16-bit accesses, you're better off with a pure Caml solution,
> but for 64-bit accesses, a C function could be faster.
> - Xavier Leroy

Writing benchmark code, writing, writing. Now where is that big-endian
CPU to test converting from little endian? :)))
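A rough starting point for such a benchmark, timing only the pure OCaml shift-based read. The harness name time_it and the iteration count are arbitrary choices for this sketch; a C stub variant would be passed to the same harness for comparison.

```ocaml
open Bigarray

(* Pure OCaml candidate: big-endian 16-bit read from two byte reads. *)
let get_big_uint16
    (buf : (int, int8_unsigned_elt, c_layout) Array1.t) off =
  (buf.{off} lsl 8) lor buf.{off + 1}

(* Hypothetical harness: run [f] [iters] times and report CPU time.
   The xor checksum keeps the calls from being optimised away. *)
let time_it name iters f =
  let t0 = Sys.time () in
  let acc = ref 0 in
  for i = 1 to iters do
    acc := !acc lxor f i
  done;
  let dt = Sys.time () -. t0 in
  Printf.printf "%s: %.3fs (checksum %d)\n" name dt !acc;
  dt

let () =
  let buf = Array1.create int8_unsigned c_layout 4096 in
  Array1.fill buf 0x5A;
  (* [i land 4094] keeps both byte reads inside the 4096-byte buffer. *)
  ignore (time_it "shift-based uint16" 1_000_000
            (fun i -> get_big_uint16 buf (i land 4094)))
```

Sys.time measures processor time with limited resolution, so the iteration count needs to be large enough for the difference between candidates to show.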