int31 type for Bigarray #4909
Comments
Comment author: goswin The reason I limited myself to 31-bit ints and want to use (int, int, c_layout) Array1.t is speed. The overhead for int32 in the code is substantial, even when just converting to/from int32 on array read/write. The code runs a lot slower on 64-bit.
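(A minimal sketch, not from the original thread, illustrating the conversion overhead goswin describes: with an int32 Bigarray every element access goes through an Int32 conversion, whereas a native-int Bigarray reads unboxed values directly. Function names are hypothetical.)

```ocaml
open Bigarray

(* Summing an int32 Bigarray: Int32.to_int on every read *)
let sum_int32 (a : (int32, int32_elt, c_layout) Array1.t) =
  let s = ref 0 in
  for i = 0 to Array1.dim a - 1 do
    s := !s + Int32.to_int a.{i}   (* conversion on each access *)
  done;
  !s

(* Summing a native-int Bigarray: no conversion needed *)
let sum_int (a : (int, int_elt, c_layout) Array1.t) =
  let s = ref 0 in
  for i = 0 to Array1.dim a - 1 do
    s := !s + a.{i}                (* unboxed read *)
  done;
  !s
```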
Comment author: @gasche Sorry for the non-reaction. I had a look at the patch and it seems reasonable (no regression potential, useful feature). I'm not very familiar with the Bigarray stuff, so a second opinion would be worthwhile, but I think this feature could be integrated upstream. Doing so would first require rebasing it against the new Caml-side implementation of Bigarrays, using GADTs. Would you be willing to do that? (Of course you could decide to wait for firmer opinions about integration before investing more work.) Also, reading the patch, there are some aspects of the design that I found unclear -- but it's probably my own unfamiliarity with the problem domain. You describe this new type of stored integers as

> Small Caml integers (signed, 31 bits on 32-bit architectures,

but isn't the purpose to have code that behaves in the same way on 32-bit and 64-bit architectures? If that is the case, wouldn't it make more sense to document only a 31-bit width? And possibly, normalize to a 31-bit integer even when running on a 64-bit machine, to make the result of writing then reading back reproducible on both architectures?
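(A minimal sketch, not part of the thread, of the normalization gasche suggests: truncating a native int to a signed 31-bit value by shifting out and sign-extending the high bits. The function name is hypothetical.)

```ocaml
(* Normalize a native int to a signed 31-bit value.  On a 64-bit
   machine Sys.int_size is 63, so this drops the 32 high bits and
   sign-extends from bit 30; on a 32-bit machine Sys.int_size is 31,
   the shift is 0, and the function is the identity. *)
let norm31 (x : int) : int =
  let shift = Sys.int_size - 31 in
  (x lsl shift) asr shift
```

With this in the read/write path, writing then reading back an element would produce the same value on both architectures, at the cost of two shifts per access.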
Comment author: @alainfrisch I can see the value of not specifying a 31-bit truncation on 64-bit machines (for performance purposes), but then the naming is problematic. Would the truncation degrade performance significantly?
Comment author: @paurkedal You could make the naming clear by adopting something analogous to the C99 stdint.h names like int_least32_t, i.e. e.g. int_least31_elt.
Comment author: goswin The idea was to have the same memory footprint and data format on 32-bit and 64-bit without computational overhead. What happens on overflow or underflow is undefined, so leaving the extra bit on 64-bit does not change much. Undefined is undefined. I haven't measured truncating versus not truncating; as the behaviour is already undefined on 32-bit systems, I simply didn't care. On the other hand, the extra bit could be useful on 64-bit systems if one doesn't care about compatibility with 32-bit. I'm willing to take another look at the Bigarray code and rebase / update the patch. I also like the idea of using int_least31_elt to better reflect that 31 bits are guaranteed but 64-bit actually has 32 bits. Should I go with that?
Comment author: goswin #654 has been closed, so closing this one too.
Original bug ID: 4909
Reporter: goswin
Status: acknowledged (set by @damiendoligez on 2009-12-08T16:31:49Z)
Resolution: open
Priority: normal
Severity: feature
Category: otherlibs
Tags: patch
Monitored by: tgazagna goswin @hcarty
Bug description
I currently have the problem that I want to read/write 31-bit signed integer values to a Bigarray. The Bigarray acts as a buffer that a C library stores to disk or restores from it.
On a 32-bit CPU I can just use (int, int, c_layout) Array1.t, but on a 64-bit CPU I need to use (int32, int32, c_layout) Array1.t. This means I need two flavours of the code depending on the number of bits the CPU has.
It would be nice if there were an (int, int31_elt) kind in Bigarray allowing the use of int on both 32-bit and 64-bit platforms.
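(A minimal sketch of the duplication the request describes; the int31_elt kind is the proposed feature, not an existing API, and the function names are hypothetical.)

```ocaml
open Bigarray

(* 32-bit platform: native int already matches the 31-bit on-disk format *)
let make_buffer_32 n : (int, int_elt, c_layout) Array1.t =
  Array1.create int c_layout n

(* 64-bit platform: must fall back to int32 to keep the same
   memory footprint and data format *)
let make_buffer_64 n : (int32, int32_elt, c_layout) Array1.t =
  Array1.create int32 c_layout n
```

With the proposed kind, a single `Array1.create int31 c_layout n` (hypothetical) would return an (int, int31_elt, c_layout) Array1.t on both platforms.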