How to handle endianness and binary string conversion for 32 bits integers (Int32)?

From:  Nicolas George <nicolas.george@e...> 
Subject:  Re: [Camllist] How to handle endianness and binary string conversion for 32 bits integers (Int32)? 
On octidi 28 Prairial, year CCXIII, David MENTRE wrote:

> 1. convert between big and little endian 32 bits integers;

Don't do that.

> 2. convert between 32 bits integers and string binary representation
> (to store integers in Buffer and string data structures);

What you mean to do is represent an integer in a bounded interval as a fixed-length sequence of finite-valued objects. Said that way, children learn how to do it in school: it is writing the number in some base. Since bytes in a string can take 256 values, one will obviously use base 256.

The first (rightmost) "digit" will be (n mod 256). The second "digit" will be ((n / 256) mod 256). The third "digit" will be ((n / (256 * 256)) mod 256). The fourth (leftmost) "digit" will be ((n / (256 * 256 * 256)) mod 256). And so on, but since your numbers are less than 256 * 256 * 256 * 256, all remaining "digits" are 0.

So all you have to do is store these four bytes in your string, in any order you may prefer. "Big endian" is when you store the fourth, the third, the second and the first; it is the nearest to the way we humans write numbers, and the lexical order is the same as the numeric order. "Little endian" is when you store the first, the second, the third and the fourth. But, and this is important, none of this depends on the hardware it runs on: it is purely arithmetic.

The reverse operation is simply

n = d1 + d2 * 256 + d3 * 256 * 256 + d4 * 256 * 256 * 256

> 3. detect machine endianness at runtime.

Don't do that. To elaborate: there is no guarantee that numbers are stored in either big or little endian. I have heard that architectures exist where 8-bit bytes within 16-bit words are in little endian, but 16-bit words within 32-bit words are in big endian, which gives 3412 as a global byte order. Using the internal representation of integers can therefore never be reliable. On the contrary, compilers guarantee that arithmetic within a reasonable interval is true Peano arithmetic, on all architectures.
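The arithmetic above can be sketched in OCaml. This is a minimal illustration, not code from the original message: it extracts the four base-256 "digits" of an Int32 with logical shifts and masks rather than `/` and `mod` (so that negative values are handled correctly), and the function names `encode_be`, `encode_le`, and `decode_be` are mine.

```ocaml
(* Sketch: pure-arithmetic (un)packing of an Int32, independent of host
   byte order. Digit i is ((n lsr (8*i)) land 0xFF), i.e. the base-256
   "digit" described in the text. *)

(* i-th base-256 digit of n, 0 being the rightmost (least significant). *)
let digit (n : int32) (i : int) : char =
  Char.chr (Int32.to_int (Int32.logand (Int32.shift_right_logical n (8 * i)) 0xFFl))

(* Big endian: store the fourth, third, second, then first digit. *)
let encode_be (n : int32) : string =
  String.init 4 (fun i -> digit n (3 - i))

(* Little endian: store the first, second, third, then fourth digit. *)
let encode_le (n : int32) : string =
  String.init 4 (fun i -> digit n i)

(* Reverse operation: n = d1 + d2*256 + d3*256*256 + d4*256*256*256,
   reading the big-endian string left to right. *)
let decode_be (s : string) : int32 =
  let d i = Int32.of_int (Char.code s.[i]) in
  List.fold_left
    (fun acc i -> Int32.add (Int32.shift_left acc 8) (d i))
    0l [0; 1; 2; 3]
```

For example, `encode_be 1l` is `"\000\000\000\001"` while `encode_le 1l` is `"\001\000\000\000"`, and `decode_be` inverts `encode_be` for any Int32, including negative ones, since Int32 arithmetic wraps modulo 2^32.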
Using the internal representation of numbers may save a few cycles on the packing/unpacking, but that is probably negligible compared with whatever will be done with the data afterwards (disk access or network traffic, for example). Furthermore, if you have to worry about inverting the byte order of the number, the gain will be even smaller.