> I have some code for processing ISO-10646 characters and UTF-8,
> which uses caml integers. ISO-10646 has 2^31 code points, which
> can be covered by caml integers on a 32bit machine. Using an
> unboxed type is mandatory for performance.
OCaml 3.00 includes three new library modules, Int32, Int64 and
Nativeint, implementing (boxed) 32-bit, 64-bit and platform-native
integers, resepctively. (Platform-native integers are 32 bits on 32
bit processors and 64 bits on 64 bit processors). The native-code
compiler was modified to inline the operations on those types,
including elimination of unnecessary boxing/unboxing, like for floats.
That may or may not be efficient enough for your application.
> Unfortunately, caml integers are signed, which makes most of the
> code I have written wrong (I haven't taken the care to handle
> integers over 2^30 correctly).
Actually, on 2's-complement machines at least, arithmetic operations
over usigned integers are exactly identical to those over signed
integers of the same size, except divisio, modulus, and
comparisons <, >, <=, >=. So, for your application, Caml's "int"
type could be good enough, although you may need special comparison
functions (which you can write in C, using casts to unsigned long int,
or in Caml, by treating the sign bit specially).
- Xavier Leroy
This archive was generated by hypermail 2b29 : Thu Mar 23 2000 - 13:42:41 MET