On Thu, Mar 23, 2000 at 01:08:54PM +1100, Max Skaller wrote:
> Sven LUTHER wrote:
> >
> > On Wed, Mar 22, 2000 at 09:22:15AM +1100, John Max Skaller wrote:
> > > I have some code for processing ISO-10646 characters and UTF-8,
> > > which uses caml integers. ISO-10646 has 2^31 code points, which
> > > can be covered by caml integers on a 32bit machine. Using an
> > > unboxed type is mandatory for performance.
> > >
> > > Unfortunately, caml integers are signed, which makes most of the
> > > code I have written wrong (I haven't taken the care to handle
> > > integers over 2^30 correctly).
> > >
> > > What is the best way to handle this problem?
> > > Would a (standard?) library module (written in C), that treats
> > > integers as unsigned be a reasonable solution?
> > >
> > > [This may require writing 'uint_add x y' instead of 'x+y',
> > > but that doesn't matter in the above mentioned application,
> > > since the integers are being used to represent characters]
> >
> > Just use the caml integers and ignore the fact that they are signed?
> >
> > Following the motto: "that doesn't matter in the above mentioned application".
>
> Perhaps my explanation was unclear. In my code, I must
> calculate a UTF-8 encoding from an ISO-10646 code point,
> and calculate an ISO-10646 code point from a UTF-8 encoding.
>
> The code is below. The code works for values <2^30,
> but fails when an int goes negative.
>
> I would be happy to replace, in this code,
> every use of 'lor', 'land', + - * < etc with
> 'ulor' 'uland' 'uplus' 'uminus' 'uless' etc, if only
> I could define them. (I could do this in C .. but then,
> I could write the below routines in C too)
>
Just redefine the above-mentioned operations in caml, taking the overflow into
account. It should not be too difficult, although it may be a bit less
efficient than the normal +, -, ... (I am not sure about that, though; maybe you
could just ignore the overflow and use the normal functions). At least for + and -
this should work without problems.

To test them, write a function that prints an int as an unsigned int, and use
#install_printer to make it the default printer for ints.
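
A minimal sketch of what such redefinitions might look like, assuming the
"unsigned" value is simply the native caml int read as an unsigned bit
pattern (31 bits on a 32-bit machine). The names uplus, uminus, ucompare,
uless, ushift_right, string_of_uint and print_uint are made up for
illustration and are not part of any library:

```ocaml
(* Hypothetical helpers: treat a native int as an unsigned bit pattern. *)

(* +, - (and land, lor, lxor) yield the same bits whether the operands
   are read as signed or unsigned, so the normal operators suffice. *)
let uplus (a : int) (b : int) : int = a + b
let uminus (a : int) (b : int) : int = a - b

(* Ordering is where signed and unsigned differ. Flipping the top bit
   of both operands maps unsigned order onto signed order. *)
let ucompare (a : int) (b : int) : int =
  compare (a lxor min_int) (b lxor min_int)

let uless (a : int) (b : int) : bool = ucompare a b < 0

(* Right shifts must be logical (lsr), not arithmetic (asr). *)
let ushift_right (a : int) (n : int) : int = a lsr n

(* Printf's %u conversion formats an int as unsigned decimal. *)
let string_of_uint (n : int) : string = Printf.sprintf "%u" n

(* Printer with the type expected by the toplevel's #install_printer. *)
let print_uint (fmt : Format.formatter) (n : int) : unit =
  Format.pp_print_string fmt (string_of_uint n)
```

In the toplevel, `#install_printer print_uint;;` should then make ints
display as unsigned. Division, modulo and int/float conversion are not
covered by this sketch and would need more care.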
Friendly
Sven LUTHER