Version française
Home     About     Download     Resources     Contact us    

This site is updated infrequently. For up-to-date information, please visit the new OCaml website at

Browse thread
[Caml-list] Does Caml have slow arithmetics ?
[ Home ] [ Index: by date | by threads ]
[ Search: ]

[ Message by date: previous | next ] [ Message in thread: previous | next ] [ Thread: previous | next ]
Date: -- (:)
From: Basile Starynkevitch [local] <basile.starynkevitch@i...>
Subject: Re: [Caml-list] Does Caml have slow arithmetics ?
On Wed, Jul 07, 2004 at 02:32:56PM +0200, Diego Olivier Fernandez Pons wrote:

(I, Basile Starynkevitch, replied:)

> > In principle yes, because int values are represented as tagged
> > (31bits) ints (with the LSB set to 1). So conversion is a shift
> > followed by an addition (or viceversa).
> Subsidiary question : why LSB instead of MSB ? I thought it would be
> simpler to let the computations silently overflow and correct when
> necessary the tag bit.

I cannot answer for sure, since I did not made the choice, but I see
some reasons for it.

values which are nontagged integers are pointers to blocks. For
obvious reasons, you want to access the block's field as quickly as
possible, so the pointer is in fact a machine pointer to the block
(with perhaps a constant offset).

using the MSB as a tag bit would require that every Ocaml value is in
some restricted address range (like e g [0 ... 2^30] on 32 bits
machine). And having the heap always inside this range is not doable
without tricky system hacks (on Linux, mmap MAP_NORESERVE very early
in the Ocaml startup)

More simply, Ocaml's GC (IIRC) allocates memory (in big chunks) with
malloc, not mmap. Hence, it has no idea of what the memory address is,
and what is its MSB.

All above is an educated guess (from reading the code and some
discussions with Xavier L. & Damien D.) - I may be grossly wrong.

I did read many GC related papers, and all the tagging schemes I
remember having read about (for some suitably short integers) uses the
LSB, not the MSB. So, it is probably a well established habit.

If you want to start a functional language implementation from
scratch, you might (as Xavier Leroy told me) consider tagging integers
with a 0 LSB, and having GC-ed pointers with a 1 LSB. Hence, ordinary
C pointers (outside your heap) are handled as integers and remain
untouched by the GC. This insight is Xavier's, not mine!

(Old dinosaurs like me also remember the tagged-add machine
instruction on the Sparc, which had the tags of int in the two lowest
bits of the word).

This is the first time I'm hearing about tagging integer in the MSB,
and I have no more convincing arguments - maybe it is just a habit
(and maybe 64 bits processors might change it - since they won't fill
their address space for a long time).

Basile STARYNKEVITCH -- basile dot starynkevitch at inria dot fr
Project - phone +33 1 3963 5197 - mobile 6 8501 2359 --- all opinions are only mine 

To unsubscribe, mail Archives:
Bug reports: FAQ:
Beginner's list: