English version
Accueil     À propos     Téléchargement     Ressources     Contactez-nous    

Ce site est rarement mis à jour. Pour les informations les plus récentes, rendez-vous sur le nouveau site OCaml à l'adresse ocaml.org.

Browse thread
Re: [Caml-list] Width subtyping
[ Home ] [ Index: by date | by threads ]
[ Search: ]

[ Message by date: previous | next ] [ Message in thread: previous | next ] [ Thread: previous | next ]
Date: 2009-06-01 (14:21)
From: David Allsopp <dra-news@m...>
Subject: RE: [Caml-list] Width subtyping
Dario Teixeira wrote:
> Thanks -- that is also an interesting solution.  I'm guessing it will
> be faster, though it might consume more memory in cases where only one
> field is actually used.  I'll have to try it side-by-side with the
> object based solution to see how they compare in the real world with my
> problem..

So as I wouldn't immediately be told "the overhead is irrelevant", I ran a
"quick" benchmark before my last post. I compared summing 7 million
3-int-field records and 7 million 3-int-field objects using names a, b, c.
On my machine, averaged out and with enough RAM that neither paging nor the
GC get in the way, objects were nearly 3 times slower. I then tried with
10-int-field records and objects - in this case, objects were just over 4
times slower (read on). This was in bytecode - I didn't bother with native

> Which might turn out to be not that big a deal.  After all, the object
> based solution also adds some default overhead per object created,
> right?

The overhead of an object is 2 words (one contains tracking information
about the actual class of the object and the other is a unique identifier) +
(I think) 1 extra word for every field (because each int is boxed in a
1-tuple so that its tag can be recorded). Importantly, accessing fields in a
record is an O(1) operation but for objects it's O(log n) for the number of
fields because a binary search is done to locate the field with the correct
tag. ocamlopt may of course optimise this search by caching the absolute
index of the field in a "recognised" object layout; I didn't look in the
compiler. My test between 3 and 10 fields would suggest that this
optimisation does not apply in bytecode.