Version française
Home     About     Download     Resources     Contact us    
Browse thread
OCaml runtime using too much memory in 64-bit Linux
[ Home ] [ Index: by date | by threads ]
[ Search: ]

[ Message by date: previous | next ] [ Message in thread: previous | next ] [ Thread: previous | next ]
Date: -- (:)
From: Xavier Leroy <Xavier.Leroy@i...>
Subject: Re: [Caml-list] OCaml runtime using too much memory in 64-bit Linux
Hello,

Concerning this issue with large page tables on 64-bit architectures,
I opened a problem report on the bug-tracking system to help gather
more information.  I'd like to ask all members of this list that
reported the problem to kindly visit

         http://caml.inria.fr/mantis/view.php?id=4448

and add the required information as a note.  That will help
pinpointing the problem.

Some more explanation on what's going on.  The Caml run-time system
needs to keep track of which memory areas belong to the major heap,
and uses a page table for this purpose, with a dense representation
(an array of bytes).  If the major heap areas are closely spaced, this
table is very small compared with the size of the heap itself.
However, if these areas are widely spaced in the addressing space, the
table can get big.

For 32-bit platforms, this isn't much of a problem since the maximum
size of the page table is 1 megabytes.  For 64-bit platforms, the sky
is the limit, however.  So far, the only 64-bit platform where this
has been a problem in the past is Linux with glibc, where blocks
allocated by malloc() can come either from sbrk() or mmap(), two areas
that are spaced several *exa*bytes apart.  We worked around the
problem by allocating all major heap areas directly with mmap(),
obtaining closely spaced addresses.

Apparently, this trick is no longer working on some systems, but I
need to understand better which ones exactly.  (I suspect some Linux
distros that applied address randomization patches to the stock Linux
kernel.)  So, please provide feedback in the BTS.

If the problem is confirmed, there are several ways to go about it.
One is to implement the page table with a sparse data structure,
e.g. a hash table.  However, the major GC and some primitives like
polymorphic equality perform *lots* of page table lookups, so a
performance hit is to be expected.  The other is to revise OCaml's
data representations so that the GC and polymorphic primitives no
longer need to know which pointers fall in the major heap.  This seems
possible in principle, but will take quite a bit of work and break a
lot of badly written C/OCaml interface code.  You've been warned :-)

- Xavier Leroy