large hash tables
| Date: | -- (:) |
| From: | Brian Hurt <bhurt@j...> |
| Subject: | Re: [Caml-list] large hash tables |
John Caml wrote:
>The equivalent C++ program uses 874 MB of memory in total. Each of the
>1 million records is stored in a vector using 1 single-precision float
>and 1 int. Indeed, my machine is AMD64 so Ocaml int's are presumably 8
>bytes.
>
>
C ints on AMD64 are still 4 bytes; longs are 8 bytes (on Linux and other
LP64 systems). You can check this by compiling a quick program:
    #include <stdio.h>

    int main(void) {
        printf("Ints are %lu bytes long.\n", (unsigned long) sizeof(int));
        return 0;
    }
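The same kind of check can be made on the Ocaml side. This is only a sketch, and it assumes Ocaml 4.03 or later, where Sys.int_size is available. On a 64-bit system an Ocaml int occupies a full 8-byte word, but one bit is reserved as a tag, so the usable width is one bit less than the word size.

```ocaml
(* Sketch: report the machine word size and the usable width of an
   Ocaml int. Assumes Ocaml 4.03 or later for Sys.int_size. *)
let () =
  Printf.printf "Words are %d bits long; ints carry %d bits.\n"
    Sys.word_size Sys.int_size
```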
>I've rewritten my Ocaml program again, this time using Bigarray. Its
>memory usage is now the same as under C++, so that's good news.
>However, my program is quite ugly now, and it's actually more than
>twice as long as my C++ program. Any suggestions for simplifying this
>program? The way I initialize the "movieMajor" Array seems especially
>wonky, but I couldn't figure out a better way.
>
>
>
It's generally a good idea to back off and think about what problem
you're trying to solve.
Where Ocaml generally wins on memory utilization is in using immutable data
structures and sharing data instead of copying it. A lot of the decisions
Ocaml made about how to represent values suddenly make sense once you think
in terms of data sharing. And in lots of complicated "real" code, the memory
saved by sharing data instead of copying it is huge compared to the overhead
those representation choices cost.
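To make the sharing point concrete, here is a minimal sketch using plain immutable lists. The names base, a and b are made up for illustration and have nothing to do with the program under discussion. Both longer lists reuse the cells of base rather than copying them, which is safe only because nobody can mutate a list.

```ocaml
(* Sketch: structural sharing with immutable lists.
   base is built at run time so that it is an ordinary heap value. *)
let base = List.init 3 (fun i -> i + 2)   (* the list 2, 3, 4 *)

(* Each cons allocates exactly one new cell whose tail points at base. *)
let a = 1 :: base
let b = 0 :: base

let () =
  (* (==) is physical equality: the tails are the very same cells. *)
  assert (List.tl a == base);
  assert (List.tl b == base);
  Printf.printf "a and b share their tail: %b\n" (List.tl a == List.tl b)
```

With a mutable structure, the equivalent code would normally have to copy base twice to keep a and b independent of each other.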
Brian