[Caml-list] OCaml Speed for Block Convolutions
From: David McClain <dmcclain1@m...>
Subject: Re: [Caml-list] OCaml Speed for Block Convolutions
> I think I'm missing something.  The ocaml loops allocate memory for temps,
> but the C loops reuse storage?  Did you do a version for ocaml that's got
> the same level of optimizations as the C version (not allocating, etc.)?

I didn't write an OCaml program to solve this specific problem. Instead, I
made use of my NML (numeric modeling language) built on top of OCaml. Being
a general-purpose language, NML has to make very conservative choices to
protect the user: whenever new data are generated, they are placed into
freshly allocated arrays.
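
As a rough illustration of that allocate-on-every-result style, here is a
minimal OCaml sketch. It is not NML's actual kernel; the names vscale and
vadd are hypothetical and exist only for this example. Each operation
returns a new array and leaves its inputs untouched, so even a simple
expression creates one temporary per step.

```ocaml
(* Hypothetical sketch, not NML's actual kernel: every operation
   returns a freshly allocated array and never modifies its inputs. *)
let vscale k a =
  Array.init (Array.length a) (fun i -> k *. a.(i))

let vadd a b =
  if Array.length a <> Array.length b then
    invalid_arg "vadd: length mismatch";
  Array.init (Array.length a) (fun i -> a.(i) +. b.(i))

(* y = 2x + b allocates two arrays: one for the scaled copy of x,
   one for the final sum. The scaled copy becomes garbage at once. *)
let () =
  let x = [| 1.0; 2.0; 3.0 |] and b = [| 0.5; 0.5; 0.5 |] in
  let y = vadd (vscale 2.0 x) b in
  assert (x = [| 1.0; 2.0; 3.0 |]);   (* inputs are unchanged *)
  Array.iter (Printf.printf "%g ") y;
  print_newline ()
```

This is safe for the user, since no result can ever alias or clobber an
input, but on very large datasets the temporaries add up.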

Had I written code to specifically address this one problem it would
probably have performed better. I was quite impressed with the performance I
was getting from my NML which used this very general purpose OCaml kernel. I
almost stopped at that point. But nagging questions, eager users, and a
growing list of block convolution jobs convinced me to see what I could do
in C, while still living primarily in the NML world.

I didn't see any need to rewrite all the file I/O handling, the
unwind-protect mechanisms, etc, etc, all in C when I have perfectly good
versions of this stuff in NML thanks to OCaml. So I simply targeted the
innermost loop where I knew much of the overhead was in fresh array creation
at every intermediate step. I am doing block convolutions on GByte sized
datasets, which is far more than NML was ever intended to support well. NML
is a modeling and prototyping language.

But adding C code to NML is as easy as adding C code to any OCaml program
and recompiling the augmented NML system. Hence targeting only the inner
loops was very simple (in principle). In practice, even after nearly 30 years
of heavy C/C++ experience I still cannot write even relatively simple
programs without planting bugs in them. To OCaml's credit, I can easily write
OCaml code that executes properly the first time out. No debugging. I wish
that were the case for C as well.

So to end this long-winded explanation, had I written OCaml code
specifically for the purpose of block convolutions I would have expected a
performance around 70% of that demonstrated by C, not the factor of 4 that I
had with NML. I could have written it to use mutable arrays in that case,
reaping much of the same advantage that I got when I rewrote the inner
loops in C.
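
For illustration only, here is one way the mutable-array approach might
look in OCaml. This is a sketch under assumptions, not the code discussed
in this message: the function name convolve_into is hypothetical, and it
uses plain direct convolution, which may differ from the block convolution
scheme actually used. The point is only that the output buffer is allocated
once and overwritten in place for each block.

```ocaml
(* Hypothetical sketch: direct convolution of a signal block [x] with a
   kernel [h], accumulated into a caller-supplied buffer [out] that is
   reused across blocks instead of being reallocated each time. *)
let convolve_into out h x =
  let nh = Array.length h and nx = Array.length x in
  if Array.length out < nx + nh - 1 then
    invalid_arg "convolve_into: output buffer too small";
  (* Clear the buffer in place; no new array is created. *)
  Array.fill out 0 (Array.length out) 0.0;
  for i = 0 to nx - 1 do
    let xi = x.(i) in
    for j = 0 to nh - 1 do
      out.(i + j) <- out.(i + j) +. xi *. h.(j)
    done
  done

let () =
  let h = [| 1.0; 1.0 |] in
  let out = Array.make 4 0.0 in                 (* allocated once *)
  convolve_into out h [| 1.0; 2.0; 3.0 |];
  assert (out = [| 1.0; 3.0; 5.0; 3.0 |]);
  convolve_into out h [| 4.0; 5.0; 6.0 |];      (* same buffer, new block *)
  assert (out = [| 4.0; 9.0; 11.0; 6.0 |])
```

The trade-off is the one described above: the caller must now make sure
nothing else still needs the old contents of the buffer.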

- DM

----- Original Message -----
From: "Chris Hecker" <checker@d6.com>
To: "David McClain" <dmcclain1@mindspring.com>; <caml-list@inria.fr>
Sent: Tuesday, June 05, 2001 12:22 AM
Subject: Re: [Caml-list] OCaml Speed for Block Convolutions


>
> >I think the constant creation of new arrays to hold intermediate results
> >(i.e., immutable data) is costing too much. Many of the intermediate values
> >could be overwritten in place and save a lot of time.
> >... of course... don't ask how long it took to write all this stuff in the
> >inner loops of the convolution in C, nor how long it took to debug that
> >stuff.... And immutable data definitely helps to get the code correct. OCaml
> >still rules!
>
> I think I'm missing something.  The ocaml loops allocate memory for temps,
> but the C loops reuse storage?  Did you do a version for ocaml that's got
> the same level of optimizations as the C version (not allocating, etc.)?
>
> If the code's not too big, could you post it?
>
> Chris
>
>
