Version française
Home     About     Download     Resources     Contact us    
Browse thread
Looking for information regarding use of OCaml in scientific computing and simulation
[ Home ] [ Index: by date | by threads ]
[ Search: ]

[ Message by date: previous | next ] [ Message in thread: previous | next ] [ Thread: previous | next ]
Date: -- (:)
From: Jon Harrop <jon@f...>
Subject: Re: [Caml-list] Looking for information regarding use of OCaml in scientific computing and simulation
On Wednesday 25 November 2009 11:05:14 David MENTRE wrote:
>  - Code snippets of OCaml used in scientific computing or simulation,
> typically for advocacy like "it takes 10 lines in OCaml to do this,
> you would use 50 lines in C++ to do the same thing";

You may find the examples from OCaml for Scientists enlightening:

  http://www.ffconsultancy.com/products/ocaml_for_scientists/complete/

For example, the "n"th-nearest neighbours example expresses a simple 
recurrence relation for finding neighbour shells in networks (bonded atoms in 
simulated non-crystalline materials in my case). The idiomatic OCaml 
implementation is a dozen lines of code or so and more than fast enough. The 
obvious C++ implementation is over a hundred lines of code and a hundred 
times slower. You can optimize the C++ by resorting to mutable data 
structures and working out when it is safe to overwrite them but it is 
tedious and error-prone to beat OCaml's performance from C++.

>  - Evidence of *actual use* of OCaml for scientific computing or
> simulation, especially regarding usable libraries, bindings, etc.

We had hundreds of customers using OCaml for scientific computing, mostly in 
the biological sector but also physics, chemistry and engineering.

>  - Evidence of people having switched from C/C++ simulators to OCaml
> ones : good and bad points, issues, things to look at, etc.

If you look back at the archives of the mailing list, lots of people who've 
used OCaml for technical computing came from a C++ background, myself 
included.

OCaml has lots of huge advantages over C++:

1. Interactive top-level makes it easy to test code and perform simple 
computations.

2. Safety: incorrect OCaml code breaks in nice (usually deterministic) ways.

3. Better static checking: OCaml catches far more bugs at compile time than 
C++ does, making it much faster to develop in and easier to maintain OCaml 
code bases. Static checking is so good in OCaml that you rarely need a 
debugger. Once you've written your own HM type inferencer (only ~100 LOC) you 
can understand OCaml's error messages really easily whereas template-related 
C++ errors are a nightmare.

4. Functional programming: much easier to express functional solutions (e.g. 
combinators for integration and differentiation) and factor code elegantly 
and concisely.

5. Exceptions fast enough for control flow: OCaml's exceptions are several 
times faster than C++'s.

6. First-class array and list literals.

7. Algebraic data types with pattern matching: makes code for manipulating 
trees simple and efficient and, consequently, trees are far more common in 
OCaml than in C++.

8. Parametric polymorphism: a constrained form of templates that gives almost 
all of the useful expressiveness but without the obfuscated error messages.

9. Garbage collection: efficient automatic collection of unreachable values 
removes a class of bugs without significant performance degradation. OCaml's 
GC is nice and incremental with good pause times for visualization whereas 
C++ relies heavily upon scope-based destruction that results in avalanches, 
stalling for arbitrarily long pauses. The first OCaml project I did was a 
port of a visualization code base from C++ to OCaml and, IIRC, the OCaml was 
4x less code and with 5x shorter pause times.

10. Fast compilation: OCaml compiles orders of magnitude faster than C++. In 
fact, Google's new Go programming language is specifically designed for very 
fast compilation at a significant cost in terms of run-time performance but 
my preliminary tests indicate that OCaml both compiles and runs faster than 
Go.

11. I found OCaml far easier to learn than C++.

The main disadvantages of OCaml are:

1. No longer competitively performant on today's computers, for several 
reasons:

a) OCaml's GC prevents threads from running in parallel, making it much harder 
or impossible to obtain speedups from multicores when other languages make it 
easy to obtain near-linear speedups for many problems on today's widest 
desktops.

b) OCaml's data representation makes numerics and abstractions inefficient 
because it introduces a huge amount of boxing, e.g. complex arithmetic can be 
5x slower and polymorphism can be 100x slower compared to F#.

c) OCaml's x86 code gen has been overtaken by others (e.g. the JVM).

2. OCaml's Foreign Function Interface (FFI) is comparatively cumbersome, 
error-prone and inefficient. This makes OCaml largely uninteroperable: 
compare the development of Qt and OpenGL 2/3 bindings in OCaml to other 
languages, for example.

3. Lack of operator overloading can make numerics tedious.

4. Printing, hashing, equality, comparison and marshalling functions should be 
associated with each type and automatically used by the compiler. OCaml's 
module-based solution is tedious, error prone and inefficient.

The OCaml community has shrunk significantly in recent years (traffic here 
fell 28% from 2007-2008 and another 22% from 2008-2009, sales of OFS fell 30% 
from 2007-2008 and another 50% from 2008-2009). I believe this is because the 
OCaml community was ~80% technical users in 2006 and most of them have since 
left for languages that address the disadvantages I just listed, primarily 
parallelism. The main problem as I see it is that all of the other solutions 
for technical users outside Windows (i.e. except F#) introduce serious 
problems of their own (e.g. no TCO in Clojure and Scala, poor inference in 
Scala and Haskell, unpredictable time and space in Haskell, conservative GC 
on Mono and ATS).

The good news is that I've been working on a project (HLVM) specifically 
designed to address these problems, with high performance parallel numerics 
as well as having C-like data representation with an easy-to-use FFI and many 
other advantages including an optimizing native-code REPL, generic printing 
and serialization. This is just a hobby but I'm amazed at how easily I've 
been able to progress thanks to the awesome OCaml+LLVM combo.

>  - My colleagues are working a lot with Mathlab, is there any synergy
> there (bindings, ways to integrate Mathlab within OCaml code or vice
> versa, ...)?
>
> You can reply to me on this list or off list. In case of personal
> reply, let me know if I can reuse your name and affiliation.

Please do. :-)

> If this presentation is done, I'll make the slides available under a
> free license.

Great, thanks.

-- 
Dr Jon Harrop, Flying Frog Consultancy Ltd.
http://www.ffconsultancy.com/?e