Version française
Home     About     Download     Resources     Contact us    
Browse thread
RE: [Caml-list] Interactive technical computing
[ Home ] [ Index: by date | by threads ]
[ Search: ]

[ Message by date: previous | next ] [ Message in thread: previous | next ] [ Thread: previous | next ]
Date: -- (:)
From: Jon Harrop <jon@f...>
Subject: Re: [Caml-list] Interactive technical computing
On Friday 09 March 2007 14:13, Robert Fischer wrote:
> Performance of Ocaml's bytecode is slower than F#?  Really?

Sum 1/x for x in [1 .. 10^6]. In OCaml:

time (Array.fold_left (+.) 0.)
  (Array.init 1000000 (fun i -> 1. /. float(i+1)));;

In F#:

time (Array.fold_left (+) 0.)
  (Array.map ((/) 1.) [|1. .. 1000000.|]);;

OCaml takes 0.256s, F# takes only 0.047s => F# more than 5x faster than OCaml 
bytecode.


On numerical code, F# can be faster than native-code compiled OCaml. For 
example, a naive 2^n FFT:

let fft a =
  let n = Array.length a in 
  let j = ref 0 in
  for i = 0 to n-2 do
    if i < !j then
      (let t = a.(!j) in
       a.(!j) <- a.(i);
       a.(i) <- t);
    let m = ref (n/2) in
    while !m <= !j do
      j := !j - !m;
      m := !m/2
    done;
    j := !j + !m
  done;
  let j = ref 1 in
  let b = ref (neg c1) in
  while !j<n do
    let w = ref c1 in
    for m = 0 to !j-1 do
      let i = ref m in
      while !i < n do 
	let t = !w *@ a.(!i + !j) in
	a.(!i + !j) <- a.(!i) -@ t;
	a.(!i) <- a.(!i) +@ t;
	w := !w *@ !b;
	i := !i + !j + !j
      done
    done;
    j := !j + !j;
    b := sqrt !b
  done;
  a;;

n=1<<15
ocamlopt: 0.164s
F#: 0.134s

F# is 22% faster, probably thanks to complex numbers in the language.


Discrete wavelet transform (4-tap Daubechies, n=2^20), OCaml is 25% faster:

ocamlopt: 2.03s
F#: 2.53s

Note that this is apples and oranges because my Windows environment is still 
only 32 bit. If .NET shows the same performance improvement moving from 

> From what I understand, F# has a major performance issue resulting from the
> way the .Net VM handles allocation.  Is that old info? 

Allocation is slower in F# for two main reasons:

1. The run-time is optimised for C# code that has quite different expected 
value lifetimes (far fewer very short-lived objects compared to F#).

2. F# supports concurrency, which incurs a big performance cost in allocation 
and GC.

but the consequence of this is that very allocation-heavy code (e.g. symbolic 
rewriting) is up to 4x slower in F#. Lists are also more heavyweight in F#.

However, F# regains a lot of performance by having a much faster stdlib.

For example, creating a 2^16-element set is an allocation-intensive task. In 
OCaml:

# module Int = struct
    type t = int
    let compare = compare
  end;;
# module IntSet = Set.Make(Int);;
...
# time (Array.fold_right IntSet.add (Array.init 65536 (fun i -> i)))
    IntSet.empty;;
0.744047s

Compiled with ocamlopt I get 0.065s.

For F# I get 0.240s. That's 3.7x slower than native-code OCaml.

However, it is worth noting that the F# equivalent is much more concise, just:

  time (Array.fold_right Set.add [|0 .. 65535|])
    Set.empty;;

primarily because no functors are involved when making a set. The comparison 
function is taken from the element type.

> > I've got a killer high-performance 2D and 3D visualization library
> > written in OCaml and I'd like to sell it, but I don't want to sell the
> > source code because I value it too much. What can I do? Well, I can port
> > it to F# and sell it there. In the mean time, OCaml users are stuck with
> > GNUPlot.
>
> Do you have metrics showing that performance is better with F# than OCaml
> in these two cases?

In theory, performance should be very close because so much work is done by 
the graphics card and not the CPU. In practice, I only just figured out how 
to render static geometry optimally from DirectX, so my F# version still 
sucks.

Once I've integrated that into my purely-functional scene graph I'll let you 
know what the performance is like. I expect F# to win because I'm exploiting 
concurrency and the application is more numerical than allocation intensive.

-- 
Dr Jon D Harrop, Flying Frog Consultancy Ltd.
OCaml for Scientists
http://www.ffconsultancy.com/products/ocaml_for_scientists