[Benchmark] NBody
| From: | John Carr <jfc@M...> |
| Subject: | Re: [Caml-list] NBody (one more question) |
> When I compile the C code with -O0 (with gcc -o nbody.gcc -Wall
> --fast-math nbody.c -lm), I get a time of 1.513s which is comparable
> to OCaml (1.607s). But as soon as I turn on -O options (as with gcc
> -o nbody.gcc -Wall -O1 --fast-math nbody.c -lm), the running time
> drops down to 0.871s (0.58%). Can somebody tell me what is the
> optimization that has such an effect and whether it could be applied
> to OCaml ?

gcc -O0 sets out to generate the worst possible code, and mostly succeeds.

Optimizations in gcc -O1 compared to gcc -O0 include register allocation, dead code elimination, branch straightening, common subexpression elimination, instruction combining, and instruction scheduling.

ocamlopt-generated code is between -O0 and -O1 in quality, usually much closer to -O1. The biggest missing optimization is common subexpression elimination. ocamlopt puts less effort into instruction combining than gcc. gcc -O2 adds loop optimizations which ocamlopt never does.

A functional programming style puts different demands on the optimizer. ocamlopt has some optimizations that don't make sense for C, e.g. replacing (unbox (box (value))) with value.