Version française
Home     About     Download     Resources     Contact us    
Browse thread
OCaml efficiency/optimization?
[ Home ] [ Index: by date | by threads ]
[ Search: ]

[ Message by date: previous | next ] [ Message in thread: previous | next ] [ Thread: previous | next ]
Date: -- (:)
From: Brian Hurt <bhurt@s...>
Subject: Re: [Caml-list] Re: OCaml efficiency/optimization?


On Fri, 28 Oct 2005, Tato Thetza wrote:

> sorry I meant
> Ocaml for Experienced Programmers:
> http://www.bogonomicon.com/bblog/book/html/book1.html
>
> On Fri, 28 Oct 2005 03:21:43 -0700, "Tato Thetza" <thetza@sent.com>
> said:
>> I've been reading over
>> http://caml.inria.fr/pub/docs/manual-ocaml/index.html and have learned
>> two things:
>> -lists are immutable and singly linked, which explains why 1::[2;3] is
>> valid while [2,3]::1 is not, and why its efficient.
>> -the proper way to ensure tail-recursive optimization
>>
>> question: are these and other optimizations documented somewhere
>> officially? I find it a little uncomfortable I've been learning OCaml
>> without knowning such internal details. Any secrets I should definitely
>> know if I were to use this language in production?

Answering the original question (somewhat belatedly), I present Brian's 
steps to optimizing code:

1) Get it to work.  I don't care how fast your program returns the wrong 
answer, get it returning the right answer in all cases before doing 
anything else.  This also includes making the code maintainable- if you 
can't figure out how the code works, you won't be able to figure out how 
to make the code work faster.  And in general, optimizers work better on 
clean, maintainable code- generally maintainable code is also faster code.

2) Measure.  Is the code fast enough?  If so, you can stop now.

3) Optimize.  This includes compiling to native, and using ocamldefun, 
the Ocaml Defunctorizor.  This is stuff that is "easy" to do 
(comparatively), and can have dramatic effects on performance.

4) Buy a faster computer.  No, really- what's the programmer time worth? 
>From here on out it'll start being really easy to rack of tens, if not 
hundreds, of thousands of dollars of programmer time improving the code. 
At which point simply buying a faster machine might be signifigantly 
cheaper.  This isn't always the case, but generally it is- you definately 
shouldn't assume it isn't the case.

5) Profile.  If the code isn't fast enough, what's taking the time? 
Making a program 10x faster can be completely worthless- if the program 
takes 100ms to fire off a database query that takes 15 minutes to 
complete, making that program only take 10ms to fire off the same database 
query isn't going to help.  Also, optimizing initilization loops is 
generally not usefull.  Where is the time going?

6) Look for large-scale optimizations.  Are you doing things you don't 
need to do?  No operation completes faster than the operation you don't 
do.  What is the big-O notation of your core algorithms?  Replacing an 
O(n^2) algorithm with an O(n log n) algorithm can yield huge performance 
gains (10,000x or more).  This is where maintainable code comes in real 
handy.  This is also the point where I consider adding imperitive code to 
my program, replacing O(log N) tree lookups with O(1) array lookups.  Note 
that this is an *optimization*- applicative datastructures are more 
maintainable (IMHO), and thus should be the default approach taken until 
it is proven that performance requirements demand imperitive data 
structures.  But I'd definately go through Okasaki's book before making 
that decision.

One comment here: implicit in this step is you being conversant with the 
current literature- which generally means doing new searches on a regular 
basis.  You might think "Hey, I've been using red-black trees for decades, 
and it's not like they're exactly new technology.  Why do a literature 
search on them?"  But you'd miss this paper, which was only published six 
years ago:
http://citeseer.ist.psu.edu/hinze99constructing.html

7) Measure again.  Actually, I take that back- you should be measuring 
constantly (to show that you are, in fact, making things go faster), and 
profiling regularly, from here on out.

8) Now you're into the tweak stage.  Generally (not always, but 
generally), using recursive functions and arguments, if it's less than 
five arguments, will be faster than loops and references (the arguments 
will just live in registers, while the reference assignments have to get 
written out to memory).  Replace tuples of all floats with structures of 
all floats, and unbox the floats.

9) If it still isn't fast enough, rewrite core routines in C, or even 
assembly language.  But these are Hail Mary passes of optimization- if 
they aren't enough, you're screwed.

Brian