English version
Accueil     À propos     Téléchargement     Ressources     Contactez-nous    

Ce site est rarement mis à jour. Pour les informations les plus récentes, rendez-vous sur le nouveau site OCaml à l'adresse ocaml.org.

Browse thread
Correct way of programming a CGI script
[ Home ] [ Index: by date | by threads ]
[ Search: ]

[ Message by date: previous | next ] [ Message in thread: previous | next ] [ Thread: previous | next ]
Date: 2007-10-08 (21:37)
From: skaller <skaller@u...>
Subject: Re: [Caml-list] Correct way of programming a CGI script
On Mon, 2007-10-08 at 18:04 +0200, Gerd Stolpmann wrote:

> > I heard that OCaml is particularly slow (and probably
> > memory-inefficient) when it comes to string manipulation.
> No, this is nonsense. Of course, you can slow everything down by using
> strings in an inappropriate way, like
> let rec concat_list l =
>   match l with
>     [] -> ""
>   | s :: l' -> s ^ concat_list l'

Now Gerd, I would not call the claim nonsense. If you can't
use a data structure in a natural way, I'd say the claim indeed
has some weight.

The example above is ugly because it isn't tail recursive.
If you consider an imperative loop to concat the strings
in an array

	let s = ref "" in
	for i = 0 to Array.length a do
		s := !s ^ a.[i]

then Ocaml is likely to do this slowly. C++ on the other
hand will probably do this faster, especially if you reserve
enough storage first.

The poor performance of Ocaml in such situations is a result
of two factors. The first is the worst-possible choice for a
data representation: mutable characters and immutable length.
The mutability of characters has limited utility and prevents
easy functional optimisations, the useful mutability would
have to include both the content and the length (as in C++).

The second issue would probably make a functional string have
poor performance: Ocaml doesn't do any serious optimisations,
so it probably wouldn't recognize an optimisation opportunity

Note this is by design policy, it isn't a bug or laziness.
[I'm sure someone can quote a ref to Xavier's comments on this]

The effect is that you do have to make fairly low level choices
in Ocaml to get good performance. The good thing about this
is that the optimisation techniques are manifest in the
source code so you have control over them.

Felix does high level optimisations and sometimes a tiny
change in the code can cause orders of magnitude performance
differences, and when I notice it can take me (the author)
quite some time to track down what triggered the difference
in the generated code.

Now, back to the issue: in the Felix compiler itself, the
code generator is printing out C++ code. This is primarily
done with Ocaml string concatenation of exactly the kind which
one might call 'inappropriate'. Buffer is used too, but only
for aggregating large strings.

The reason, trivially, is that it is easier and clearer to write

	"class " ^ name ^ " {\n" ^ 
	catmap "\n" string_of_member members ^

than to use the imperative Buffer everywhere. The above gives more
clue to what the output looks like.

Despite the cost of using strings this way .. the compiler backend
code generator isn't a performance bottleneck.

John Skaller <skaller at users dot sf dot net>
Felix, successor to C++: http://felix.sf.net