Re: speed versus C

From: Gerd Stolpmann (Gerd.Stolpmann@darmstadt.netsurf.de)
Date: Thu Oct 07 1999 - 21:21:21 MET DST

Next message: Gerd Stolpmann: "Re: speed versus C"
Previous message: Peter Schrammel: "Problem of coercion in recursive class definitions"
Maybe in reply to: Jan Brosius: "speed versus C"
Next in thread: William Chesters: "Re: speed versus C"
Reply: William Chesters: "Re: speed versus C"
Reply: Pierre Weis: "Re: speed versus C"
Messages sorted by: [ date ] [ thread ] [ subject ] [ author ]

From: Gerd Stolpmann <Gerd.Stolpmann@darmstadt.netsurf.de>
To: William Chesters <williamc@dai.ed.ac.uk>
Subject: Re: speed versus C
Date: Thu, 7 Oct 1999 21:21:21 +0200
Message-Id: <99100800064700.23684@ice>

On Thu, 07 Oct 1999, William Chesters wrote:
>Gerd Stolpmann writes:
> > The basis of your argumentation is that the array solution works
> > better with current hardware and typical problem sizes. For
> > example, memcpy is very fast because it is specially supported by
> > the hardware, much faster than initializing the same number of
> > elements of a list. This is absolutely true, but I think that it
> > only demonstrates that my example was too simple.
>
>... whereas I believe that it is merely part of a much more general
>point. You almost sound like you think the efficiency of arrays
>should be discounted because it only applies "with current hardware
>and typical problem sizes". It's dangerous to get so carried away by
>abstraction that you consider exploiting well-known facts about the
>way the computer really is to be cheating.

No, I do not mean it this way. I only want to point out that there is a
relation between the efficiency of arrays in C and the architecture of
hardware. I do not see any hardware with a different kind of memory that is not
organized in rows and columns. -- C is closer to the structure of the hardware
than Caml; for example you have addresses and you can even calculate with them.
Because of this it is simpler to exploit this specific structure, and get
faster running code.

> ocaml is a great language precisely because it doesn't disdain to
>get its hands dirty---it knows very well what you have to do to solve
>real problems on real computers. For me, the kind of elegance and
>beauty you want in a language comes not from constructing castles in
>the air, but from using abstract ideas to understand the real world
>better. ocaml says "look, this is what you really mean when you write
>machine code".

I agree only partly. Most of the really "dirty" work is hidden, the programmer
does not see it. I like this, and I must praise the Caml developers that they
refuse to break holes into the abstraction (e.g. by allowing uninitialized
arrays). The point where I disagree is that I do not want to write machine
code, and that I do not recognize where Caml explains the semantics of machine
code. For example, I cannot even imagine an assembler program that uses
closures (paraphrased by machine instructions); there is always a much simpler
way to get the same effect. (On the other hand, closures are no castles in
the air, but very useful.) People want to solve real problems on real computers,
but I would stress that they want to solve PROBLEMS. It is often not so
important how much the hardware costs because software is much more expensive;
I'm currently woking for a project with much Perl and Java code, and they
simply bought a server with 12 CPUs and 4 Gig RAM. I do not want to defend
this. I like Caml because it does not waste resources, and because it shows how
cheap abstraction can be.

> Incidentally the example you give of C strings is another case
>where on reflection you find that the "unsophisticated" way is not so
>bad ...

Because it is simple to handle, and not so error-prone. But it is not the
fastest way to manage strings, and it is simple to program really slow
algorithms which are a linear factor slower than necessary. It depends on the
application which way is better. I only want to point out that the language
means which can be managed conveniently in C do not automatically lead to
efficient code, and that Caml has a chance to compete with C.

>
> > My first example looks now far better, because you have forgotten
> > to count the function calls to hide the array implementation. In
> > Caml, they are not necessary.
>
>Er, what function calls? In most Cs I can explicitly mark them
>"inline" (which I can't in ocaml (hint!!!)). Let's try and bear in
>mind that good C can be surprisingly readable.

Inlining can have a contradictory effect because the cache efficiency decreases.
It is difficult for compilers to decide whether better to inline or not. This
means that even if you inline functions there is some additional time you did
not mention.

I have done some benchmarks in the meantime: A C program collecting ten
millions integers in a growing array versus a Caml program collecting the same
amount of integers in a list. The results:

- The unoptimized C program: 1.1 sec
- The optimized C program: 0.5 sec
- The Caml program with standard GC settings: 10.3 sec
- The Caml program with best GC settings (i.e. GC never happens): 0.28 sec

This means that memory management has far more impact on the result than any
other detail discussed so far. And it is very difficult to compare memory
management of a typical libc and Caml's GC because it depends on the whole
application which effects turn out to be important and which effects are
negligible. In my benchmark the C program had an advantage because the simpler
memory management matches better the simple situation; in a complex real world
situation with fragmented memory the C program would be much slower.

> > Caml is simply not good at arrays.
>
>No, it's fine actually, that's one of the reasons why it's so great.

See the discussion about uninitialized arrays. I think Caml's arrays are ok,
but it is a bad idea to use them in the same universal way as in C. They are
rather slow when you try to implement growing buffers with them; often a
mixture of techniques (e.g. lists of arrays) is best. I like Caml because I can
combine functional and imperative structures.

> > I think it is not a good idea to adopt too much imperative style.
>
>Equally, it's not a good idea to adopt too much ref-trans style, and
>ocaml, thank heavens, supports both very well. Imperative programming
>is either better from an efficiency point of view, or stylistically
>closer to the best way of thinking about the problem, or both.

I agree, and I often program in an imperative way because of the reasons you
mention. But in Caml imperative programming should not be preferred because
the compiled code is better, but because the algorithm demands it, i.e. there
are many algorithms which work magnitudes better when programmed imperatively.
Think of the thousands of books explaining array algorithms; there are often no
functional counterparts that can compete.

I think that one of the most important advantages of Caml is that there is no
reason to avoid functional programming because of bad translation to machine
code. For many tasks, both styles are equally well; sometimes functional
style is more readable, sometimes imperative style. In some cases you need
imperative style because of theoretical complexity, and (more often) because of
better control of side-effects.

Gerd

----------------------------------------------------------------------------

The Caml program:

let rec collect l n =
  if n < 10000000 then
     collect (n::l) (n+1)
  else
     l
;;

collect [] 0;;

----------------------------------------------------------------------------

The C program:

#define INIT_SIZE 1

#include <stdlib.h>

typedef struct collection {
    int n_used;
    int n_size;
    int *a;
} collection;

void add(collection *c, int x) {
    int used, size;
    int *b;
    if ((used = c->n_used) >= c->n_size) {
        size = c->n_size;
        b = (int *) malloc((size << 1)*sizeof(int));
        if (b == NULL) exit(1);
        memcpy(b, c->a, used*sizeof(int));
        free(c->a);
        c->a = b;
        c->n_size <<= 1;
    };
    c->a[used] = x;
    c->n_used++;
}

collection *make(void) {
    collection *c;
    c = malloc(sizeof(struct collection));
    if (c == NULL) exit(1);
    c->a = malloc(INIT_SIZE * sizeof(int));
    if (c->a == NULL) exit(1);
    c->n_size = INIT_SIZE;
    c->n_used = 0;
    return c;
}

int main () {
    int i;
    collection *c;
    c = make();
    for (i=0; i<10000000; i++)
        add(c,i);
    return 0;
}

--
----------------------------------------------------------------------------
Gerd Stolpmann      Telefon: +49 6151 997705 (privat)
Viktoriastr. 100             
64293 Darmstadt     EMail:   Gerd.Stolpmann@darmstadt.netsurf.de (privat)
Germany                     
----------------------------------------------------------------------------

Next message: Gerd Stolpmann: "Re: speed versus C"
Previous message: Peter Schrammel: "Problem of coercion in recursive class definitions"
Maybe in reply to: Jan Brosius: "speed versus C"
Next in thread: William Chesters: "Re: speed versus C"
Reply: William Chesters: "Re: speed versus C"
Reply: Pierre Weis: "Re: speed versus C"
Messages sorted by: [ date ] [ thread ] [ subject ] [ author ]

This archive was generated by hypermail 2b29 : Sun Jan 02 2000 - 11:58:26 MET