[Caml-list] Num library
From:  Pierre Weis <pierre.weis@i...> 
Subject:  Re: [Caml-list] Num library 
> On Thu, 10 Oct 2002, Pierre Weis wrote:
>
> > Once more, just one question to you: which layer of the Num library
> > had you benchmarked ?
>
> Mmmh, I did not intend to launch a new flamewar here ...
>
> Maybe you can give some information yourself about the library. If I
> want to manipulate only integers (no rational numbers), most of which
> will be small integers, is it better to use Num or Big_int ? Would Num
> benefit from being specialized to (small + big) integers (that is,
> throwing away the Ratio constructor) ? Because int values and Big_int
> values can be distinguished at runtime, one could imagine implementing
> the union "by hand" (without explicit tagging, and thus avoiding a lot
> of memory allocation and garbage collection). What would be the gain ?

The Num library is fairly sophisticated and fairly well integrated into the Caml language. We still need some lexical additions to be quite comfortable when writing numbers, but it is nevertheless a really mature and useful library.

Now, some quick information to explain why it is so difficult to benchmark unbounded-precision arithmetic packages. Our Num library has 4 arithmetic layers:

- nat: positive integers. Allocation is explicit; operations are pure side effects (the result is stored in the operands).

- big_int: signed integers. Allocation is implicit; operations are functional.

- ratio: rational numbers. Allocation is implicit and operations are functional, but normalization is a side effect. Normalization is either automatic or explicitly requested by the user. You have to know that this sophisticated normalization discipline is mandatory to obtain good performance at this layer: just changing the normalization regime of the rational arithmetic primitives can have an exponential impact on the running time (either systematic normalization or no normalization at all may exponentially win or lose, depending on the computation at hand!).

- num: the highest layer. Numbers are small integers, big integers, or rational numbers.
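As a minimal sketch at the num layer (assuming the Num module is linked in; the `fact` function itself is hypothetical), the automatic coercions look like this from the programmer's point of view:

```ocaml
open Num

(* Factorial over nums: results start as small Int values and are
   promoted to Big_int values automatically once they no longer fit
   in a native int -- the coercions are invisible to the caller. *)
let rec fact n =
  if n <= 1 then Int 1
  else mult_num (Int n) (fact (n - 1))

let () =
  (* 25! does not fit in a native int, so the result is a big number. *)
  print_endline (string_of_num (fact 25))
```

The caller never chooses between the small and big representations; the library does it per operation.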
The necessary coercions between those types are handled automatically by the library.

The num layer is the most convenient for writing programs. The nat layer is the most efficient if you just need integers. The big_int layer is generally a bit faster than the num layer, but if you want to be really efficient you need to use the nat layer. Programming at the nat layer is a bit difficult, but efficiency is (normally) what you gain: you end up with no garbage collection at all, the numbers are allocated at the beginning of the computation with the right size to contain the final result, and the operations are minimized (both in the size of the operands and in the number of operations required to obtain the final result). We used to say that nat programming was the algorithmic proof level: you must state and prove your invariants while programming; a very profitable and refreshing exercise.

Also, the Num library is optimized for computation, not for printing. The idea is that normally you compute a lot and only print the final result. If your benchmark mostly prints big numbers that are obtained with almost no computation (for instance, factorial numbers), you will find the Num library very slow. That said, we know that the printing of big numbers in the Num library is too slow and has to be revisited to go faster.

As for the suggested improvement of implementing the num type ``by hand'', some Caml compilers were once able to perform the optimization you just described automatically (and, by the way, not only for small ints but for all 3 constructors of the type). As you may suspect, the gain depends a lot on the application at hand. In some cases the speedup was impressive (I remember a matrix package that almost never used big numbers, except on ill-conditioned problems: the improvement of num arithmetic over big_int arithmetic was impressive).
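The ``by hand'' union might be sketched as follows. The `number` type and `add` function are hypothetical, and an ordinary variant is used (so, unlike the untagged scheme suggested in the question, the Big case is still tagged and allocated), but it shows the kind of overflow test that every small-integer operation must perform:

```ocaml
open Big_int

(* Hypothetical union of small and big integers, built on the Num
   library's Big_int layer. *)
type number = Small of int | Big of big_int

(* Addition with an overflow check on the fast path: for operands of
   the same sign, a sign flip in the wrapped native sum signals
   overflow, and the result is promoted to a big integer. *)
let add a b =
  match a, b with
  | Small x, Small y ->
      let s = x + y in
      if (x >= 0) = (y >= 0) && (s >= 0) <> (x >= 0)
      then Big (add_big_int (big_int_of_int x) (big_int_of_int y))
      else Small s
  | Small x, Big y | Big y, Small x ->
      Big (add_big_int (big_int_of_int x) y)
  | Big x, Big y ->
      Big (add_big_int x y)
```

Even the fast path costs a comparison chain on every addition, instead of the single machine instruction a raw int addition would need.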
However, you cannot get the same arithmetic efficiency with small integers represented as num values, since you need to check for potential overflow on each operation (meaning at least 5 to 10 instructions instead of 1). Also, this optimization does not fit very well with the pattern-matching compilation technology of the current compiler. In addition, it is not clear that it would be a big win in general (I mean outside the big-number arithmetic domain, where it clearly is one).

> General question: the library seems to have been extremely stable for
> several years. Does that mean you consider it just good as it is, or
> does it mean you don't want to continue working on it ?

Yes, the library is extremely stable, and this is an extremely desirable property: bugs are also extremely well chased. Concerning the development of the library, I once wrote a new implementation from scratch. Unfortunately, I stopped when facing the problem of fast printing, which is not so trivial. We may also consider reimplementing the Num library on top of another big-number implementation (GMP or Numerix, if only some volunteers added the assembly code for as many processors as necessary to run on all the platforms where Objective Caml is available).

Best regards,

Pierre Weis

INRIA, Projet Cristal, Pierre.Weis@inria.fr, http://pauillac.inria.fr/~weis/

--
To unsubscribe, mail caml-list-request@inria.fr
Archives: http://caml.inria.fr
Bug reports: http://caml.inria.fr/bin/caml-bugs
FAQ: http://caml.inria.fr/FAQ/
Beginner's list: http://groups.yahoo.com/group/ocaml_beginners