
OCamlJIT2 vs. OCamlJIT
Date: 2010-12-05 (21:21)
From: Benedikt Meurer <benedikt.meurer@g...>
Subject: Re: ocamlopt LLVM support (Was: [Caml-list] OCamlJIT2 vs. OCamlJIT)

On Dec 5, 2010, at 21:12 , Jon Harrop wrote:

> Benedikt wrote:
>> On Dec 3, 2010, at 21:07 , Jon Harrop wrote:
>>> Benedikt wrote:
>>>> So, let's take that LLVM thing to a somewhat more serious level
>> (which
>>>> means no more arguing with toy projects that simply ignore the
>>>> difficult stuff).
>>> You can learn a lot from prototypes.
>> Indeed, but you need to stay on topic. HLVM does not prototype the use of
>> LLVM within OCaml, because you're using a quite different data
>> representation.
> My point was that HLVM's data representation is far more flexible than
> ocamlopt's, so you can use it to measure the effect of changing data
> representations on the efficiency of the code LLVM generates more easily
> than with ocamlopt.
> In particular, I am saying that (from my own measurements) LLVM does not
> cope with data representations like ocamlopt's at all well. Specifically,
> when you box and cast away type information. My guess is that retargeting
> ocamlopt to LLVM IR (which is a considerable undertaking due to the amount
> of hacking required on LLVM itself) would result in more performance
> degradation than improvement. Moreover, that degradation is most likely to
> occur on code like Coq because ocamlopt has been so heavily optimized for
> that one application domain and, consequently, I doubt your hard work would
> ever be accepted into mainstream OCaml.
> Ultimately, LLVM was built specifically to exploit a typed intermediate
> representation whereas ocamlopt removes type information very early.

From what I've seen so far, LLVM makes little use of this type information; in particular, integer and pointer types are treated almost identically, so LLVM would gain little from it. Any other data is still distinguished at the Cmm level (i.e. doubles, strings, ...), which is necessary for the register allocation and instruction selection phases in ocamlopt. One could easily extend the intermediate representations in OCaml (lambda, ulambda, cmm) to carry additional type information, so that is not the problem. But as said, it would not help much with LLVM.

Looking at the GHC work you mentioned: they also start at the Cmm level (a slightly different C--, but comparable to ocamlopt's), with roughly the same amount of type information available. And as you said, they did quite well in some cases.

>> Using a different data representation within OCaml would indeed be
>> possible, but one has to ask what it's worth? You'd get better floating
>> point performance (most likely)
> And faster tuples, ints, chars, complex numbers, low-dimensional
> vectors/matrices, hash tables and so on. More types (e.g. int16 and
> float32). Even arbitrary-precision arithmetic can benefit by keeping small
> numbers unboxed when possible. Bigarrays disappear. The FFI gets simpler,
> easier to use and more powerful (e.g. you can generate AoS layouts for
> OpenGL). The benefits are enormous in the general case but that is beside
> the point here. Moreover, this would never be accepted either because it
> would degrade the performance of applications like Coq. If it is done at
> all, such work must be kept separate from the OCaml compilers.

ocamlopt already supports float32 and int16, though they are not exposed to the upper layers (I don't know why, but I also don't know why one would need them). Keeping Int32/Int64 values unboxed would be possible, similar to what is done with doubles; maybe there is simply no need to do so (honestly, I've never needed Int32/Int64 so far; int/Nativeint is usually what you want, and these are optimized).

>> 3. The current garbage collector is mostly straight-forward because of
>> the data representation. No need to carry around/load type information,
> Note that conveying run-time type information also facilitates useful
> features like generic printing.

Generic printing is currently implemented using the run-time type information (the three bits mentioned below).

>> you just need the following bits of information: Is it a block in the
>> heap? Does it contain pointers? If so how many? All this information is
>> immediately available with the current data representation (also
>> improves cache locality of the GC loops).
> That information is also immediately available with HLVM's design. Each
> value contains a pointer to its run-time type. Each run-time type contains a
> pointer to a "visit" function for the GC that returns the references in any
> value of that type.

This is indeed possible, yes, and it is also what most Java VMs implement: each object ships a type-info pointer. OCaml avoids the type-info pointer by using a clever header-field encoding. A matter of taste, probably.

>> So, it is really worth to spend years on a new data representation
>> (including fixing all C bindings, etc.)? Just to get better floating
>> point performance? Integer performance will be almost the same,
>> avoiding the shift/tag bit just replaces an "addq r1, r2; subq $1, r2"
>> with "addq r1, r2"; even doing this thousands of times will not cause a
>> noticeable difference.
> On the contrary, there are many examples of huge performance gains other than
> floating point. The int-based random number generator from the SciMark2
> benchmark is 6x faster with LLVM than with OCaml. Generic hash tables are
> 17x faster in F# than Java because of value types. And so on.

I doubt that this is really related to the use of "value types", and I also doubt that the integer performance improvements are related to the missing "subq" (that must have been an unusual CPU, then). The Java performance is probably related to the GC in most JVMs, which is not as well optimized for high-speed allocation and compaction as the GCs found in the runtimes of most functional languages. But this is just guessing; we'd need some facts to be sure of the cause. I suggest you present some proof-of-concept code in C or LLVM, simply using malloc() for memory allocation, comparing a straightforward implementation to a "value type" implementation (with everything else the same). If there's still a 17x improvement, I promise to rewrite the whole OCaml system during the next three months to support value types.

> Cheers,
> Jon.