about OcamIL
Date: 2010-05-20 (02:03)
From: Jon Harrop <jonathandeanharrop@g...>
Subject: RE: [Caml-list] about OcamIL
Xavier Clerc wrote:
> Jon Harrop <> wrote:
> > I don't think this is heated at all. We were talking about "high
> > performance" languages and you cited a bunch of languages that get
> > whipped by Python on this benchmark:
> >
> >
> >
> > hash-table.html
> Acknowledged.
> "Whipped" is here 2 times slower on that particular benchmark,
> while Python is rarely within an order of magnitude of OCaml code
> (cf. the language shootout).
> Moreover, hashtables are ubiquitous in Python (and hence probably
> particularly optimized), while they are not so common in Haskell
> or Caml.

I greatly value efficient generic collections.

> >> and references to benchmarks that back up your claims in this
> >> thread.
> >
> >
> Point taken.
> Just notice that the 17x factor is observed on the micro-benchmark,
> while on the larger one the two platforms seem on par.

Sure but those two benchmarks are testing completely different things.  The
shootout's knucleotide is a larger benchmark that still uses hash tables and
Java is still 4x slower because the JVM cannot express a generic hash table
efficiently. There are many such problems where the lack of value types on
the JVM is a serious impediment to performance.

> Here is a question about the micro-benchmark: do you know if F#
> monomorphizes the collection in this example?
> If it turns out to be done, one may probably argue that the problem
> is more related to the language than to the platform (just recycling
> an objection made on the page you pointed out).

I'm not sure what you mean here. Both of the programs are just using the
generic hash table implementation native to their platform. Neither language
does anything of relevance to optimize the code. .NET is a lot faster and
uses a lot less memory than the JVM for two reasons. First, keys and values
are value types, so .NET stores them unboxed in the spine of the hash table
whereas the JVM boxes them. Second, the insertion function is monomorphized
at JIT time.
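
To make the boxing point concrete, here is a minimal sketch in Java. It is
not the benchmark code from this thread. IntDoubleTable is a hypothetical,
hand-specialized int -> double table (open addressing with linear probing,
chosen only for brevity) that keeps keys and values inline in primitive
arrays. The generic HashMap<Integer, Double> next to it has to allocate an
Integer and a Double object per entry. Roughly speaking, the hand-written
class is what a platform with value types and JIT-time monomorphization can
derive from the generic code automatically.

```java
import java.util.HashMap;
import java.util.Map;

class BoxingDemo {
    public static void main(String[] args) {
        int n = 1_000_000;
        // Generic table: every key and value is boxed on the heap.
        Map<Integer, Double> boxed = new HashMap<>();
        // Hand-specialized table (hypothetical): keys and values stay primitive.
        IntDoubleTable unboxed = new IntDoubleTable(n);
        for (int i = 0; i < n; i++) {
            boxed.put(i, 1.0 / (i + 1));
            unboxed.put(i, 1.0 / (i + 1));
        }
        System.out.println(boxed.size() + " " + unboxed.size());
    }
}

/** Hypothetical int -> double hash table, specialized by hand. */
final class IntDoubleTable {
    private int[] keys;       // keys stored inline, no Integer objects
    private double[] values;  // values stored inline, no Double objects
    private boolean[] used;   // marks occupied slots
    private int size;

    IntDoubleTable(int capacity) {
        int n = 16;
        while (n < capacity * 2) n <<= 1;  // power of two, load factor <= 0.5
        keys = new int[n];
        values = new double[n];
        used = new boolean[n];
    }

    private static int slot(int key, int mask) {
        int h = key * 0x9E3779B9;          // multiplicative scrambling of the key
        return (h ^ (h >>> 16)) & mask;
    }

    void put(int key, double value) {
        if ((size + 1) * 2 > keys.length) grow();
        int mask = keys.length - 1;
        int i = slot(key, mask);
        while (used[i] && keys[i] != key) i = (i + 1) & mask;  // linear probe
        if (!used[i]) { used[i] = true; keys[i] = key; size++; }
        values[i] = value;
    }

    double getOrDefault(int key, double dflt) {
        int mask = keys.length - 1;
        int i = slot(key, mask);
        while (used[i]) {
            if (keys[i] == key) return values[i];
            i = (i + 1) & mask;
        }
        return dflt;
    }

    int size() { return size; }

    /** Doubles the arrays and reinserts every occupied slot. */
    private void grow() {
        int[] oldKeys = keys;
        double[] oldValues = values;
        boolean[] oldUsed = used;
        keys = new int[oldKeys.length * 2];
        values = new double[oldKeys.length * 2];
        used = new boolean[oldKeys.length * 2];
        size = 0;
        for (int j = 0; j < oldKeys.length; j++)
            if (oldUsed[j]) put(oldKeys[j], oldValues[j]);
    }
}
```

The sketch only shows the difference in memory layout. It makes no claim
about the size of the speed or memory gap, which would have to be measured.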

> > Scala on the JVM is 7x slower than C++ on this benchmark:
> >
> > a&lang2=gpp
> Agreed, but it seems that if you aggregate the results of the different
> benchmarks, Scala is on average only 1.5x from C++ (but far away in
> terms of memory consumption). The 7x factor is the worst result observed,
> the median being 2x.

Sure. This seems to be a difference in our interpretation of "high
performance". If a language or platform can be 17x slower than others then I
would not call it "high performance" even if it is competitively performant
elsewhere. Indeed, I would completely disregard averages on benchmark suites
and focus on outliers because they convey more useful information. Granted
that was a microbenchmark but the effect is severe enough to afflict many
real programs.

> >> It may just end up that we have different perceptions of "high
> >> performance", and of the trade-offs we are going to make in our
> >> language / platform choices.
> >
> > Probably. What languages do you not consider to be high
> > performance?
> I am not sure it is that easy to compare languages, but measuring
> compiler performance: any compiler that produces code that runs within,
> let's say, 5x of the fastest one around, on a bunch of wide-spectrum
> benchmarks (e.g. numerical code *plus* string matching *plus* tree
> walking, etc).
> Maybe it should also be mentioned that I am more versed in symbolic
> computations.

Then you're probably more interested in OCaml's GC vs parallelism when
performance is important.

> Regarding trade-offs, I am also inclined to favor Open Source solutions
> and higher-level languages (the trade-off being here execution time vs
> programming/debugging time).

I agree in general, of course, but I'm not sure "higher-level languages"
means much in this context. Is C# higher level than Java? Maybe, but I'm
interested in the value types and approach to generics here.

Monomorphization and unboxed tuples get you a long way towards decent
performance without a top-notch GC tuned for functional languages. That
makes it feasible to implement with a naïve multicore-capable GC, which is
precisely the current direction of HLVM.
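
As a rough illustration of what "unboxed tuples" buys, here is a small Java
sketch. All names in it are hypothetical and it has nothing to do with
HLVM's actual implementation. It stores the same (x, y) pairs twice: once as
an array of pointers to heap-allocated Pair objects, which is the boxed
layout, and once flattened into a single double[], which is the layout an
unboxed tuple gives you. The flat version allocates one object in total, so
the GC has far fewer objects to trace.

```java
/** Hypothetical comparison: an array of boxed pairs vs the same pairs stored unboxed. */
class UnboxedPairsDemo {
    /** Boxed representation: each pair is a separate heap object behind a pointer. */
    static final class Pair {
        final double x, y;
        Pair(double x, double y) { this.x = x; this.y = y; }
    }

    /** Sums x*y over an array of pointers to heap-allocated pairs. */
    static double dotBoxed(Pair[] ps) {
        double acc = 0.0;
        for (Pair p : ps) acc += p.x * p.y;  // one pointer dereference per element
        return acc;
    }

    /** Same computation over pairs flattened into one array: [x0, y0, x1, y1, ...]. */
    static double dotUnboxed(double[] flat) {
        double acc = 0.0;
        for (int i = 0; i < flat.length; i += 2) acc += flat[i] * flat[i + 1];
        return acc;
    }

    public static void main(String[] args) {
        int n = 4;
        Pair[] boxed = new Pair[n];          // n + 1 heap objects in total
        double[] flat = new double[2 * n];   // a single heap object
        for (int i = 0; i < n; i++) {
            boxed[i] = new Pair(i, i + 1);
            flat[2 * i] = i;
            flat[2 * i + 1] = i + 1;
        }
        System.out.println(dotBoxed(boxed) + " " + dotUnboxed(flat));
    }
}
```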

> PS: as an aside, I used the word "references" for academic publications
> that went through a reviewing process, not blog entries.

I see. I value reproducible results far more highly than peer-reviewed
academic papers, particularly in this context.