Version française
Home     About     Download     Resources     Contact us    
Browse thread
Comparison of OCaml and MLton for numerics
[ Home ] [ Index: by date | by threads ]
[ Search: ]

[ Message by date: previous | next ] [ Message in thread: previous | next ] [ Thread: previous | next ]
Date: -- (:)
From: Mike Furr <furr@c...>
Subject: Re: [Caml-list] Comparison of OCaml and MLton for numerics
   For now,  I just want to get Jon Harrop wrote:
> On Monday 04 June 2007 15:03:45 Mike Furr wrote:
>> My OCaml summer project to build an improved data structure library
> 
> This is an absolutely incredible project and I am eagerly awaiting your 
> results. As I am sure you already know, Jean-Christophe Filliatre, Chris 
> Okasaki and Markus Mottl have done fantastic work in this area already. I 
> would be very happy to see this collated. Diego Fernandez also did some good 
> work with Baire but it seems to have evaporated.

Yes, their work is a major influence of the project.

> I'll gladly submit my stdlib2 if you're interested. Its main benefit is 
> autogenerated functions (like iter, map, fold) that act upon tuples of 
> different arities:

Great, I would love to see what you've done!  Also, if you interested, 
you can subscribe to the mailing list for the project (no posts yet as 
I'm just getting started, but perhaps we should move our conversation 
there... CC'd).

http://groups.google.com/group/osp2007-ocaml-reins

> You might also consider lazy rebalancing, as rebalancing is both expensive and 
> often unnecessary.

Initially, I plan on supporting lazy rebalancing via the use of zippers. 
   This will also allow splay-like find semantics for arbitrary tree 
structures (although not quite as efficient as a native splay tree which 
of course will also be available).

> A functional array implemented as a balanced binary tree needs the number of 
> elements (size) of a sub tree just to index into the tree. So that read-only 
> operation must indirect unnecessarily using your approach. However, the 
> indirection is a much smaller overhead compared to the cost of functors or 
> polymorphism when the comparison functions are trivial (which they normally 
> are) using the current approach.

Indeed, this example may require a custom implementation to be 
efficient.  However, I have lots to do for the summer and will likely 
leave such performance tweaking until the end of the project.

> My feeling is that the use of functors in Set and Map are more of a hindrance 
> than a benefit. The ubiquity of Hashtbl is testament to this.

I'm not exactly sure what you mean by this.  Are you suggesting the 
performance is a hindrance or that the need for a separate 'module M = 
SMap.Make(...)' line is annoying?  One goal of my library will be that 
it will allow you to change data structures very easily.  For instance 
you might write the following while initially developing your code

   module MyMap = Poly.Map

This map would use the standard polymorphic compare function with keys 
of type 'a and values of type 'b (for some a,b).  Later, once your code 
has matured, you might know your keys are always of type (int*int) list.
Then you would simply change the module def to:

  module MyMap = Mono.AVLMap(List(Pair(Int,Int)))

and now you have a specific data structure which takes advantage of the 
faster monomorphic compare function.  The library will include modules 
for all of the base types (int, float, etc...) as well as hoisting 
common type constructors into functors (such as List() above). 
Therefore, you don't have to write any annoying boilerplate code to use 
the functors.

As far as performance goes, I plan on using functors where it makes the 
code most elegant and easy to use.  With all of the mention of 
ocamldefun on this mailing list, I imagine it won't be too long before 
we have version for ocaml 3.10 (or else I may write it myself!)

>> However, this has the huge benefit of allowing me to only write *one*
>> version of find, min/max, fold, iter, etc. for all of the trees I
>> implement, which in and of itself is a big win. :)
> 
> Perhaps we should consider the alternative approach where those functions are 
> higher-order functions over the necessary low-level tree functions. This 
> would only need support for inlining to recover optimal performance and it 
> also lets you write the core functions only once.

Perhaps.  This would require passing around a dictionary of functions 
and I'm not sure the resulting code would be as clean or any faster than 
functors.  Another way would be to build them on top of the libraries 
zipper-based iterators (but this will likely be less efficient).  I plan 
on doing this anyway to support arbitrary traversal strategies.  Again, 
perhaps at the end of the summer I can look into the performance 
characteristics of these ideas.

Cheers,
-Mike