compiling large file hogs RAM and takes a long time.
Date: 2007-06-08 (01:51)
From: skaller <skaller@u...>
Subject: Re: [Caml-list] Re: compiling large file hogs RAM and takes a long time.
On Fri, 2007-06-08 at 10:02 +0900, Jacques Garrigue wrote:
> > 
> > Any chance there is some quadratic code in polymorphic variant type
> > processing?!
> There is, and this is a known problem:
> I'm sorry, but I don't see any easy way out.
> At least on the basic time complexity.

You mention in the ticket there is a hard way out .. using
binary trees; hard because it would require changes everywhere
in the compiler. Is this actually enough? Seems to reduce

	O(n * n * log n)

to

	O( n * log n * log n)

which is still pretty bad.. is that right?

How about a one level digital lookup, that is, use the first
8 bits of the key to choose one of 256 binary trees?

In commercial code O() performance isn't usually that relevant.
A hybrid data structure is probably best, even if the O() performance
is worse: if a one level digital lookup reduces quadratic time
by a factor of 256, that reduces a 4 hour compilation to about 1 minute.. :)
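
A minimal sketch of the one-level digital lookup described above, not code from the compiler. It assumes integer keys (for example, hashed constructor tags) and uses the low 8 bits as the index, which is an arbitrary choice. The names `create`, `add`, `find_opt` and `bucket_of` are hypothetical.

```ocaml
(* Ordinary balanced binary tree keyed by int, from the standard library. *)
module IntMap = Map.Make (struct
  type t = int
  let compare (a : int) (b : int) = compare a b
end)

(* One level of digital lookup: an array of 256 independent trees. *)
let create () = Array.make 256 IntMap.empty

(* Assumption: index by the low 8 bits of the key; always in 0..255. *)
let bucket_of key = key land 0xff

(* Insert into the one tree selected by the key's bucket. *)
let add t key v =
  let b = bucket_of key in
  t.(b) <- IntMap.add key v t.(b)

(* Look up in the one tree selected by the key's bucket. *)
let find_opt t key = IntMap.find_opt key t.(bucket_of key)

let () =
  let t = create () in
  add t 0x1234 "A";
  add t 0x5634 "B";   (* same bucket (0x34), different key *)
  assert (find_opt t 0x1234 = Some "A");
  assert (find_opt t 0x5634 = Some "B");
  assert (find_opt t 0x9999 = None)
```

The factor of 256 holds only if the keys spread evenly and the quadratic cost is paid per tree: 256 trees of n/256 entries each cost 256 * (n/256)^2 = n^2/256 in total. On those assumptions 4 hours (240 minutes) becomes 240/256 minutes, roughly 56 seconds.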

Then 1,000-10,000 type constructors would be handled easily, although
around 100,000 you'd be in trouble again .. but I think you'd be in
trouble anyhow if you had that many!

Also .. I suspect the OP is only using polymorphic variants to 
work around the 246 constructor limitation on non-polymorphic variants.
In this case, they could just use factored non-polymorphic variants.
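
A small sketch of what factoring might look like by hand. The constructor names are hypothetical and the groups are tiny for illustration. In real use each group would hold a large block of constructors, with every level staying under the limit.

```ocaml
(* Inner groups: each is an ordinary (non-polymorphic) variant. *)
type arith = Add of int * int | Sub of int * int
type logic = And of bool * bool | Or of bool * bool

(* Outer type: one constructor per group, so the total number of
   constructors is spread over two levels. *)
type op =
  | Arith of arith
  | Logic of logic

(* Pattern matching nests naturally through both levels. *)
let describe = function
  | Arith (Add (a, b)) -> Printf.sprintf "%d + %d" a b
  | Arith (Sub (a, b)) -> Printf.sprintf "%d - %d" a b
  | Logic (And (a, b)) -> Printf.sprintf "%b && %b" a b
  | Logic (Or (a, b)) -> Printf.sprintf "%b || %b" a b

let () = print_endline (describe (Arith (Add (1, 2))))
```

Each value now carries two tags, the outer group and the inner constructor, so a match inspects two levels instead of one.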

The point being: polymorphic variants should be used when you need
polymorphism and are willing to pay the price. This is typically
in a compiler or some other system where there are only a moderate
number of constructors, but many ways to combine them.

So maybe the correct fix isn't to modify polymorphic variants ..
but to repair the 246 limitation on non-polymorphic variants
by automatically factoring them .. it is, after all, entirely
syntactic sugar. Of course there'd be a run time penalty from
double indirection .. but that's the price you pay for extremely
fast single indirected non-polymorphic variants when the number
of constructors is small.

Hmm .. and this could 'almost' be done with camlp4.. :)

John Skaller <skaller at users dot sf dot net>
Felix, successor to C++: