Class compilation shows non-linear runtime behaviour. #6542
Comments
Comment author: @chambart It seems that this creates long lists of linked types that are not compressed. The majority of the run time is taken following those links in Btype.repr.
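The role of repr and of link compression can be pictured with a miniature model (hypothetical type names; this is a sketch of the idea, not the compiler's actual Btype code): repr chases Tlink indirections to the canonical node, and without compression every call re-walks the whole chain.

```ocaml
(* Toy model of the type representation: a node is either a
   concrete type or a mutable link to another node. *)
type ty = { mutable desc : desc }
and desc = Tvar of string | Tlink of ty

(* Follow Tlink chains to the canonical node, as Btype.repr does;
   cost is proportional to the chain length on every call. *)
let rec repr t =
  match t.desc with
  | Tlink t' -> repr t'
  | _ -> t

(* Variant with path compression: every visited node is made to
   point directly at the root, so later traversals are O(1). *)
let rec repr_compress t =
  match t.desc with
  | Tlink t' ->
      let root = repr_compress t' in
      t.desc <- Tlink root;  (* shortcut the chain *)
      root
  | _ -> t

let () =
  (* Build a chain of 10 links ending in a concrete node. *)
  let root = { desc = Tvar "x" } in
  let rec build n t = if n = 0 then t else build (n - 1) { desc = Tlink t } in
  let chain = build 10 root in
  assert (repr chain == root);
  assert (repr_compress chain == root);
  (* After compression the head links directly to the root. *)
  (match chain.desc with Tlink t -> assert (t == root) | _ -> assert false);
  print_endline "ok"
```

If chains like this are long and traversed repeatedly, repr alone can dominate the run time, which is what the profile above suggests.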
Comment author: @alainfrisch I think Jacques recently added compression of Tlinks. Could this have improved this case?
Comment author: @alainfrisch Actually, it seems this is getting worse. With current trunk, ocamlc.opt compiles test.ml in 1.39s, while it took 0.4s in a version from a few months ago. Jacques: do you have any idea where the slowdown could come from?
Comment author: @alainfrisch I'm bumping the severity, since there seems to be some quite serious compile-time degradation here.
Comment author: @garrigue I tested with ocamlc and only 500 methods.
Comment author: @garrigue Actually, the tests for ocamlc.opt show that the real problem is gadt-warnings; the impact of compress-Tlink is negative but small.
ocamlc.opt: test.ml (1000 methods), test2.ml (2000 methods)
The quadratic behavior itself cannot be completely avoided, because objects are represented as linked lists of methods (same problem as for polymorphic variants). However, I admit this is pretty bad in this case.
Comment author: @garrigue I managed to track down the two causes:
The code is on github: #327 Concerning the quadratic behavior, I see no easy solution: the type of each method contains the type of the whole object, and we end up having to copy it repeatedly...
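The quadratic blow-up can be pictured with a toy model (an illustration of the counting argument, not the compiler's actual data structures): if each of the n method types textually contains the whole object type, then just materializing all n method types costs on the order of n² cells.

```ocaml
(* Names of the n methods, x0 .. x(n-1), as in the repro script. *)
let method_names n = List.init n (fun i -> Printf.sprintf "x%d" i)

(* Toy object type: each method's "type" carries the full method
   row, mirroring how each method type mentions the object type. *)
let object_type n =
  let names = method_names n in
  List.map (fun m -> (m, names)) names

(* Total number of cells needed to write out all method types. *)
let size obj =
  List.fold_left (fun acc (_, row) -> acc + List.length row) 0 obj

let () =
  assert (size (object_type 10) = 100);   (* n = 10  -> n² = 100  *)
  assert (size (object_type 20) = 400)    (* n = 20  -> n² = 400  *)
```

Doubling n quadruples the total size, which matches the roughly 4x compile-time ratio observed between 1000 and 2000 methods.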
Comment author: @alainfrisch Thanks Jacques! Pierre noticed that:
I'm not sure how to interpret your comment:
Is this because the copy is done before the compression?
Comment author: @garrigue Well, is it really the majority of the time? My point was just that copying can be costly, but in the new instance there are no Tlinks.
Comment author: @garrigue By the way, the fact that optimizing repr for the non-recursive case improves performance noticeably suggests that actually there are not many long Tlink chains.
Comment author: choeger I played around with this some weeks ago and came up with at least one regression due to the handling of method types. Since methods are basically handled like ordinary functions, there is some redundancy in the processing of the self-pattern. I cannot check my changes right now, but I recall that this modification was sound and provided some enhancements. Does the attached patch (against 4.02.3) provide any insights?
Comment author: @alainfrisch
Did you observe a regression between different versions (independently of those fixed by Jacques)?
Comment author: choeger No, at that point I focused solely on 4.02.3. I noticed, however, that some unification of the self-variable was invoked once per method in the post-processing (which is of course O(n²) and unnecessary, since self has the same type in all methods).
Comment author: @alainfrisch Ok, thanks for clarifying. Proper regressions between versions have higher priority, but it's certainly worth looking at your suggestion. Jacques is again the one competent to comment on it.
Comment author: @garrigue Concretely, which unification are you talking about? If you're convinced some unification is useless, you can always try removing it, and see whether this improves performance, and whether the testsuite goes through.
Comment author: @alainfrisch Jacques: cf the attached patch. choeger mentions that it improved performance and that it looked safe (I guess he ran the testsuite?).
Comment author: @garrigue After fixing the badly placed end_def, which caused failures in 7 tests, this patch does indeed pass all the tests and improves performance by a factor of 2, but I don't think it solves the quadratic problem: 2000 methods still take almost 4 times as long as 1000. Actually I need to correct a bit more: the begin_def/end_def pair should be completely avoided for methods.
Comment author: @alainfrisch Thanks! We know the type-checker has non-linear behaviors all over the place, and it's good to track those that can be eliminated (without degrading common cases too much); but significant constant-factor optimizations are also important.
Original bug ID: 6542
Reporter: choeger
Assigned to: @garrigue
Status: closed (set by @xavierleroy on 2017-02-16T14:18:23Z)
Resolution: fixed
Priority: normal
Severity: major
Target version: 4.03.0+dev / +beta1
Fixed in version: 4.03.0+dev / +beta1
Category: ~DO NOT USE (was: OCaml general)
Tags: compiler-time-or-space-regression
Monitored by: @jmeber
Bug description
I also asked about this on stackoverflow, but apparently there is no simple solution:
http://stackoverflow.com/questions/25639619/performance-of-class-compilation
Basically, when compiling large classes, compilation takes O(n²) time for n methods:
This indicates that the compilation of a method somehow depends on the presence of other methods. This may be due to some optimization regarding the method lookup, but I am not convinced it is necessary.
Steps to reproduce
(echo "class foo = object" ; for ((i = 0; i < 2000; i++)) do echo method x$i = $i; done ; echo end) > test.ml
ocamlc test.ml
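The bash one-liner above can be rewritten as a portable POSIX shell sketch that generates two sizes for comparison (the function name gen is just for this sketch; the timing commands are shown as comments since they require an installed ocamlc):

```shell
#!/bin/sh
# Generate a class with N methods, same shape as the repro above.
gen () {
  n=$1
  out=$2
  {
    echo "class foo = object"
    i=0
    while [ "$i" -lt "$n" ]; do
      echo "  method x$i = $i"
      i=$((i + 1))
    done
    echo "end"
  } > "$out"
}

gen 1000 test.ml
gen 2000 test2.ml
# If compilation were linear, test2.ml would take about 2x as long
# as test.ml; the quadratic behaviour shows up as roughly 4x:
#   time ocamlc test.ml
#   time ocamlc test2.ml
wc -l test.ml test2.ml
```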
File attachments