Mantis Bug Tracker

ID: 0007631
Project: OCaml
Category: compiler driver
View Status: public
Date Submitted: 2017-09-18 22:33
Last Update: 2017-09-28 23:39
Reporter: psteckler
Assigned To: gasche
Priority: normal
Severity: crash
Reproducibility: have not tried
Status: resolved
Resolution: fixed
Platform:
OS:
OS Version:
Product Version:
Target Version:
Fixed in Version: 4.06.0 +dev/beta1/beta2/rc1
Summary: 0007631: -linscan option crashes ocamlopt
Description: I have OPAM 4.06.0+fp+flambda installed (not available from the Mantis dropdown).

Running `make' with the attached code, I get a crash, with the stack trace:
--
$ make
ocamlopt -linscan -o runme b.ml a.ml
Fatal error: exception Invalid_argument("index out of bounds")
Raised by primitive operation at file "asmcomp/linscan.ml", line 111, characters 19-35
Called from file "list.ml", line 100, characters 12-15
Called from file "asmcomp/linscan.ml", line 115, characters 10-56
Called from file "asmcomp/linscan.ml", line 169, characters 4-28
Called from file "list.ml", line 100, characters 12-15
Called from file "asmcomp/asmgen.ml", line 87, characters 4-32
Called from file "utils/misc.ml", line 28, characters 20-27
Re-raised at file "utils/misc.ml", line 28, characters 50-57
--
Steps To Reproduce:
$ tar -xzf linscan.tgz
$ make
Tags: No tags attached.
Attached Files: linscan.tgz (9,930 bytes) 2017-09-18 22:33

- Relationships
child of 0007630 (resolved, gasche): FLambda particularly slow with large file consisting in top-level trivial aliases.

- Notes
(0018274)
xclerc (reporter)
2017-09-19 11:34

The problem seems to be related to frame pointers:
  - opam switch 4.06.0+trunk+fp+flambda -> failure;
  - opam switch 4.06.0+trunk+fp -> failure;
  - opam switch 4.06.0+trunk+flambda -> success.
(0018277)
xclerc (reporter)
2017-09-19 13:01

Tentative fix: https://github.com/ocaml/ocaml/pull/1355
(0018280)
gasche (administrator)
2017-09-19 16:19

Xavier Clerc's PR above is now merged into trunk and the 4.06 branch. Paul, would you mind giving -linscan another try?
(0018281)
psteckler (reporter)
2017-09-19 17:24

I cloned trunk, ran configure with "-with-frame-pointer" and "-flambda" options, then "make world" and "make opt".

With the "-linscan" flag to ocamlopt, there is no crash, and the time is about 5.5 sec, much faster than the 11+ sec I saw with OPAM 4.06.0+fp+flambda when run without "-linscan". That's all good.

But -- for trunk without the "-linscan" flag, the time blows up to 42+ sec. That seems intolerable for compiling these two small files. Of that, over 36 sec is for register allocation, as given by "-dtimings". Maybe that's a new bug to file?
(0018282)
xclerc (reporter)
2017-09-19 18:48

Sorry, just to be sure, your results are:
  - 5.5 s for trunk/flambda/fp/linscan;
  - 11 s for 4.06/flambda/fp;
  - 42 s for trunk/flambda/fp.
I am surprised by the gap between 4.06 and trunk when
not using linscan.

By the way, if my code is correct, one of the interference
graphs for b.ml has more than 20K edges. So I am not sure
whether this is a bug or you just hit a "bad case" of the
algorithm.
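
For a rough sense of scale, here is a toy sketch in Python (not the compiler's code; the interval representation and the function names are made up for illustration). It counts the interference edges among a set of live intervals, which can grow quadratically with the number of values that are live at the same time, and contrasts that with a simplified linear scan that visits each interval once. Whether b.ml actually has this shape is an assumption, not something measured here.

```python
import heapq

def interference_edges(intervals):
    """Count pairs of overlapping live intervals (edges of an interference graph).

    Each interval is an inclusive (start, end) pair. Illustrative only.
    """
    edges = 0
    for i, (s1, e1) in enumerate(intervals):
        for s2, e2 in intervals[i + 1:]:
            if s1 <= e2 and s2 <= e1:
                edges += 1
    return edges

def linear_scan(intervals, num_regs):
    """Simplified linear scan over intervals sorted by start point.

    Returns (assigned, spilled): a dict mapping interval index to register,
    and a list of spilled interval indices. Simplification: when no register
    is free, the current interval is spilled (real allocators usually pick
    a victim more carefully).
    """
    order = sorted(range(len(intervals)), key=lambda i: intervals[i][0])
    free = list(range(num_regs))   # min-heap of free register numbers
    active = []                    # min-heap of (end, index, reg)
    assigned, spilled = {}, []
    for i in order:
        start, end = intervals[i]
        # Release registers of intervals that ended before this one starts.
        while active and active[0][0] < start:
            _, _, reg = heapq.heappop(active)
            heapq.heappush(free, reg)
        if free:
            reg = heapq.heappop(free)
            assigned[i] = reg
            heapq.heappush(active, (end, i, reg))
        else:
            spilled.append(i)
    return assigned, spilled
```

With 201 mutually overlapping intervals, interference_edges already reports 20,100 edges, while linear_scan still makes a single pass over the 201 intervals.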
(0018283)
ejgallego (reporter)
2017-09-19 18:49
edited on: 2017-09-19 18:51

Paul, what is the time difference without linscan between flambda and non-flambda [with standard regalloc]?

(0018284)
psteckler (reporter)
2017-09-19 20:40

Xavier, yes, those numbers are right, and that difference surprised me, too.

Emilio: For trunk+fp (no flambda), 2.4 sec without linscan, 1.2 sec with linscan.
(0018286)
ejgallego (reporter)
2017-09-19 20:52

Ok so flambda is adding 40 additional seconds to register allocation, even when -Oclassic is used.

That doesn't look right to me; there ought to be some other problematic codepath.
(0018287)
gasche (administrator)
2017-09-19 22:09

The branches "trunk" and "4.06" are virtually identical as of today, as the release branch (4.06) was branched yesterday. It is not possible to observe:

  - 11 s for 4.06/flambda/fp;
  - 42 s for trunk/flambda/fp

unless there is a measurement error.

(It may be interesting to test 4.05 as well, as that could help catching a regression in 4.06.)
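
One way to rule out a measurement error would be to repeat each build a few times and to record which compiler binary is actually picked up from PATH. A minimal sketch in Python, assuming the build is a single command; the helper names are hypothetical and the ocamlopt command line in the comment is the one printed by make earlier in this report:

```python
import shutil
import statistics
import subprocess
import sys
import time

def resolve_tool(name):
    """Return the full path of `name` as found on PATH, or None if absent."""
    return shutil.which(name)

def time_command(argv, runs=3):
    """Run argv `runs` times and return the wall-clock durations in seconds."""
    durations = []
    for _ in range(runs):
        start = time.perf_counter()
        subprocess.run(argv, check=True,
                       stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL)
        durations.append(time.perf_counter() - start)
    return durations

def summarize(durations):
    """Reduce a list of durations to min / median / max."""
    return {
        "min": min(durations),
        "median": statistics.median(durations),
        "max": max(durations),
    }

# Example use (assumes ocamlopt and the attached sources are available):
#   print(resolve_tool("ocamlopt"))
#   print(summarize(time_command(["ocamlopt", "-o", "runme", "b.ml", "a.ml"])))
```

Running this once per switch or build tree, and comparing the printed path with the one expected, should show whether two supposedly different compilers are in fact the same binary or not.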

Two other remarks:
- I do find it plausible that flambda would add 40s to register allocation by merging declarations into locals too aggressively; again, in the past similar blowups of the graph coloring code have been observed on some unusual (human-written) code shapes.
- I'm not sure why you are systematically using the frame-pointers option for testing. It decreases performance (slightly), and (on a Linux system at least) it should not be necessary to get good debug information, as we generate dwarf/cfi information that should allow the stack frames to be reconstructed without a frame pointer.
(0018288)
ejgallego (reporter)
2017-09-19 22:26
edited on: 2017-09-19 22:26

Hi Gabriel, indeed the 11 vs 42 numbers seem suspicious.

> I do find it plausible that flambda would add 40s to register allocation by merging declarations into locals too aggressively.

Excuse my unfamiliarity with flambda, but should such merging happen even when using `-Oclassic`? [I tried more esoteric options too, same results]

(0018289)
psteckler (reporter)
2017-09-19 23:24

Just to be sure I wasn't hallucinating, I ran these again.

For OPAM 4.06.0+fp+flambda:
--
$ opam switch 4.06.0+trunk+fp+flambda
# To setup the new switch in the current shell, you need to run:
eval `opam config env`
steck@felafel ~/tmp/flambda $ eval `opam config env`
steck@felafel ~/tmp/flambda $ time make
ocamlopt -o runme b.ml a.ml

real 0m11.107s
user 0m10.656s
sys 0m0.088s
--

For trunk+flambda+fp (omitting the build, install steps):

--
$ time make
ocamlopt -o runme b.ml a.ml

real 0m42.593s
user 0m42.496s
sys 0m0.068s

--

When was the OPAM package for 4.06.0+fp+flambda created? It could be significantly older than what's in Github now.

I've been using frame pointers, because I'm told it gives more information for Linux "perf". Is that not true?
(0018290)
psteckler (reporter)
2017-09-19 23:45

I downloaded the 4.05.0 sources, configured with frame-pointers and flambda (for apples-to-apples comparison):
--
$ time make
ocamlopt -o runme b.ml a.ml

real 0m43.600s
user 0m43.200s
sys 0m0.372s
--

So -- about the same as with trunk+fp+flambda.

That suggests there's something odd about the OPAM 4.06.0+fp+flambda.
(0018291)
ejgallego (reporter)
2017-09-20 00:04

Paul, in this case you may want to pull directly from OCaml's GitHub just to be sure. Compiling OCaml is fairly easy [at least on Linux]
(0018292)
gasche (administrator)
2017-09-20 06:55

> I've been using frame pointers, because I'm told
> it gives more information for Linux "perf". Is that not true?

I think that you can get `perf` to recover stack traces from dwarf information by using

  perf record --call-graph dwarf

See

  https://ocaml.org/learn/tutorials/performance_and_profiling.html#Using-perf-on-Linux
(0018293)
xclerc (reporter)
2017-09-20 10:40

Sorry to report that I cannot reproduce the problem: I get
the same timings with trunk+fp+flambda and 4.06.0+fp+flambda,
which is expected for the reasons pointed out by Gabriel.

@psteckler: the 4.06.0+fp+flambda opam compiler description
uses "https://github.com/ocaml/ocaml/archive/trunk.tar.gz"
as its source. My understanding is that it is hence the latest
version from the "trunk" branch *when you install the switch*.
(0018297)
psteckler (reporter)
2017-09-20 17:07

Emilio: the trunk version I built was pulled from Github. The 4.05+fp+flambda was from the official source distribution.

Xavier: my OPAM 4.06.0+fp+flambda was built on 18 September, around 1600 US EDT.

I just built 4.06+fp+flambda from the current Github, and the time is about 43 sec, as expected.

So again, there's something funny about the OPAM version, and I think we should just disregard it.
(0018390)
psteckler (reporter)
2017-09-28 23:39
edited on: 2017-09-29 16:19

@gasche I just tried using --call-graph dwarf with perf, with a non-fp OCaml 4.05.0.

Yes, it seems to work, but the profile file is more than 20 times larger, and takes a very long time to load with "perf report".
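
A rough way to quantify that size difference would be to record the same program once per call-graph mode into separate files and compare their sizes. A minimal sketch in Python, assuming perf is installed and ./runme is the binary built by make; the helper names and output file names are hypothetical:

```python
import os

def perf_record_argv(mode, output, command):
    """Build a `perf record` command line for the given call-graph mode.

    mode is "fp" (frame pointers) or "dwarf"; output is the profile file
    passed to -o; command is the program to profile, as a list of arguments.
    """
    if mode not in ("fp", "dwarf"):
        raise ValueError("unsupported call-graph mode: %r" % (mode,))
    return ["perf", "record", "--call-graph", mode, "-o", output, "--"] + list(command)

def size_ratio(path_a, path_b):
    """Return size(path_a) / size(path_b) for two existing, non-empty files."""
    return os.path.getsize(path_a) / os.path.getsize(path_b)

# Example use (needs perf and a built ./runme; not executed here):
#   import subprocess
#   subprocess.run(perf_record_argv("fp", "perf.fp.data", ["./runme"]), check=True)
#   subprocess.run(perf_record_argv("dwarf", "perf.dwarf.data", ["./runme"]), check=True)
#   print(size_ratio("perf.dwarf.data", "perf.fp.data"))
```

Each file can then be opened with "perf report -i" followed by the file name. Note that the fp mode only gives useful stacks with a compiler configured for frame pointers.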


- Issue History
Date Modified Username Field Change
2017-09-18 22:33 psteckler New Issue
2017-09-18 22:33 psteckler File Added: linscan.tgz
2017-09-19 11:34 xclerc Note Added: 0018274
2017-09-19 13:01 xclerc Note Added: 0018277
2017-09-19 16:19 gasche Note Added: 0018280
2017-09-19 16:19 gasche Status new => resolved
2017-09-19 16:19 gasche Fixed in Version => 4.06.0 +dev/beta1/beta2/rc1
2017-09-19 16:19 gasche Resolution open => fixed
2017-09-19 16:19 gasche Assigned To => gasche
2017-09-19 16:20 gasche Relationship added child of 0007630
2017-09-19 17:24 psteckler Note Added: 0018281
2017-09-19 18:48 xclerc Note Added: 0018282
2017-09-19 18:49 ejgallego Note Added: 0018283
2017-09-19 18:51 ejgallego Note Edited: 0018283
2017-09-19 20:40 psteckler Note Added: 0018284
2017-09-19 20:52 ejgallego Note Added: 0018286
2017-09-19 22:09 gasche Note Added: 0018287
2017-09-19 22:26 ejgallego Note Added: 0018288
2017-09-19 22:26 ejgallego Note Edited: 0018288
2017-09-19 23:24 psteckler Note Added: 0018289
2017-09-19 23:45 psteckler Note Added: 0018290
2017-09-20 00:04 ejgallego Note Added: 0018291
2017-09-20 06:55 gasche Note Added: 0018292
2017-09-20 10:40 xclerc Note Added: 0018293
2017-09-20 17:07 psteckler Note Added: 0018297
2017-09-28 23:39 psteckler Note Added: 0018390
2017-09-29 16:19 psteckler Note Edited: 0018390

