Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

regression in conversion typedtree->lambda in 4.03 #7302

Closed
vicuna opened this issue Jul 22, 2016 · 9 comments
Closed

regression in conversion typedtree->lambda in 4.03 #7302

vicuna opened this issue Jul 22, 2016 · 9 comments
Assignees

Comments

@vicuna
Copy link

vicuna commented Jul 22, 2016

Original bug ID: 7302
Reporter: @sliquister
Assigned to: @lpw25
Status: closed (set by @xavierleroy on 2017-09-24T15:33:13Z)
Resolution: fixed
Priority: normal
Severity: minor
Version: 4.03.0
Fixed in version: 4.04.0 +dev / +beta1 / +beta2
Category: back end (clambda to assembly)

Bug description

ocamlopt.opt can build the program [1] using very little stack space at 4.02.3; I can't even get it to overflow with ulimit -s 0.
At 4.03, it uses at least 5MB of stack (or whatever ulimit -s 5000 means), overflowing the stack [2] for programs that used to build just fine.

Looking at the output of -dtypedtree, -dlambda, -rawlambda:
$ ls -l d*
1.6M Jul 22 15:32 dlambda1
53K Jul 22 15:33 dlambda2
2.1M Jul 22 15:37 drawlambda1
105K Jul 22 15:40 drawlambda2
80K Jul 22 15:35 dtypedtree1
15K Jul 22 15:35 dtypedtree2
1 is 4.03, 2 is 4.02.3.
The typedtree is the same (except in 4.03 it says things Tstr_modtype instead of Pstr_modtype, and the big backtrace at the end).
The lambda code, on the other hand, is much bigger with 4.03 and is probably what is making the backend blowup later. Where dlambda2 contains a bunch of aliases to modules like (global Core_kernel__Core_set!), dlambda1 seems to recurse through all the structure and alias only values at the leaves:
(let (let/1551826 =a (global Core_kernel__Core_set!))
(makeblock 0 (field 0 let/1551826)
(field 1 (global Core_kernel__Core_set!))
(field 2 let/1551826) (field 3 let/1551826)
(field 4 let/1551826) (field 5 let/1551826)
(field 6 let/1551826) (field 7 let/1551826)
(field 8 let/1551826) (field 9 let/1551826)
(field 10 let/1551826) (field 11 let/1551826)
a bunch more lines

The code of core_kernel itself changed a bit during the switch to 4.03, but I don't think it could explain this.

[1]
module type S = sig
include module type of (struct
module Core_kernel = struct
module Std =
(Core_kernel.Std : module type of struct include Core_kernel.Std end
with module In_channel := Core_kernel.Std.In_channel
with module Out_channel := Core_kernel.Std.Out_channel)
end
include Core_kernel.Std
end)
end
module Imports : S = struct
module Core_kernel = struct
module Std =
(Core_kernel.Std :
module type of struct include Core_kernel.Std end
with module In_channel := Core_kernel.Std.In_channel
with module Out_channel := Core_kernel.Std.Out_channel)
end
include Core_kernel.Std
end

[2]
Fatal error: exception Stack overflow
Called from file "map.ml", line 110, characters 18-33
Called from file "map.ml", line 117, characters 21-33
Called from file "map.ml", line 114, characters 21-33
Called from file "map.ml", line 117, characters 21-33
Called from file "map.ml", line 117, characters 21-33
Called from file "map.ml", line 117, characters 21-33
Called from file "asmcomp/CSEgen.ml", line 53, characters 32-65
Called from file "asmcomp/CSEgen.ml", line 183, characters 22-61
Called from file "asmcomp/CSEgen.ml", line 323, characters 23-66
Called from file "asmcomp/CSEgen.ml", line 331, characters 24-42
Called from file "asmcomp/CSEgen.ml", line 324, characters 29-47
Called from file "asmcomp/CSEgen.ml", line 331, characters 24-42
Called from file "asmcomp/CSEgen.ml", line 324, characters 29-47
Called from file "asmcomp/CSEgen.ml", line 331, characters 24-42
Called from file "asmcomp/CSEgen.ml", line 324, characters 29-47
Called from file "asmcomp/CSEgen.ml", line 331, characters 24-42
Called from file "asmcomp/CSEgen.ml", line 324, characters 29-47
Called from file "asmcomp/CSEgen.ml", line 331, characters 24-42
Called from file "asmcomp/CSEgen.ml", line 324, characters 29-47
Called from file "asmcomp/CSEgen.ml", line 331, characters 24-42
Called from file "asmcomp/CSEgen.ml", line 324, characters 29-47
Called from file "asmcomp/CSEgen.ml", line 331, characters 24-42
Called from file "asmcomp/CSEgen.ml", line 324, characters 29-47
Called from file "asmcomp/CSEgen.ml", line 331, characters 24-42
Called from file "asmcomp/CSEgen.ml", line 324, characters 29-47
Called from file "asmcomp/CSEgen.ml", line 331, characters 24-42
Called from file "asmcomp/CSEgen.ml", line 324, characters 29-47
Called from file "asmcomp/CSEgen.ml", line 331, characters 24-42
Called from file "asmcomp/CSEgen.ml", line 324, characters 29-47

@vicuna
Copy link
Author

vicuna commented Jul 22, 2016

Comment author: @gasche

One possibility could be the changes made by Alain to the module compilation strategy, either

"#6343, #5537, #5573: improving access to values in nested modules."
ebd01bc

or

"Fix #7067: performance regression when compiling large nested structures in native code"
9d4b3a4

If you feel motivated enough, trying to revert either commits, or checkout just before them, could be informative.

(This could also be a change related to other parts, eg. typing of module aliases or related coercions, etc.)

@vicuna
Copy link
Author

vicuna commented Jul 22, 2016

Comment author: @sliquister

The first commit seems too far down in the compiler pipeline to cause what I am seeing, but the second one is suspicious indeed.

@vicuna
Copy link
Author

vicuna commented Jul 25, 2016

Comment author: @alainfrisch

I'll look at it.

That said, I'm surprised by the compilation time in bytecode as well:

$ time ocamlfind ocamlc -package core_kernel -c foo.ml
real	0m22.360s
user	0m22.004s
sys	0m0.220s

Since bytecode is not affected by the suspected commit, I'm wondering if it is not something will module aliases which is broken. Can other confirm the very slow compilation in bytecode for the sample program? Does this time sound reasonable (perhaps it is, I've never used Core_kernel)?

@vicuna
Copy link
Author

vicuna commented Jul 25, 2016

Comment author: @alainfrisch

Compiling

module Foo = struct
  let x1 = 1
  let x2 = 2
  module A1 = struct
    let a1 = 1
    module B2 = struct
      let b2 = 2
    end
  end
end

module Bar = (Foo : module type of Foo)

gives the following lambda code (with ocamlc):

(setglobal Bar!
  (let
    (Foo/1199 =
       (let
         (x1/1200 = 1
          x2/1201 = 2
          A1/1202 =
            (let
              (a1/1203 = 1
               B2/1204 = (let (b2/1205 = 2) (makeblock 0 b2/1205)))
              (makeblock 0 a1/1203 B2/1204)))
         (makeblock 0 x1/1200 x2/1201 A1/1202)))
    (makeblock 0 Foo/1199
      (makeblock 0 (field 0 Foo/1199) (field 1 Foo/1199)
        (makeblock 0 (field 0 (field 2 Foo/1199))
          (field 1 (field 2 Foo/1199)))))))

We see that the blocks for Foo and Foo.A1 are copied. Compare with 4.02.3:

(setglobal Bar!
  (let
    (Foo/1014 =
       (let
         (x1/1008 = 1
          x2/1009 = 2
          A1/1013 =
            (let
              (a1/1010 = 1
               B2/1012 = (let (b2/1011 = 2) (makeblock 0 b2/1011)))
              (makeblock 0 a1/1010 B2/1012)))
         (makeblock 0 x1/1008 x2/1009 A1/1013)))
    (makeblock 0 Foo/1014 Foo/1014)))

The commit 9d4b3a4 might be causing the increased stack usage, but the root cause of the problem seems to be the regression above. I assume this is related to module aliases, so if someone more familiar with them could look at it, it would be helpful.

@vicuna
Copy link
Author

vicuna commented Jul 25, 2016

Comment author: @alainfrisch

The trouble is indeed that the module coercions passed to simplify_structure_coercion contain Tcoerce_alias in place of Tcoerce_none. Even:

module type S = sig
  module A1 : sig
    val a1: int
  end
end

module Foo : S = struct
  module A1 = struct
    let a1 = 1
  end
end

module Bar = (Foo : S)

illustrates the problem (without [module type of]); the coercion is:

struct [0, alias Foo.A1 (id)] [A1_1199, 0, id]

I'll let Jacques or someone else continue from there... In particular, is it ok to do as below?

diff --git a/typing/includemod.ml b/typing/includemod.ml
index a28f163..b8efdcc 100644
--- a/typing/includemod.ml
+++ b/typing/includemod.ml
@@ -197,12 +197,17 @@ and print_coercion3 ppf (i, n, c) =

 (* Simplify a structure coercion *)

+let rec is_id_coercion = function
+  | Tcoerce_none -> true
+  | Tcoerce_alias (_, cc) -> is_id_coercion cc
+  | _ -> false
+
 let simplify_structure_coercion cc id_pos_list =
   let rec is_identity_coercion pos = function
   | [] ->
       true
   | (n, c) :: rem ->
-      n = pos && c = Tcoerce_none && is_identity_coercion (pos + 1) rem in
+      n = pos && is_id_coercion c && is_identity_coercion (pos + 1) rem in
   if is_identity_coercion 0 cc
   then Tcoerce_none
   else Tcoerce_structure (cc, id_pos_list)

@vicuna
Copy link
Author

vicuna commented Jul 25, 2016

Comment author: @lpw25

That fix doesn't look right to me since it ignores the alias's path.

However, this problems looks like an instance of a problem which

#708

fixes. Certainly the coercion example I give there looks a lot like this example.

@vicuna
Copy link
Author

vicuna commented Jul 25, 2016

Comment author: @garrigue

I agree that this is typically the kind of problem Leo's patch could remedy (avoiding copying when there is no need to).
Does it help?
This is the kind of situation where it is better to progress than to revert.

@vicuna
Copy link
Author

vicuna commented Jul 25, 2016

Comment author: @lpw25

I think it fixes the problem: The output from -dlambda goes from 25,000 lines to 2,500.

@vicuna
Copy link
Author

vicuna commented Jul 28, 2016

Comment author: @gasche

Fixed by merging lpw25's #708:
#708

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

2 participants