New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Customizable number of context rows in pattern matching compilation #6913
Comments
Comment author: @gasche If you don't have a large pattern-matching somewhere in the program, it's unlikely the slowdown comes from the pattern-matching compiler. If you have a large pattern matching, you should attach it -- it's hard to pay attention without a reproducible test case. |
Comment author: dwight.guth I didn't include the file originally because it was 7 megabytes which is larger than the listed maximum attachment size. But if you are interested, I have uploaded it here: http://office.runtimeverification.com/files/def.ml You can compile it with |
Comment author: @damiendoligez Have you also benchmarked the run time of the program itself with and without the decreased constant value? I'd be interested in seeing the results. |
Comment author: dwight.guth I have not tested the code with a constant factor of 32 (the default), because even with a constant factor of 16 it takes half an hour to compile. But I compared the performance of the program with a factor of 4 and a factor of 16. The factor of 4 took 4 minutes to compile instead of half an hour, but did not seem to have a significant effect on runtime performance. Both took roughly 42-44 seconds to run total, out of which cpu profiling reports that roughly 2 seconds was spent in self time in functions with a significant amount of matching. I would have to let the program compile overnight in order to determine whether a factor of 32 causes a significant decrease in time spent in matching, but at best it will cause only a marginal improvement in performance, because of Amdahl's law. In any event, this seems to indicate that significant benefits can be reaped by being able to adjust the number of rows of context in this example. Note that the version of the program that I just uploaded will probably give you somewhat different results than I just described, because I have modified both files so that they can be built without access to the custom version of mlgmp that I wrote. If you are interested in the original versions, I can provide the original source of all three (mlgmp, def.ml, and pgm.ml) to you as well. |
This issue has been open one year with no activity. Consequently, it is being marked with the "stale" label. What this means is that the issue will be automatically closed in 30 days unless more comments are added or the "stale" label is removed. Comments that provide new information on the issue are especially welcome: is it still reproducible? did it appear in other contexts? how critical is it? etc. |
This feature was implemented in #1786, with a |
Original bug ID: 6913
Reporter: dwight.guth
Status: acknowledged (set by @damiendoligez on 2017-03-01T12:29:27Z)
Resolution: open
Priority: normal
Severity: feature
OS: Ubuntu
OS Version: 14.04
Version: 4.02.1
Category: middle end (typedtree to clambda)
Monitored by: @gasche @maranget
Bug description
According to the paper written about the compilation of pattern matching, experiments were done to show that a balance of good compilation times and fast performance at runtime could be found by limiting the maximum number of rows of context when computing the jumps to 32 rows. However, it seems that this is not true in all cases. I have written a program (too large to attach and it is its complexity which contributes to the problem at hand) which seems to take a very substantial amount of time to compile (>15 minutes, I have not actually timed it to completion), as a result of time spent in this computation. I have tested this on the ocaml trunk, so I don't believe this is negated by the changes to pattern matching made since the 4.02.1 release. I would be interested in seeing if an advanced option to ocamlc and ocamlopt could be specified which allows the user to manually configure this threshold. I tested a change in which it was changed to a decreased constant value and it seemed to substantially decrease the amount of time spent in compilation, which would suggest that configurability of this option would be useful when users would like to be able to balance different trade-offs at different points in the development lifecycle between compilation speed and execution speed. For example, I might want relatively fast compilation times during active development, and then be willing to let the code build overnight before publishing a release.
File attachments
The text was updated successfully, but these errors were encountered: