Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Lifting allocation of constant exception constructors? #6203

Closed
vicuna opened this issue Oct 4, 2013 · 15 comments
Closed

Lifting allocation of constant exception constructors? #6203

vicuna opened this issue Oct 4, 2013 · 15 comments
Assignees

Comments

@vicuna
Copy link

vicuna commented Oct 4, 2013

Original bug ID: 6203
Reporter: @alainfrisch
Assigned to: @alainfrisch
Status: closed (set by @xavierleroy on 2015-12-11T18:25:15Z)
Resolution: fixed
Priority: normal
Severity: minor
Fixed in version: 4.02.0+dev
Category: back end (clambda to assembly)
Related to: #5935
Monitored by: @gasche @diml @jmeber @yakobowski

Bug description

Constant exception constructors (Not_found, Exit) are quite common. It seems that any reference to such a constructor allocates a fresh exception value. Would it break something to pre-allocate such values (either by exporting them from the declaring module as an extra value, in addition to the exception slot itself; or by lifting their allocation at the top of the unit where they are used)? Or is the physical identity of exception values somehow used?

@vicuna
Copy link
Author

vicuna commented Oct 4, 2013

Comment author: @garrigue

Yes it would.
The semantics of exceptions is generative.
See the blog post by Bob Harper on why they should be so:
http://existentialtype.wordpress.com/2012/12/03/exceptions-are-shared-secrets/

By the way he is wrong about ocaml: in ocaml functors are applicative,
but exceptions are generative.

@vicuna
Copy link
Author

vicuna commented Oct 4, 2013

Comment author: @gasche

Pierre Chambart ongoing work on a big ball of inlining and value analysis (e.g. http://www.ocamlpro.com/blog/2013/07/11/inlining-progress-report.html ) has gotten some noticeable performance improvements from not allocating exceptions when possible (I hope that his changes are correct, but he may have some exception-specific transformations that make implicit semantic assumptions).

Alain, if you are interested in this specific optimization, it may be interesting to contact Pierre and maybe start review some of his changes.

@vicuna
Copy link
Author

vicuna commented Oct 4, 2013

Comment author: @lpw25

I'm a little confused here: I thought that creating a new exception constructor needed to be generative (as discussed in Bob Harper's blog), but that creating a particular reference to a constructor did not need to be generative.

In other words,

Not_found == Not_found

could safely return true.

This seems to be what Alain is talking about in this issue. I'm actually quite surprised that this is not already the case (it is true for my open types work).

@vicuna
Copy link
Author

vicuna commented Oct 4, 2013

Comment author: @alainfrisch

I can only concur with Leo's comment (and I was also surprised): while the exception declaration needs to be generative, I don't see why the application of a constant exception constructor should be.

Jacques: how would you observe the difference between

let not_found = Not_found

let f x = .... raise not_found

and

let f x = .... raise Not_found

without explicitly using (==) on exception values ?

@vicuna
Copy link
Author

vicuna commented Oct 4, 2013

Comment author: @garrigue

OK, I got confused with your naming it "constructor".
For me the real constructor is that path to the shared value.
And I learnt the hard way that this is a value that you must retrieve (new it, but...)

Right, it seems that for constant exceptions the code allocates a 1-sized block containing this pointer.
It should be safe to share this part.
Lifting it to the top-level of a module should be fine, IF the value pointer is a global one.

@vicuna
Copy link
Author

vicuna commented Oct 5, 2013

Comment author: @lpw25

In my open types work (where I am free to chose the representation) I represent constant constructors as just a pointer to the "constructor block", while other constructors are represented as blocks whose first field is a pointer to their "constructor block".

This works because I ensure that the first field of all constructor blocks is 0, so it is safe to compare a constructor block with a normal constructor's block.

I also tag the constructor blocks with the object tag and put a fresh id in their second field. This means that structural comparison works correctly on them (unlike exceptions: #4765).

I think it would be a definite improvement to change the exception representation to this, although I don't know if it would break legacy C bindings.

@vicuna
Copy link
Author

vicuna commented Oct 7, 2013

Comment author: @alainfrisch

Physical identity of exception values is indeed used in the backtrace mechanism, see backtrace.c:

void caml_stash_backtrace(value exn, code_t pc, value * sp)
{
code_t end_code = (code_t) ((char *) caml_start_code + caml_code_size);
if (pc != NULL) pc = pc - 1;
if (exn != caml_backtrace_last_exn) {
caml_backtrace_pos = 0;
caml_backtrace_last_exn = exn;
}
....

This is used to coalesce stack fragments upon re-raise of the caught exception.

@vicuna
Copy link
Author

vicuna commented Oct 7, 2013

Comment author: @gasche

Indeed, but I don't think sharing exception values would break any contract with users of the backtrace mechanism. My understanding of re-raising is that it's meant to improve readability (by trashing less backtraces), and that detecting more opportunity for re-raising would still be correct. I would consider it only a (tweakable) implementation detail that

try e with Not_found as exn -> raise exn

and

try e with Not_found -> raise Not_found

currently don't behave in the same way with respect to re-raising.

@vicuna
Copy link
Author

vicuna commented Oct 10, 2013

Comment author: @alainfrisch

I'm more concerned about situations where really unrelated exceptions (same generic constructor, raised from unrelated locations) are coalesced in a single meaningless trace.

@vicuna
Copy link
Author

vicuna commented Oct 11, 2013

Comment author: @alainfrisch

Here is an example:

let x = Exit

let f () = raise x
let g () = raise x

let () =
(try f () with _ -> prerr_endline (Printexc.get_backtrace ()));
(try g () with _ -> prerr_endline (Printexc.get_backtrace ()));

It prints:

========================================================
Raised at file "exn.ml", line 3, characters 17-18
Called from file "exn.ml", line 7, characters 7-11

Raised at file "exn.ml", line 3, characters 17-18
Called from file "exn.ml", line 7, characters 7-11
Re-raised at file "exn.ml", line 4, characters 17-18
Called from file "exn.ml", line 8, characters 7-11

Clearly, users would not expect the first trace to be kept for the second exception.

The current situation with constant exceptions ensures that this never happens as long as the raise is on explicit exception constructors.

I propose to change the strategy used to detect "new traces". Basically, we need to know if a raise operation is a "re-raise" (--> add to the current trace) or a fresh raise (--> start with a fresh trace). If we can distinguish them, we don't need to rely on the physical equality of exception value to decide. For re-raise inserted by the compiler (in handlers which don't match the current exception) this is easy. For "user re-raises", we could have some basic heuristics in the compiler, detecting calls to "raise" where the argument comes directly from the inner most handler pattern:

try...with exn -> .... raise exn ....

or:

try...with (Foo | Bar) as exn -> .... raise exn ....

Another option is to expose an intentional "reraise" function directly. Code would need to be adapted to use it, but a compiler warning (using the same heuristics as above) could help. I think I have a preference for the automatic heuristic to detect re-raise, though.

Is anyone against the idea of keeping track of which raises are actually re-raises, and use this information instead of physical equality of exception values in the trace mechanism? (Actually, we probably need to check both the "re-raise" flag and the physical equality, in case the trace has been destroyed by some other piece of code.)

@vicuna
Copy link
Author

vicuna commented Oct 11, 2013

Comment author: @lpw25

How about supporting re-raising of different exceptions with messages like:

Raised "Foo" at file "exn.ml", line 3, characters 17-18
Called from file "exn.ml", line 7, characters 7-11
Re-raised as "Bar" at file "exn.ml", line 4, characters 17-18
Called from file "exn.ml", line 8, characters 7-11

Then (in addition to providing an explicit re-raise function) we could simply treat all raises that are statically within a handler as re-raises.

So:

try
raise Foo
with Foo -> raise Bar

is a re-raise, but:

let f () =
...;
raise Bar

try
raise Foo
with Foo -> f ()

isn't.

This heuristic is very predictable for users, and the ability to track re-raises of different exceptions can be very useful for debugging some styles of code (e.g. parts of the compiler itself).

@vicuna
Copy link
Author

vicuna commented Oct 14, 2013

Comment author: @alainfrisch

  • Regarding re-raising of different exceptions: why not, but I'm tempted to leave it to the programmer to manipulate such traces. Now that we provide structured access to the trace, it is possible to manually capture the first segment and insert it in the second segment explicitly (as an extra argument of the second exception).

  • I've created a branch "raise_variants" in the SVN, supporting an internal "reraise" primitive. It is used by the compiler for reraising an exception when the handler does not catch it, and also in the following forms:

    try ... with id -> ... raise id ...
    try ... with ... as id -> ... raise id ...

"reraise" is not exposed as a user-accessible function for now (although it is trivial to do so, in order to support abstracting the handler code).

With this change, the "problematic" example above works fine (the two segments are not merged). One can of course build example where it breaks traces, but I don't expect this to be common, and on the other hand, it reduces the risk of getting bad traces and enable the optimization suggested in this ticket (no allocation on references to constant constructors).

The branch also adds a "raise_notrace" function (see #5935).

@vicuna
Copy link
Author

vicuna commented Oct 18, 2013

Comment author: @alainfrisch

Commits 14235, 14236 on the "raise_variants" branch changes the representation of exceptions values built from constant exception constructors. The new representation is just the exception slot itself (i.e. the "string ref").

Here are the results for the following micro-benchmark (with ocamlopt, without -g, without OCAMLRUNPARAM=b=1):

let () =
let h = Hashtbl.create 1024 in
for i = 1 to 100000000 do
try Hashtbl.find h i
with Not_found -> ()
done

trunk: 2.52s
branch: 2.28s

The difference cannot be observed with backtraces enabled, though, even if the version from the branch does not allocate at all (contrary to trunk).

@vicuna
Copy link
Author

vicuna commented Oct 18, 2013

Comment author: @alainfrisch

One point I did not consider is how the backtrace mechanism behaves in a multi-threaded application. I don't know what's the current behavior, but it is possible that the proposal (marking reraise statically to allow sharing constant exception constructor) does not play nice with threads.

@vicuna
Copy link
Author

vicuna commented Nov 13, 2013

Comment author: @alainfrisch

raise_variants branch has been merged into trunk (rev 14289).

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

2 participants