Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Really annoying change of semantics of exceptions #6683

Closed
vicuna opened this issue Nov 29, 2014 · 23 comments
Closed

Really annoying change of semantics of exceptions #6683

vicuna opened this issue Nov 29, 2014 · 23 comments

Comments

@vicuna
Copy link

vicuna commented Nov 29, 2014

Original bug ID: 6683
Reporter: @ppedrot
Status: closed (set by @xavierleroy on 2016-12-07T10:37:01Z)
Resolution: not a bug
Priority: normal
Severity: feature
Version: 4.02.0
Category: ~DO NOT USE (was: OCaml general)
Monitored by: @gasche @yakobowski

Bug description

Since OCaml 4.02 and the introduction of extensible variants, the semantics of exceptions have changed. Now, exception constructors without arguments share all their instances, e.g. Exit == Exit.

I assume this change has been made for efficiency reasons, but I may be wrong.

I know that this was not specified, but this is really, really, really annoying. This prevents some tricks that allowed to attach specific information to exceptions, like the global imperative map described somewhere by Jacques-Henri Jourdan, or the trick we use in Coq (file lib/exninfo.ml) which consists in adding a hidden field at the end of exceptions we want to enrich.

Now, we cannot do that anymore for exceptions (or open variants) without arguments, because they are neither unique by pointer nor modifiable in place. In this case, the toplevel block is no more than the unique identifier of the exception, which IS tested for pointer equality by the pattern-matching code (so changing it is a no-no, and we cannot discriminate it uniquely either).

There is also a minor change of the tag of this identifier block from tag 0 to object_tag on all exceptions, but this is not an issue.

Is is conceivable to revert this particular optimization? In particular, remove the boolean of the Cstr_extension constructor and treat all exceptions/open variants alike, meaning a dummy indirection when no arguments are given.

I know this is black magic, and I apologize in advance (unspecified, blah-blah-blah) but I think that whatever the speed gain, this optimization is not worth the loss the aforementioned tricks... Someone to prove me wrong?

@vicuna
Copy link
Author

vicuna commented Nov 29, 2014

Comment author: @xavierleroy

The public beta for OCaml 4.02.0 ended in August 2014. Had you reported this problem earlier, we might have delayed the integration of the new representation of exceptions until Coq sources were fixed.

At any rate, we will not revert to the previous representation of exceptions. The new representation has measurable performance benefits, and is perfectly correct w.r.t. the documented interfaces of OCaml.

As a major user of Coq, I am appaled that 1- Coq uses such unsafe and unspecified tricks in its implementation, and 2- it doesn't even test that they work properly.

@vicuna
Copy link
Author

vicuna commented Nov 29, 2014

Comment author: @xavierleroy

I'm relieved to see that lib/exninfo.ml is not in Coq 8.4 nor in any released version of Coq (if I'm not mistaken). So, I take back the "appalled" comment. Still, this exninfo trick has to go.

@vicuna
Copy link
Author

vicuna commented Nov 29, 2014

Comment author: @ppedrot

Well, we'll try to get rid of it before the 8.5 release... Sigh.

I agree that this was really unsafe, but one cannot deny the fact that this was a workaround for an actual use case. I believe that those use cases will get even more salient with the adoption of open variants. Indeed, one would like to be able to attach information to those objects, and it is practically unfeasible to do so with a new constructor of the form

exception Enrich of exn * data

for huge codebases, because that means that all try-with construct should be aware of the fact that exceptions may be embedded into such a constructor. It is for me a huge issue of the current state of OCaml exceptions.

In Coq in particular, there are various such enriching exceptions, and this is a real mess to handle in a sound way. Actually, there are still some bugs lying around because of this, and I was planning to extend the use of Exninfo from debugging to this sort of cleaning up.

Does anyone know of a clean way to do so? I assume we could make all the try-with aware thanks to code preprocessing, but it seems to me rather ugly. Could OCaml provide us a generic way to embed data in the open variants instead?

@vicuna
Copy link
Author

vicuna commented Nov 29, 2014

Comment author: @jhjourdan

Can't you just write :

try erroneous_code
with e -> match unwrap_exn e with
| Serious_error, info1, info2 -> blah
| e -> raise e

If you think there is too many try..with to modify, you could use a camlp4 rewriter, for example.

Anyway, are there that many try..with constructs in Coq that handle exceptions which we do not control where it will be thrown and whether it will be wrapped or not ?

@vicuna
Copy link
Author

vicuna commented Nov 29, 2014

Comment author: @ppedrot

If we want for instance to add backtraces to such exceptions, the we need to enrich all exceptions. That means that all try-with must be instrumented...

@vicuna
Copy link
Author

vicuna commented Nov 30, 2014

Comment author: nbraud

Couldn't this instumentation be automated using ppx?

@vicuna
Copy link
Author

vicuna commented Dec 2, 2014

Comment author: @ppedrot

Using a ppx-based instrumentation is indeed the cleanest way of doing it, but it does not provide a perfect abstraction. We would have to take care of cases of the form

try foo with e ->
let e = f e in
match e with
| Fail _ -> bar

by hand, handling the decapsulation of the exception in the match, not only on the try with.

As discussed with other people, could we think of using the marker object of the exception to bear more data? This would be provided in standard in OCaml. In the current implementation, instead of being a two-block object (with object_tag), it could be of arbitrary size, with fresh integers generated at runtime for each extended field.

Pattern-matching over exception would be changed into pointer equality of the first field of the marker (or the second as well, for what I understand of the uid) instead of direct marker pointer equality. That way, it would be the best of the two worlds: non-enriched exceptions would still be unique up to pointer equality (and allocation-free, i.e. fast) while we could at the same time add arbitrary data, as desired.

I really think that the issue will keep popping around, in particular because of the addition of open variants. The non-scalability of exception encapsulation would just be the same with open variants, so there is something to be done here anyway...

@vicuna
Copy link
Author

vicuna commented Dec 2, 2014

Comment author: @alainfrisch

into pointer equality of the first field of the marker

I don't think that would work (the first field is a string, and two different exception constructors can share the same string, physically). Comparing on the second field would indeed work (with one more indirection). If we keep the object_tag on the marker, extra fields would be ignored by the generic hash and comparison functions (which would be coherent with pattern-matching).

That said, I'm personally not yet convinced by the need for such a feature. If you want to attach "meta data" potentially to all exceptions, you could add an extra "proplist" argument to all exceptions defined in your project, explicitly. Are you concerned by exceptions defined outside the project, i.e. in the stdlib or whatever external library Coq depends upon? How often do they propagate and need to be matched "non-locally"? Typically, exceptions from "external code" are either captured very close to the place where they are raised, or they propagate to a generic outer driver (e.g. that logs the exception and resumes or not the operation).

@vicuna
Copy link
Author

vicuna commented Dec 2, 2014

Comment author: @ppedrot

We can indeed add an extra field to all our exceptions, which would be a little unwieldy (and ugly), but doable, as every change would be detected statically. If we want to be totally safe, this implies having to register handlers of additional info (which may result costly), or use a pointer-equality check on the last field of objects (less safe, but quicker).

The problem is indeed the externally-defined exceptions. In Coq, the most infamous anomaly is the "Anomaly: uncaught exception Not_found", which is rather well-known and ubiquitous, considering the vast quantity of code from OCaml stdlib potentially raising the Not_found exception.

It is very, very difficult to follow the flow of exceptions in a program as complex as Coq. There is an original sin, which is that the exception-handling architecture of Coq is blatantly ugly and intricate. It is true that most exceptions fall in one of your two categories. Yet, the problem is that the very moment at which a raised exception enters the second category (i.e. when it should be encapsulated) is not very clear...

@vicuna
Copy link
Author

vicuna commented Dec 2, 2014

Comment author: @gasche

If we want to be totally safe, this implies having to register handlers of additional info (which may result costly), or use a pointer-equality check on the last field of objects (less safe, but quicker).

I'm not sure what you mean here. Note that 4.02 provides an API to (rather efficiently) copy the callstack at the point of raise rather than in try..with, which could remove the need to instrument handlers -- just implement an auxiliary exception-raising function on top of raise.

Yet, the problem is that the very moment at which a raised exception enters the second category (i.e. when it should be encapsulated) is not very clear...

Could we implement an ad-hoc program analysis to detect this by inspecting the typedtree? I'm thinking of detecting all calls to functions not defined inside the Coq project itself (which would be either Camlp5 or the standard library), and ensuring a wrapper catches their exceptions.

@vicuna
Copy link
Author

vicuna commented Dec 3, 2014

Comment author: @ppedrot

I meant that you need to have a way to know that an exception accepts enriching. For instance, you can enrich

exception Foo of int * Enrich.t

but you cannot for

exception Not_found

because the former has an Enrich.t field while the latter has not. If we want to do that uniformly on exception (be it at raise time or at catching time), we need to register handler for each declared exception, as for printing of exceptions, i.e.

exception Foo of int * Enrich.t
let _ = register
begin fun data e -> match e with
| Foo (i, e) -> Foo (i, enrich data e )
|

@vicuna
Copy link
Author

vicuna commented Dec 3, 2014

Comment author: @ppedrot

(Continuation, the tab key commited my message for some strange reason)

exception Foo of int * Enrich.t
let _ = register
begin fun data e -> match e with
| Foo (i, e) -> Foo (i, enrich data e )
| _ -> raise Exit
end

which may be costly and tiresome, if we need to do that for all exceptions.

Note that 4.02 provides an API to (rather efficiently) copy the callstack at the point of raise rather than in try..with

This is not what we want: we only want the callstack up to the first try-with clause, which is, if I am not mistaken, impossible to do with the current API.

I'm thinking of detecting all calls to functions not defined inside the Coq project itself

This is only an approximation. How do you handle higher-order functions whose arguments may raise an exception? Typically, List.fold and the like...

@vicuna
Copy link
Author

vicuna commented Dec 3, 2014

Comment author: @alainfrisch

For exceptions raised by stdlib or external libraries, I'd recommend to always match them explicitly very close to the call of the function which raises them. (Or even wrap those functions to return an explicit sum/option type.) If you fail to do so because you believe an exception cannot be raised, but at runtime an exception is indeed raised and escapes (because of a logic bug in your code), well, this is a bug, and simply displaying the stack trace in a global exception handler should be enough to spot the bug. You can also adopt a more defensive style, where you'd write e.g.

try Hashtbl.find tbl key with Not_found -> assert false

everywhere you believe the exception should not be raised.

Can you give some more ideas about what you'd do with Enrich.t?

@vicuna
Copy link
Author

vicuna commented Dec 3, 2014

Comment author: @ppedrot

In Coq, we want to enrich exceptions with:

  1. Callstacks for debugging
  2. Source code locations
  3. Unique identifiers for the State Transaction Machine (introduced in v8.5)
  4. Additional information like: this exception belongs to some class, preventing them from being caught by certain handlers (we have a bunch of hackish functions to do this, but it is not extensible).

More generally, the main issue is the openness of exceptions: we want to do a posteriori open pattern-matchings.

@vicuna
Copy link
Author

vicuna commented Dec 3, 2014

Comment author: @lpw25

I've given this almost no thought at all, but it seems that most of what you want is to attach additional information to backtraces rather than exceptions, so that when you have an exception that accidentally escapes you can get more information than just the call stack.

So perhaps a more reasonable approach would be to provide some means of attaching information to the current backtrace (perhaps using an extensible variant) which can then be extracted from the backtrace after catching an exception.

With such a facility you could then replace raise with your own version that attached the required information. Then if a bug causes an exception to escape you could extract this information (along with the rest of the backtrace) when you catch the exception.

I don't really know how backtraces work, so perhaps this is not practicable to implement.

@vicuna
Copy link
Author

vicuna commented Dec 3, 2014

Comment author: @mshinwell

#6556 should probably be considered too. Ugh.

@vicuna
Copy link
Author

vicuna commented Dec 3, 2014

Comment author: @ppedrot

@lpw25:

No, it is really about exception themselves. If I am not mistaken, there are parts of Coq code relying on the fact that exceptions are wearing such additional information with them, without any try-with clause in sight. In addition, backtraces are really an imperative thing that are only alive at a given point, while enriched-exceptions are pure objects. The imperativeness of backtraces is itself a problem, because that prevents idioms such as:

try ... with e when f e -> ...

because even the call to f may destroy the backtrace (by raising and catching exceptions itself internally, typically), which deeply breaks the abstraction.
If exceptions were to wear additional information, we could simply think of an instrumenting raise adding the data to the exception once and for all, without any such fragile code.

In addition, I am rather confident that the open variants will display the very same issue, that is, the fact that open variant containers are not usable in practice, because they do not mix well with the openness of matching functions. Not even to mention the fact that they do not mix well with themselves... If you have

exception More of exn * foo
exception Plus of exn * bar

you need to prevent arbitrary towers of containers, be it alternations of More and Plus (implying a canonical order to desambiguate) or towers made only of one container. In short, exception containers are a mess.

@vicuna
Copy link
Author

vicuna commented Dec 3, 2014

Comment author: @alainfrisch

I still don't get the big picture that would justify surgery on the language/runtime. There are other big OCaml projects around, with production requirements, and the need to attach "hidden" meta-data to exceptions hasn't popped up before, so I guess they are other decent ways around.

I assume that additional meta-data is to support complex cases, where the link between the raise site and the handler site is not clear statically. Are there such cases where something clever is done on the exceptions (in addition to printing it)?

You could also have one generic exception container:

exception CoqExn of exception * proplist

val coq_raise: exn -> 'a (* actually wrapping into CoqExn, with an empty proplist, or with a proplist containing basic info such as the top N elements of the current callstack *)

(where proplist is a generic mutable container)

and ensure to never use "raise" explicitly in your code base, and to always match on CoqExn in your exception handlers. The only problem would be with exception raised outside your code base (stdlib + camlp4, something else?). At least for the stdlib, it small enough so that it shouldn't be a problem to use wrapper function that either turn exceptions into CoqExn or into options/sum types.

@vicuna
Copy link
Author

vicuna commented Dec 3, 2014

Comment author: @ppedrot

the need to attach "hidden" meta-data to exceptions hasn't popped up before

I don't know what was the use case of Jacques-Henri original imperative trick to do this, but it was surely motivated. We should ask him...

You could also have one generic exception container

At the risk of looking obsessive, there are rather serious issues with this design, which is implementable theoretically, but from my point of view, not practically.

  1. It is really tedious and bug-prone to do this by hand.
  2. If using a ppx to do the transformation automatically, this is just an approximation and special care has to be taken to ensure soundness in some (hopefully few, but unluckily statically undetectable) cases
  3. It does not scale (think of various libraries trying to enrich their exceptions their own incompatible way)
  4. And, as you remark, foreign functions may mess with the assumptions of enrichment (and in particular, higher-order functions).

Moreover, I do not find realistic to wrap all functions coming from the outside world. There are quite a few functions in OCaml stdlib, to start with, without mentioning we need to patch functors and the like!

@vicuna
Copy link
Author

vicuna commented Dec 3, 2014

Comment author: @lpw25

If it not the backtraces (i.e. the calls to raise) that you want to annotate, then I don't think the request makes sense. Exceptions are not mutable, they have no notion of identity. One Not_found is the same as another Not_found.

The code:

try
   foo ()
with Not_found -> raise Not_found

is perfectly valid and should be a no-op, but would completely ruin your "hidden metadata" regardless of how exceptions are represented.

@vicuna
Copy link
Author

vicuna commented Dec 3, 2014

Comment author: @ppedrot

I am sorry, you're actually right.

Upon code inspection, the data we attach to exceptions is only given by the raise site... So that annotating the raises should be sufficient. But this is difficult to do in a scalable (exit containers) and thread-safe fashion (exit global table).

@vicuna
Copy link
Author

vicuna commented Dec 3, 2014

Comment author: @alainfrisch

If the information is extracted from the raise site, you also know locally which exception is raised (unless you pass exceptions-to-be-raised around), so adding explicitly one argument to those exceptions should be fine. This doesn't cover exceptions raised by external code, but you cannot instrument their raise site anyway, so I don't see how you would extract the required information.

@vicuna
Copy link
Author

vicuna commented Dec 3, 2014

Comment author: @ppedrot

Alas, we do pass exceptions around a lot. There are many encapsulating functions that serve as try-finally in the codebase. It is the main reason for which the exception flow is unintelligible...

After all your helpful comments, I think I will stick to a thread-indexed, imperative, mutex-protected global table to register information for the last exception raised, mimicking the current behaviour of the get_backtrace function. This will need a bit of work to ensure that there is no memory leak due to the retained information, but I hope I will get alive of this mess.

Sorry for the annoyance.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

1 participant