Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

at_exit functions get called twice if a callback raises and prevents earlier handlers to execute. #7253

Closed
vicuna opened this issue May 10, 2016 · 19 comments

Comments

@vicuna
Copy link

vicuna commented May 10, 2016

Original bug ID: 7253
Reporter: @dbuenzli
Status: acknowledged (set by @damiendoligez on 2016-05-10T13:34:08Z)
Resolution: open
Priority: normal
Severity: minor
Version: 4.03.0
Target version: later
Category: standard library
Tags: junior_job
Related to: #7796
Monitored by: dexterph @hcarty @dbuenzli

Bug description

The behaviour in case of an [at_exit] callback raising is a little bit suboptimal (slightly related to #7178):

  1. A raising callback prevents callbacks registered earlier to be executed.
  2. A raising callback and all the callbacks that come after get called twice.

Steps to reproduce

(* 1. prevents callbacks from running *)

# let () = at_exit (fun () -> print_endline "Once";);;
# let () = at_exit (fun () -> print_endline "Twice"; raise Exit);;
# ^D
Twice
Twice
Fatal error: exception Pervasives.Exit
...

(* 2. Double calls *) 

# let () = at_exit (fun () -> print_endline "Twice"; raise Exit);;
# let () = at_exit (fun () -> print_endline "Once";);;
# ^D
Once
Twice
Once
Twice
@vicuna
Copy link
Author

vicuna commented May 10, 2016

Comment author: @dbuenzli

Forgot to add, this is not a toplevel problem. The problem also occurs on executables.

@vicuna
Copy link
Author

vicuna commented May 10, 2016

Comment author: @dbuenzli

The fix doesn't seem too hard: the call to the callbacks in the chaining should be protected by handler here:

exit_function := (fun () -> f(); g())

The double call is likely to come from the call to Pervasives.do_at_exit by printexc.ml on unhandled exception:

(try Pervasives.do_at_exit () with _ -> ());

So if at_exit chaining properly catches exceptions and prints them I guess the double call should disappear.

@vicuna
Copy link
Author

vicuna commented Jun 1, 2016

Comment author: dexterph

The double calls do disappear, however there is no clear way to print an exception from within at_exit in Pervasives. I think a more complicated change would need to happen.

@vicuna
Copy link
Author

vicuna commented Jul 11, 2016

Comment author: @dbuenzli

So I have a fix for this which works so that each uncaught exception in at_exit functions get reported. However this slightly changes the semantics of the callback set by Printexc.set_uncaught_exception_handler function (introduced in 4.02) since it is supposed there that:

"Note that when fn is called all the functions registered with at_exit have already been called. Because of this you must make sure any output channel fn writes on is flushed."

With this fix, the "have already been called" becomes "may already have been called", since I try to call the at_exit handlers as much as possible until one raises in which case I record the raise, handle the uncaught exception and then handle the at_exit function uncaught exception using the same mechanism.

This could be one option. However I also have the feeling that this way of operating makes it less easy to understand what is happening, for example:

# let () = at_exit (fun () -> print_endline "bla");;
# let () = at_exit (fun () -> raise Exit);;                 
# let () = at_exit (fun () -> print_endline "bli");;        
# let () = at_exit (fun () -> failwith "HEY");;     
# ^D
bli
Fatal error: exception Failure("HEY")
Raised at file "pervasives.ml", line 32, characters 22-33
Called from file "pervasives.ml", line 519, characters 35-39
Called from file "pervasives.ml", line 527, characters 4-17
Called from file "toplevel/toploop.ml", line 547, characters 21-27
Called from file "toplevel/topstart.ml", line 16, characters 8-22
bla
Fatal error: exception Pervasives.Exit
Raised at file "//toplevel//", line 1, characters 34-38
Called from file "pervasives.ml", line 519, characters 35-39
Called from file "printexc.ml", line 263, characters 12-36

So here bli gets printed first because HEY gets uncaught, we try to call as much as possible the at_exit handlers, which leads bli to get called, then Exit is uncaught, so we print HEY and start the uncaught loop for Exit, which leads bla to get called and finally Exit to be printed.

Now it would be as easy and possible to execute and print this in the natural order. So my question is what was the rationale behind calling at_exit functions before the handler ? Would it be a problem to change the semantics so that:

"Note that when fn is called all functions that are registered with at_exit that have not been called yet will be called after the call to the handler" ?

@vicuna
Copy link
Author

vicuna commented Jul 12, 2016

Comment author: @dbuenzli

A fix is proposed in #685. W.r.t. to my previous message it is the version that tries to retain a natural call order w.r.t. to effects.

@vicuna
Copy link
Author

vicuna commented Dec 8, 2016

Comment author: @mshinwell

@dbuenzli: Has there been any further progress on this? #685 appears to have been closed without being merged.

@vicuna
Copy link
Author

vicuna commented Dec 8, 2016

Comment author: @dbuenzli

No. I closed it because it was not a good solution.

@vicuna
Copy link
Author

vicuna commented Dec 8, 2016

Comment author: @mshinwell

Do you want to keep this Mantis issue open or not?

@vicuna
Copy link
Author

vicuna commented Dec 8, 2016

Comment author: @dbuenzli

Well it is an issue no ?

@vicuna
Copy link
Author

vicuna commented May 21, 2018

Comment author: @xavierleroy

Alternate fix proposed here: #1790

@vicuna
Copy link
Author

vicuna commented May 23, 2018

Comment author: @xavierleroy

Merged #1790. It fixes

  • issue 2 (A raising callback and all the callbacks that come after get called twice)
  • some cases of issue 1 (if there is only one raising callback, the callbacks registered earlier will be executed; if one of those raises again, the callbacks registered earlier than that one will not run).

I'll let @dbuenzli decide whether to keep this Mantis issue open.

@github-actions
Copy link

github-actions bot commented May 9, 2020

This issue has been open one year with no activity. Consequently, it is being marked with the "stale" label. What this means is that the issue will be automatically closed in 30 days unless more comments are added or the "stale" label is removed. Comments that provide new information on the issue are especially welcome: is it still reproducible? did it appear in other contexts? how critical is it? etc.

@github-actions github-actions bot added the Stale label May 9, 2020
@dbuenzli
Copy link
Contributor

dbuenzli commented May 9, 2020

I still think it would be nice to have the isolation guarantee that if an at exit handler raises it doesn't prevent others from running.

@gasche
Copy link
Member

gasche commented May 9, 2020

This sounds like something that @gadmm has a lot of opinions about. We discussed related questions in the setting of statmemprof, but it is not at all easy (to me) to decide what to do.

In favor of dropping the exception and continuing with other callbacks:

  • as @dbuenzli says this provides "isolation" between users of at_exit
  • it does a better effort of best-effort callback calling, which can be important in practice if those callback are used for observable cleanup actions (writing things in a persistent log, etc.)

Against:

  • uncaught exception from a callback is a programming mistake that we don't want to ignore silently
  • there may be dependencies between callbacks, and if one failed the others may be in an invalid state (but presumably they should be written to also fail when their assumptions do not hold)
  • uncaught exceptions can be used to preemptively interrupt computation (eg. Sys.Break), so dropping an exception to continue computing may not be so good

Personally this sort of questions makes me a bit wary of using at_exit at all, especially in libraries (that typically need this sort of isolation). I would rather encourage my users to use with_* wrappers to give me control over the dynamic extend of my active usage, and more flexibility in implementing cleanup logic. But I don't have the extensive library-writing experience of @dbuenzli that may have a different perspective.

@gadmm
Copy link
Contributor

gadmm commented May 9, 2020

I agree with @dbuenzli.

@lthls you had opinions about this at #2118.

  • You mention that ideally the exception should be caught, written to the terminal, and ignored. This sounds reasonable as a better design, barring backwards-compatibility questions.
  • You mention that the raising behaviour was useful to write an “exit-catcher” to recover from libraries that call exit wrongly. Is this a use-case to preserve, having in mind backwards-compatibility and possible existence of bad code in the wild (that call exit when they should not)?

Review of Misc.try_finally usage in the compiler codebase (#2118 (comment)) has shown that at_exit is useful as an alternative to the with_ idiom because although it offers a different lifetime, although non-compositional. While calling exit and at_exit should be discouraged from library code, they serve a purpose. For this usage, though, it should be assumed that clean-up functions do not fail, so an exception arising denotes a serious error (C++ and Rust can abort in that case). For compatibility it is probably better to not abort.

For asynchronous exceptions mentioned by @gasche, it will be possible to even mask them while the handlers are running.

In my implementation of with_resource, based on @dbuenzli's and @lthls's feeback, I have been careful to implement a configurable uncaught exception handler that allows the user to decide what to do: ignore or abort, and/or log/print. We could use the same (with the same interface), although mine defaults to aborting (but this behaviour is open to discussion even in the case of with_resource).

As part of evidence-based programming language design, it would be good to audit the uses of at_exit beforehand to be sure the practice is as simple as the theory.

A bit off-topic: do not take this message as evidence that the stale bot is a good idea (although I thank @gasche for pinging me); I am a bit busy and felt pressured by the bot.

@github-actions
Copy link

github-actions bot commented Jul 1, 2022

This issue has been open one year with no activity. Consequently, it is being marked with the "stale" label. What this means is that the issue will be automatically closed in 30 days unless more comments are added or the "stale" label is removed. Comments that provide new information on the issue are especially welcome: is it still reproducible? did it appear in other contexts? how critical is it? etc.

@github-actions github-actions bot added the Stale label Jul 1, 2022
@gadmm
Copy link
Contributor

gadmm commented Jul 1, 2022

Not stale, but the discussion has continued at #11221 as far as I can tell.

@github-actions github-actions bot removed the Stale label Jul 4, 2022
@gasche
Copy link
Member

gasche commented Jan 12, 2023

The two repro cases that @dbuenzli used to open this PR are now working correctly. I understand from tying to re-read this discussion that some issues still exist when an exception is raised by one of the at_exit callbacks. Could we have a new repro case for this, or should we maybe consider closing the present issue and keeping #11221 instead?

@dbuenzli
Copy link
Contributor

No need to keep multiple discussion streams I'm fine with #11221. This can be closed.

(Reminds me that when I suggested on discuss to go over your own issues, I forgot about my @vicuna proxy)

@gasche gasche closed this as completed Jan 13, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

4 participants