Mantis Bug Tracker

View Issue Details Jump to Notes ] Issue History ] Print ]
IDProjectCategoryView StatusDate SubmittedLast Update
0005899OCamlOCaml standard librarypublic2013-01-22 15:192013-06-19 22:29
Reporterfrisch 
Assigned Tojacques-henri.jourdan 
PrioritynormalSeverityfeatureReproducibilityhave not tried
StatusassignedResolutionopen 
PlatformOSOS Version
Product Version 
Target VersionFixed in Version 
Summary0005899: Expose a way to inspect the current call stack
DescriptionThe machinery required to support exception traces actually makes it possible to return, at a given point in the program, (a fragment of) the call stack of the current context. This can be useful on its own. I can think of the following uses:

  - debug;

  - log the context of an error, where it is detected, but without failing;

  - run the code with exception traces disabled, for performance reasons, but manually extract the stack and put it as an argument of some specific exception, when it is raised;

  - instrument the code to perform context-sensitive profiling (how many times is this function called and from which call sites).


I attach a patch which concretely reuses the mechanism for exception traces. With more effort, it would be possible to do something cleaner. And there is no reason to put the new feature in Printexc, since it is not related to exceptions at all.
TagsNo tags attached.
Attached Filesdiff file icon current_call_stack.diff [^] (4,480 bytes) 2013-01-22 15:19 [Show Content]
patch file icon 0001-PR-0005899-exposing-a-way-to-inspect-current-call-st.patch [^] (8,509 bytes) 2013-04-22 14:07 [Show Content]
diff file icon caml_stash_current_backtrace.diff [^] (10,457 bytes) 2013-06-15 23:11 [Show Content]
diff file icon printexc_get_callstack.diff [^] (5,013 bytes) 2013-06-15 23:11 [Show Content]

- Relationships
related to 0005935acknowledged A faster version of "raise" which does not maintain the backtrace 

-  Notes
(0008779)
dbuenzli (reporter)
2013-01-22 15:48

A little bit OT but could we at the same time have the same resulting type as Printexc.current_call_stack for a variant of Printexc.get_backtrace ? I already had to parse this string more than once (notably to be able to pretty print it). Thanks.
(0008780)
frisch (developer)
2013-01-22 15:48

Daniel: agreed!
(0008781)
shinwell (developer)
2013-01-22 15:54

We already have something similar in Core (see module [Backtrace]). I was planning to clean this up; maybe we could merge these patches.
(0008782)
frisch (developer)
2013-01-22 16:01

Mark: great! Let's wait to see if there is any theoretical opposition against the proposal from other core developers. I really believe the feature deserves to be in the core system.
(0008783)
dim (developer)
2013-01-22 16:47

I tried that too for Lwt once ;-) But it wasn't enough so I didn't proposed it.

I'm also for having this.
(0009129)
jacques-henri.jourdan (manager)
2013-04-16 11:14

What I propose is the following :

1- Two functions set_backtrace_limit : option int -> unit and get_backtrace_limit : unit -> option int. set_backtrace_limit changes the mode of the backtrace recorder.
  * If called with None, we have the current behavior
  * If called with Some l, the l last stack cells will be recorded, even if they are beyond the exception handler.

This allows the client code to easily implement the current_call_stack function itself, while allowing to record a full backtrace when an exception is raised for debugging purposes.

2- Exposing the backtrace internal type of the Printexc module, and the function convert_raw_backtrace : raw_backtrace -> backtrace option, in order to make the client code able to manipulate the backtrace programatically.

I am not really in favor of 2), because what is recorded in the backtrace is not well specified. It has subtle interaction with the "-g" flag, with whether the code is executed in bytecode mode or in native mode...
(0009131)
gasche (developer)
2013-04-16 11:54
edited on: 2013-04-16 12:00

So the way to implement current_call_stack would be to raise an exception and immediately catch it back, with a longer backtrace limit?

I think that's a bit ugly, and besides you have no way to tell in your interface that you want the trace upto the beginning of the application (sure you can put max_int there, but urgh.).

On the other hand, if you had a feature to just return the N last cells (or all upto the start if None is passed), you could also express the behavior of recording "beyond the exception handler" in user-code: just use this new feature in the "with" block, that by construction will see the same cell history as the "raise" it catched (I mean "get 10 more cells higher than my try" at raise point is equivalent to "get 10 cells" at the handler point).

So while I agree that "get farther than my raise" and "get me some cells" are equivalently expressive, I think the latter is a less convoluted primitive to start with.

(0009137)
frisch (developer)
2013-04-16 13:18

> On the other hand, if you had a feature to just return the N last cells (or all upto the start if None is passed)

I agree that this is indeed more user-friendly.
(0009140)
gasche (developer)
2013-04-16 14:36

Another point about the different interface: it allows to have the number of desired cell in the function call, instead of as a global mutable variable (which was necessary with the previous model, as "raise" cannot be parametrized).
(0009146)
jacques-henri.jourdan (manager)
2013-04-16 15:54

I understand that not using a global variable seems to be a good thing. However, for debugging, using a global variable is convenient, as it does not imply modifying the handler code.

Concerning the fact that there is no way to ask for the whole trace, I am in favor of that : the user has to know that putting a big value here could be a performance problem in case of a big backtrace.
(0009147)
frisch (developer)
2013-04-16 16:38

> However, for debugging, using a global variable is convenient, as it does not imply modifying the handler code.

I don't known what kind of debugging you have in mind, but for many uses cases of the proposed feature ("get a description of the top N slots of the current call stack"), there won't even be any exception at all in the client code.
(0009150)
jacques-henri.jourdan (manager)
2013-04-16 17:18

> there won't even be any exception at all in the client code.

This is not a real problem: you can create one as in the proposed patch. I agree that for the use cases of the proposed feature, the problem does not change.

However, I see an other use case : what I have in mind is that sometimes it is not enough to see the backtrace of an exception, and having some context on where it has been catched is usefull. Both approaches can help. It can be achieved with the global variable approach by modifying the global variable, and in the gasche's approach by modifying the handler's code. My point is that while debugging, modifying a global variable is maybe more convenient than modifying a handler : there might be several interesting handlers, you have to keep track where they are...

By the way, I am thinking of an other feature that could be interesting : why not adding a record_and_raise primitive of the Printexc module, that raises an exception, recording its backtrace even if the backtrace flag is deactivated. That could be interesting, for example, when a performance critical code raises an exception signaling an error, while this code uses exceptions for control (so that you do not want to activate backtraces globally). What do you think ?
(0009151)
frisch (developer)
2013-04-16 17:32

> This is not a real problem: you can create one as in the proposed patch.

I'm not against the global variables (in Printexc) if you think it can simplify debugging existing code, but the standard library should also expose a nice API (maybe not in Printexc) to retrieve with top of the stack without a hack (changing a global value, raising a dummy exception, filtering the result, restoring the global value).

> why not adding a record_and_raise primitive of the Printexc module, that raises an exception, recording its backtrace even if the backtrace flag is deactivated.

I don't think you can easily traverse intermediate exception handlers which re-raise the exception.

> for example, when a performance critical code raises an exception signaling an error

I don't think that in practice people will segregate their code into performance-critical modules (compiled without -g) and others (compiled with -g). Tweaking compilation flags on a file-by-file basis is not realistic for large code bases. The safe assumption is rather that code running production will be entirely compiled with -g and executed with stacktrace enabled, and then it makes more sense to provide a light "raise" for those cases where performance matters.
(0009169)
jacques-henri.jourdan (manager)
2013-04-21 17:01

I uploaded a patch to implement the feature. It brings the following advantages from the original one :

- It does not use an exception in the standard library code. That is, it does not erase the exception backtrace buffer.
- It is possible to give a maximum number of frames to record.

It is still needed to provide a way to programmatically inspect raw_bactrace inhabitants. On this question, I do not know what is the best thing to do : we can either make public the "backtrace" private type of the Printexc module, or provide an interface in the style of the alain's patch.
(0009170)
gasche (developer)
2013-04-21 19:07
edited on: 2013-04-21 19:13

Call me superstitious, but I would feel better without the name change and corresponding asm-level renames. What about just keeping ..._backtrace and adding ..._backtrace_custom?

Regarding programmatic inspection, I think that the "backtrace" type inside the Printexc module should remain abstract (to allow for easy change of implementation later), with a conversion function (that may be just the identity) to a public type, that may be either the current type or Alain's simpler interface. Of course, so far we aren't exposing any of the functions manipulating this middle-end type, so the distinction between this and a fully public type does not exist yet.

(0009178)
jacques-henri.jourdan (manager)
2013-04-22 12:14

New version : keeping the names of the ASM functions, added list_of_raw_backtrace.
(0009179)
frisch (developer)
2013-04-22 12:17

I believe the new [current_backtrace] function should rather be in Pervasives since it is not related to exceptions.
(0009182)
gasche (developer)
2013-04-22 13:49

I'm worried about the (<> "printexc.ml") hack coming from Alain's original patch. What is that supposed to do (it would deserve a comment in the code at least), and won't it blow in our face in some scenarios? (You really want to trust the backtrace-reporting functions.)

Other than that, I think the documentation for current_backtrace should mention the fact that it will collecth the [n] last frames, or less if less are available. I understand the idea of not favoring the "get me all traces" scenario (though frankly I'm still not fond of having to use a high constant in this case), but the documentation should at least make clear that it is possible and won't fail instead.

Alain > moving current_backtrace to Pervasives may require a bit of work, as it currently depends an the abstract type defined in Printexc, and we can't have Pervasives depending on Printexc. Before Jacques-Henri gently does the work, are you sure this is desirable, and do you have more precise guidelines on what you want (eg. should we have raw_backtrace defined in Pervasives and re-exported in Printexc? Only in pervasives? What about an eventual exposed 'backtrace' type? Should 'record_backtrace' also move?). My gut feeling is that it's a bit perfectionist.
(0009183)
frisch (developer)
2013-04-22 14:05

> What is that supposed to do

In the original form, it was supposed to remove the top stack slots corresponding to the fake exception handler used to trigger the stack capture. (IIRC, we needed to drop 1 or 2 slots depending on whether we were running bytecode vs. native).

> are you sure this is desirable,

Desirable, yes, important no.

> we have raw_backtrace defined in Pervasives and re-exported in Printexc

If the two features share the same type, then yes, raw_backtrace could go to Pervasives. I don't think it is then required to re-export it from Printexc.

Or maybe we could introduce a new

> What about an eventual exposed 'backtrace' type

Same as raw_backtrace, I guess.

> Should 'record_backtrace' also move

This is related to exceptions (even if not to printing them), so Printexc seems fine.

An alternative could be to introduce a new stdlib module "Backtrace", where everything related to stack traces (even for exceptions) would go. Printexc would only keep the exception-printing functions (plus the one which exist in 4.00, marked as deprecated, for compatibility). But this might be overkill for a small feature, hence my suggestion to move the new things to Pervasives. But you should definitely check with Xavier!
(0009184)
jacques-henri.jourdan (manager)
2013-04-22 14:21

> In the original form, it was supposed to remove the top stack slots corresponding to the fake exception handler used to trigger the stack capture. (IIRC, we needed to drop 1 or 2 slots depending on whether we were running bytecode vs. native).

I removed it. It is no longer necessary in this context.

Concerning Printexc vs. Pervasives vs. Backtrace, I would be definitely in favor of putting it in a new Backtrace module, and this is not very difficult. But this is a rather more philosophical question.

There is another perfectionists-kind problem : now, the printer of backtraces is shared between exception backtraces and this new kind of backtraces. This causes the first line of the backtrace to be :

"Raised by primitive operation at ...."

instead of

"Called at ..."

I do not know what kind of "primitive operation" can cause this message appear in the case of an exception (array accesses ? match failures ? static raises ? ... ?), but I wonder whether it is possible to "re-raise by primitive operation", in which case the message won't be displayed correctly anyway...
(0009185)
gasche (developer)
2013-04-22 15:23

Xavier remarks that a Backtrace module would risk conflicting with userspace modules of the same name (that exist in some dark place with probability 1).

Alain > none of Jacques-Henri's recent raw_backtrace work has been released yet; as far as I know, there was no change in Printexc since 3.12, so we have no deprecation to worry about: the raw_backtrace stuff can be moved, and even renamed, if deemed necessary.


After some more reflection, I find it weird that list_of_raw_backtrace may lose information if called on a trace recorded from an exception: as the re-raise information is ignored, it is only an injection if it is only used on traces produced by current_backtrace. I think we should either have two types (error_trace and call_trace) and two printing functions, or expose a representation that can fully handle both situations (eg. the one currently used in printexc). A conceptual test for any API design would be to check that the printing function(s) can be reimplemented from the user side, using only the exposed program-friendly representation.
(0009192)
frisch (developer)
2013-04-23 11:52

> none of Jacques-Henri's recent raw_backtrace work has been released yet; as far as I know, there was no change in Printexc since 3.12, so we have no deprecation to worry about

I know, but there was already some backtrace-oriented functions in Printexc in 4.00 (like record_backtrace).

> Xavier remarks that a Backtrace module would risk conflicting with userspace modules of the same name (that exist in some dark place with probability 1).

Ah, I guess we desperately need namespaces, then. We could use OCaml_backtrace for now and hope that this namespace system will allow it to be re-mapped to just "Backtrace" in client code :-) Or just put everything in Pervasives.

Note that technically, any addition to the stdlib could break existing code (with probability < 1, I agree).
(0009356)
jacques-henri.jourdan (manager)
2013-05-27 11:20

So, what is the conclusion of this discussion ?

Should I create a new module of the stdlib, called OCaml_backtrace, that contains all the functions related to backtraces, and depreciate existing ones ?

I would rather not adding this to Pervasives, as it is expected to be used only by some specialized parts of code.
(0009386)
frisch (developer)
2013-06-03 22:37

> So, what is the conclusion of this discussion ?

I don't know. After all, keeping everything in Printexc -- even if illogical -- might not be so bad.
(0009387)
gasche (developer)
2013-06-04 07:56

I agree that Printexc is a good-enough choice. One more reason is that the security-minded people (eg. LaFoSec study) will want to disable this function as well as the others in Printexc.
(0009473)
gasche (developer)
2013-06-13 10:17

Progress note: I'm currently reviewing the patch and changing/discussing it with Jacques-Henri.
(0009506)
gasche (developer)
2013-06-15 23:28

We came up with a new patch proposal that is eager to get external reviews.

(1) patch caml_stash_current_backtrace.diff implements the runtime
primitive nece ssary for returning the current backtrace. Besides
fixing a couple issues in Jacques-Henri previous patch, we now use
a different implementation strategy: instead of using a dynamically
reallocated buffer, we first traverse the stack to compute its trace
then allocate once with the right size. The behavior of
caml_stash_backtrace, computing the backtrace of an exception, is
unchanged from the trunk implementation.

(2) patch printexc_get_callstack.diff implements the user-visible
layer in printexc.ml and printexc.mli. I think there is room for
discussion for what interface is desirable (including poignant
variable naming choices). The previous patch exposed the following
function:

 val current_backtrace: int -> raw_backtrace
 (** [Printexc.current_backtrace n] returns the backtrace of the current
     stack. The backtrace is truncated to the highest [n] frames. *)

 val list_of_raw_backtrace: raw_backtrace -> (string * int * int * int) list
  (** [Printexc.list_of_raw_backtrace bt] converts a raw backtrace to a
      list of locations. Each item is a tuple (file name, line number,
      first character, last character). *)

the current patch instead exposes the following interface:

  val get_callstack: int -> raw_backtrace
  (** [Printexc.get_raw_callstack n] returns the backtrace corresponding
      to the current call stack -- as opposed to the backtrace of the
      exception raised last. The backtrace is truncated to the
      highest [n] frames. *)

  type loc_info =
    | Known_location of bool (* is_raise *)
                      * string (* filename *)
                      * int (* line number *)
                      * int (* start char *)
                      * int (* end char *)
    | Unknown_location of bool (*is_raise*)
  (** The [loc_info] type represents the location information present in
      each slot of the backtrace. *)
  
  val loc_info_of_raw_backtrace : raw_backtrace -> loc_info array option
  (** Extracts the location information of a backtrace value. The
      location of the most recent program point (the one corresponding
      to the [raise] or call to [current_backtrace]) is at index [0].
  *)
  
  val loc_info_to_string : int -> loc_info -> string
  (** Prints location information, in a way compatible with the
      behaviour of [backtrace_to_string]. The integer parameter is meant
      to be a position in a trace, as the most recent location
      (index [0]) is printed in a specific way.
  *)

The type "loc_info" is precisely the one used in the Printexc
implementation (but note that it would be possible to change the
implementation and still expose this interface to the user, with one
conversion step). This gives the end-user more flexibility, which
I suspected Alain in particular may appreciate, but I'm not sure
whether that's actually needed.

Note that for both interfaces, the user is forced to give a maximal
length when requesting the current call trace. Jacques-Henri suggests
that this interface makes the user fully aware of performance
implications -- as opposed, for example, to giving an [int option],
with [None] meaning "until the end".



In the case of both (1) and (2), I would also welcome feedback on the
naming choices. In the latter patch I picked [get_callstack] rather
than [current_backtrace] (the idea being to use [callstack] rather
than [backtrace] to differentiate whether or not it is tied to
an exception), but many choices are possible and I'm interested to
constructive arguments. For the runtime primitive the naming is less
important, but I may change it to something more in line with the
exposed name (eg. [caml_get_callstack] if we agree on [callstack]).
(0009507)
dbuenzli (reporter)
2013-06-16 00:16

Naming.

1) Why not simply `loc` instead of `loc_info` ?

2) If 1) is adopted I'd suggest `locs_of_raw_backtrace` (note the plural) and `loc_to_string`

3) Why the `_location` suffix in the variant ? It seems redundant and longwinded. In practice your likely to write something like `match loc with Known _ -> ... | Unknown _ -> ...`.
(0009508)
gasche (developer)
2013-06-16 11:33

That is sound advice, thanks.

However, the naming of loc_info was not of my choosing: contrarily to the rest of the exposed interface that is new, this type is what was used internally in the Printexc implementation. I'll have to check that it is ok to rename it (may make life harder for people with on-top-of-the-compiler patches, etc.), and maybe decide not to change it after all.

You have probably noticed that loc_info_of_raw_backtrace is partial (may fail when no debug information is available); since I submitted the patch here, I added an explicit mention of it in the .mli documentation.
(0009528)
frisch (developer)
2013-06-17 09:43
edited on: 2013-06-17 09:45

> type loc_info = ...

I'd propose something more self-documented and more extensible. What about:

type loc_info =
  {
    is_raise: bool;
    call_site: loc option;
  }

and loc =
  {
    filename: string;
    line: int;
    start_char: int;
    end_char: int;
  }

?

(0009538)
jacques-henri.jourdan (manager)
2013-06-17 16:23

gasche: Thank you for refactoring my patch ! It seems to be pretty good work.

dbuenzli: frisch: While I agree there is some room for improvement in the naming, it would be desirable to keep the same behavior for the caml_stash_current_backtrace external. In particular, changing memory layout for the location information is not a good idea, I think. dbuenzli ideas seem OK.
(0009566)
gasche (developer)
2013-06-19 16:58

The C primitive part is now integrated into OCaml (trunk and 4.01).

Alain, could you clarify whether:

1. you're fine with a linear translation pass between the loc_info structure to a record that is more comfortable to manipulate

2. or you favor performance over usability and would rather keep the primitive data structure

3. you absolutely need both and I have to consider writing C stubs that directly go from the raw stackframe to the nice record structure without the existing intermediary OCaml datastructure (this would still be a less memory-efficient layout than the current choice, but I don't think it matters in practice)

Of course I'd rather avoid option (3).
(0009568)
frisch (developer)
2013-06-19 18:59

(3) would seem the best solution to me. I'm undecided between (1) and (2). I can imagine cases where performance would matter a little bit (e.g. an instrumentation tool capturing the top of the call stack when entering any function, to extract a call graph), but I'm not sure it's worth exposing a less friendly interface.
(0009569)
jacques-henri.jourdan (manager)
2013-06-19 19:12

I would say that (3) has a major drawback: it will break the backward compatibility of the C stub, and this won't be spotted by any type checker...
(0009572)
frisch (developer)
2013-06-19 19:44

> it will break the backward compatibility of the C stub, and this won't be spotted by any type checker...

I don't think that the runtime representation is documented. Which code would be affected?
(0009581)
gasche (developer)
2013-06-19 22:29

I wasn't thinking of changing any of the existing C stubs. We love C stubs so much that there is always room for more.

- Issue History
Date Modified Username Field Change
2013-01-22 15:19 frisch New Issue
2013-01-22 15:19 frisch File Added: current_call_stack.diff
2013-01-22 15:21 frisch Description Updated View Revisions
2013-01-22 15:48 dbuenzli Note Added: 0008779
2013-01-22 15:48 frisch Note Added: 0008780
2013-01-22 15:54 shinwell Note Added: 0008781
2013-01-22 16:01 frisch Note Added: 0008782
2013-01-22 16:47 dim Note Added: 0008783
2013-01-23 09:42 frisch Severity minor => feature
2013-04-15 11:31 frisch Assigned To => jacques-henri.jourdan
2013-04-15 11:31 frisch Status new => assigned
2013-04-16 11:14 jacques-henri.jourdan Note Added: 0009129
2013-04-16 11:54 gasche Note Added: 0009131
2013-04-16 12:00 gasche Note Edited: 0009131 View Revisions
2013-04-16 13:18 frisch Note Added: 0009137
2013-04-16 14:36 gasche Note Added: 0009140
2013-04-16 15:54 jacques-henri.jourdan Note Added: 0009146
2013-04-16 16:38 frisch Note Added: 0009147
2013-04-16 17:18 jacques-henri.jourdan Note Added: 0009150
2013-04-16 17:24 gasche Relationship added related to 0005935
2013-04-16 17:32 frisch Note Added: 0009151
2013-04-21 16:54 jacques-henri.jourdan File Added: 0001-PR-0005899-exposing-a-way-to-inspect-current-call-st.patch
2013-04-21 17:01 jacques-henri.jourdan Note Added: 0009169
2013-04-21 19:07 gasche Note Added: 0009170
2013-04-21 19:13 gasche Note Edited: 0009170 View Revisions
2013-04-22 12:12 jacques-henri.jourdan File Deleted: 0001-PR-0005899-exposing-a-way-to-inspect-current-call-st.patch
2013-04-22 12:13 jacques-henri.jourdan File Added: 0001-PR-0005899-exposing-a-way-to-inspect-current-call-st.patch
2013-04-22 12:14 jacques-henri.jourdan Note Added: 0009178
2013-04-22 12:17 frisch Note Added: 0009179
2013-04-22 13:49 gasche Note Added: 0009182
2013-04-22 14:05 frisch Note Added: 0009183
2013-04-22 14:06 jacques-henri.jourdan File Deleted: 0001-PR-0005899-exposing-a-way-to-inspect-current-call-st.patch
2013-04-22 14:07 jacques-henri.jourdan File Added: 0001-PR-0005899-exposing-a-way-to-inspect-current-call-st.patch
2013-04-22 14:21 jacques-henri.jourdan Note Added: 0009184
2013-04-22 15:23 gasche Note Added: 0009185
2013-04-23 11:52 frisch Note Added: 0009192
2013-05-27 11:20 jacques-henri.jourdan Note Added: 0009356
2013-06-03 22:37 frisch Note Added: 0009386
2013-06-04 07:56 gasche Note Added: 0009387
2013-06-13 10:17 gasche Note Added: 0009473
2013-06-15 23:11 gasche File Added: caml_stash_current_backtrace.diff
2013-06-15 23:11 gasche File Added: printexc_get_callstack.diff
2013-06-15 23:28 gasche Note Added: 0009506
2013-06-16 00:16 dbuenzli Note Added: 0009507
2013-06-16 11:33 gasche Note Added: 0009508
2013-06-17 09:43 frisch Note Added: 0009528
2013-06-17 09:45 frisch Note Edited: 0009528 View Revisions
2013-06-17 16:23 jacques-henri.jourdan Note Added: 0009538
2013-06-19 16:58 gasche Note Added: 0009566
2013-06-19 18:59 frisch Note Added: 0009568
2013-06-19 19:12 jacques-henri.jourdan Note Added: 0009569
2013-06-19 19:44 frisch Note Added: 0009572
2013-06-19 22:29 gasche Note Added: 0009581


Copyright © 2000 - 2011 MantisBT Group
Powered by Mantis Bugtracker