Pattern matching with mutable and lazy patterns is unsound #7241

vicuna · 2016-04-25T17:13:49Z

Original bug ID: 7241
Reporter: @stedolan
Assigned to: @maranget
Status: assigned (set by @mshinwell on 2017-03-09T12:39:53Z)
Resolution: open
Priority: normal
Severity: crash
Category: typing
Related to: #5992
Monitored by: junsli braibant @gasche @hcarty

Bug description

Optimised pattern matching skips checking conditions that seem redundant. However, since OCaml supports pattern-matching mutable fields, and code execution during matching via "lazy" patterns, the truth of some conditions can vary during matching.

This can cause seemingly-impossible cases to be taken, if forcing lazy values causes mutations that confuse the optimised matching logic. Due to the presence of GADTs, taking an impossible case is a soundness issue.

For example, this program segfaults:

type (_, _) eq = Refl : ('a, 'a) eq
type (_, _) deq = Deq : ('a, 'x) eq option * ('x, 'b) eq option -> ('a, 'b) deq

let deq1 = Deq (Some Refl, None)
let deq2 = Deq (None, Some Refl)

type ('a, 'b) t = { 
  a : bool; 
  mutable b : ('a, 'b) deq;
  mutable c : int Lazy.t
}

let g (type a) (type b) : (a, b) t -> (a, b) eq = function
| { a = true; b = Deq (None, _) }
| { a = true; b = Deq (Some _, _); c = lazy 1 }
| { a = false }
| { b = Deq (_, None) } -> 
   assert false
| { b = Deq (Some Refl, Some Refl) } ->
   Refl

let bad =
  let r = { a = true; b = deq1; c = lazy 1 } in
  r.c <- lazy (r.b <- deq2; 2);
  g r

let castint (type a) (Refl : (int, a) eq) (x : int) : a = x
let _ = print_string (castint bad 42)

This program uses mutation to change a field from "deq1" to "deq2" during matching, making it seem like the impossible "Deq (Some Refl, Some Refl)". (The behaviour is very dependent on the exact sequence of cases in "g", and seemingly-equivalent programs will often give different behaviour).

The text was updated successfully, but these errors were encountered:

vicuna · 2016-04-25T17:25:47Z

Comment author: @stedolan

Playing with it a bit more, the example can be simplified to the following, which does not use laziness:

type app = App : ('x -> unit) option * 'x -> app

let app1 = App (Some print_string, "hello")
let app2 = App (None, 42)

type t = { 
  a : bool; 
  mutable b : app
}

let f = function
| { a = false } -> assert false
| { a = true; b = App (None, _) } -> assert false 
| { a = true; b = App (Some _, _) } as r 
    when (r.b <- app2; false) -> assert false
| { b = App (Some f, x) } ->
   f x

let _ = f { a = true; b = app1 }

The issue is not the type-equality behaviour of GADTs, but the existential quantification. In this example, mutation causes the optimised pattern-matching to confuse the values bound under two different existential quantifiers. Either lazy patterns (as in the original example) or when guards (above) are enough to cause mutation during matching.

vicuna · 2016-04-26T09:48:04Z

Comment author: bvaugon

Remarks, the "when" is also broken. For example, the following code crash in the similar way:

type (_, _) eq = Refl : ('a, 'a) eq
type (_, _) deq = Deq : ('a, 'x) eq option * ('x, 'b) eq option -> ('a, 'b) deq
    
let deq1 = Deq (Some Refl, None)
let deq2 = Deq (None, Some Refl)
    
type ('a, 'b) t = { 
  a : bool; 
  mutable b : ('a, 'b) deq;
}
    
let r = { a = true; b = deq1 }

let g (type a) (type b) : (a, b) t -> (a, b) eq = function
  | { a = true; b = Deq (Some _, _) } when (r.b <- deq2; false) ->
    assert false
  | { a = true; b = Deq (None, _) }
  | { a = false }
  | { b = Deq (_, None) } ->
    assert false
  | { b = Deq (Some Refl, Some Refl) } ->
    Refl

let bad = g r
    
let castint (type a) (Refl : (int, a) eq) (x : int) : a = x
let _ = print_string (castint bad 42)

vicuna · 2016-07-13T07:53:02Z

Comment author: @alainfrisch

I did not try to produce an example, and this would be more tricky, but even without lazy or when guards, it is impossible to guarantee that arbitrary code won't be executed during pattern matching as soon as patterns need to allocate (e.g. to read from a float from unboxed records or arrays). These allocations can trigger the GC and thus finalizer which could modify mutable parts of the matched value. It might be possible to delay these allocations (at least until the "when" guard), but this would probably require some refactoring of the PM compiler. (Of course, the pattern can "traverse" the unboxed floats.)

vicuna · 2016-07-13T11:19:35Z

Comment author: @alainfrisch

Setting Target version to "later", since there is no clear resolution plan.

vicuna · 2016-07-19T05:40:50Z

Comment author: @garrigue

I don't even think that one needs existentials to do that.
Here is an example in core ocaml; it segfaults even after fixing Matching.check_partial (which currently only downgrade if the same pattern contains both mutable and lazy)

type u = {a: bool; mutable b: int option}

let f x =
  match x with
    {a=false} -> 0
  | {b=None} -> 1
  | _ when (x.b <- None; false) -> 2
  | {a=true; b=Some y} -> y

let _ = f {a=true; b=Some 5}

The (insufficient) fix in matching.ml is

let check_partial is_mutable is_lazy pat_act_list = function
  | Partial -> Partial
  | Total ->
      if
        pat_act_list = [] ||  (* allow empty case list *)
        List.exists (fun (pats, _) -> is_mutable pats) pat_act_list &&
        List.exists (fun (pats, lam) -> is_guarded lam || is_lazy pats)
          pat_act_list
      then Partial
      else Total

vicuna · 2016-07-19T08:20:36Z

Comment author: @alainfrisch

Frightening. Jacques: do you agree with my guess that this could even occur without when guards (with patterns that allocates and GC alarms/finalizers)?

vicuna · 2017-02-16T10:11:09Z

Comment author: @xavierleroy

Currenty worked on at #717

github-actions · 2020-06-01T04:19:38Z

This issue has been open one year with no activity. Consequently, it is being marked with the "stale" label. What this means is that the issue will be automatically closed in 30 days unless more comments are added or the "stale" label is removed. Comments that provide new information on the issue are especially welcome: is it still reproducible? did it appear in other contexts? how critical is it? etc.

yallop · 2020-06-01T05:19:35Z

The program in the original report still causes a segmentation fault, which seems reason enough to leave this open.

There's also been recent activity on this issue elsewhere. Under #717, @trefis wrote in October 2019:

That work is still ongoing, and we definitely hope to fix #7241 as a result of it.

bluddy · 2020-06-01T05:35:00Z

I wonder how other languages with pattern matching and mutation deal with this issue (F#, Scala...). Do they all suffer from the same unsoundness?
Also, is there any possible way to address this in a multicore world with mutable state shared between threads?

gasche · 2020-06-01T07:02:02Z

Whether one is unsound or not is a question of assumptions and optimizations made during matching. Very naive implementations would typically be correct, and slightly more elaborate implementations may very well be buggy just as ours.

Handling multicore is not a difficulty. What we need is to reason about the points during pattern-matching where side-effects may be performed. Currently this happens during evaluation of guards, float unboxing and lazy forcing; Multicore (or really any implementation) could introduce additional "poll points" during pattern-matching, but I don't think this is planned for now (there is no risk of undue latency caused by pattern-matching that would require polling, as it is always bounded in time and very fast).

github-actions · 2021-11-01T04:42:22Z

This issue has been open one year with no activity. Consequently, it is being marked with the "stale" label. What this means is that the issue will be automatically closed in 30 days unless more comments are added or the "stale" label is removed. Comments that provide new information on the issue are especially welcome: is it still reproducible? did it appear in other contexts? how critical is it? etc.

github-actions · 2022-11-04T04:29:54Z

This issue has been open one year with no activity. Consequently, it is being marked with the "stale" label. What this means is that the issue will be automatically closed in 30 days unless more comments are added or the "stale" label is removed. Comments that provide new information on the issue are especially welcome: is it still reproducible? did it appear in other contexts? how critical is it? etc.

gadmm · 2023-01-26T16:11:18Z

Regarding finalizers, etc., it might be helpful to consider a version of caml_garbage_collection that does not execute asynchronous callbacks. The machinery to do this already exists. This does not look like the biggest problem here.

gasche · 2023-06-16T08:53:37Z

I have a proposal to "fix this issue for good", but it requires more work and some research. I am writing a summary of the proposal below for future reference.

Background on pattern-matching contexts

The current pattern-matching compilation algorithm maintains some "context" information to optimize its generated code. During pattern-matching compilation we match on a row of inputs; we maintain a context that encodes statically-known information about this row of inputs. For example if you write

match x1, x2 with
| Some y, foo...
| bar...

then the compilation of the foo... will happen in a context where it is known that the first input x1 is of the shape Some _. Furthermore, if foo... is exhaustive (if it covers all possible values for x2), then the compilation of bar... will happeen in a context where we know that the first input x1 is of the shape None.

The bug

The problem we are discussing is that this "context" information, that is used to generate optimized matching code, can become invalid if the pattern-matching inputs are mutated in the middle of the matching logic.

Consider the simplest example of unsoundness due to Jacques Garrigue above:

type u = {a: bool; mutable b: int option}

let f x =
  match x with
  | {a = false; b = _} -> 0
  | {a = _;     b = None} -> 1
  | {a = _;     b = _} when (x.b <- None; false) -> 2
  | {a = true;  b = Some y} -> y

let _ = f {a=true; b=Some 5}

One way to think of what is going on here is as follows:

After the first clause, the compiler is in a context where a = true is known (because a = false failed, and the sub-patterns on b are exhaustive).
After the second clause, the compiler is in a context where a = true; b = Some _ is known (because b = None failed).
On the fourth clause, the compiler knows a = true; b = Some _, so it does not need to check a or the constructor of b again and directly emits a read to the first field of b.

This is in fact invalid because, on the third line, b was changed to None which does not have a first field.

Generated code

This "optimization based on static contexts" logic is apparent in the code generated for this example, obtained with -drawlambda -dno-locations, then some manual simplification for readability:

  (let
    (f/393 =
       (function x : int
         (let (xa =a (field_int 0 x))
           (catch
             (catch
               (if xa (exit 7) 0)
              with (7)
               (let (xb/1 =o (field_mut 1 x))
                 (catch (if xb/1 (exit 8) 1)
                  with (8)
                   (if (seq (setfield_ptr 1 x 0) 0) 2 (exit 6)))))
            with (6)
             (let
               (xb/2 =o (field_mut 1 x)
                y =a (field_imm 0 xb/2))
               y)))))

The first-clause check for a = false corresponds to (if xa (exit 7) 0): if xa is false return 0, otherwise move to the with (7) handler which continues matching at the toplevel of the input -- in a static optimization context where we know that xa is false.

The second-clause check for b = None corresponds to if xb/1 (exit 8) 1. exit 8 goes to the third clause, which mutates the second field of x and then fails back to an exit 6, that continues matching at the toplevel of the input for the fourth clause.

In the compilation of the fourth clause (the with (6) handler) we get xb again by accessing the second field of the input, and access its own first argument y.

The backtracking nature of pattern-matching compilation

Notice that the code contains two distinct accesses to the second field of x, which I renamed xb/1 and xb/2. The fourth clause is compiled in a lexical scope that does not contain the binding of xb/1, so it reads it again as xb/2.

In this particular example, this seems wasteful: we could have placed the "fourth clause logic", the with (6) handler, within the let xb/1 binding, given that the only place jumping to exit 6 is itself under the let xb/1 binding. Instead, exit 6 performs a sort of "backtracking" where we go back to the toplevel control flow (we are not under any switch on any of the inputs) and have to match on the whole input again.

The presence of such redundant reads would be avoidable in this example but is unavoidable in general: this "backtracking" aspect of the pattern-matching compiler is essential to avoid code-explosion issues. Backtracking allows distinct paths through the matching control flow to merge by jumping to the same exit, avoiding exponential code explosion. These code-explosion issues are not theoretical, they show up very quickly in practice when using or-patterns (or range patterns, etc.).

Note that the cost of this "backtracking" is greatly reduced by those static optimization contexts: we "backtrack" to the toplevel and have to match the whole input again, but in fact we know that xa is true and we don't have to match it, and we know that xb is Some _ and we don't have to branch on its value again. When there are several exit jumping to the same with-handler (so we actually share code to avoid duplication), the static optimization contexts of all the exits are merged/intersected with each other, to get the best static context for the with-handler.

Conversely, if we were not "backtracking" during compilation, we would not need these static optimization contexts.

Existing fix proposal: removing mutable subtrees from the context

The obvious fix for this issue is to remove from the static optimization context any knowledge that may be invalidated by mutation, that is, any sub-value whose path from the root (the scrutinee) contains a mutable read.

This is the approach proposed by Luc Maranget in #717

At the time where this fix was proposed, people were worried that systematically erasing mutable information from contexts would degrade pattern-matching performance too much -- there are indeed example in the wild where this pessimized code generation. Luc refined his approach to try to reason about when a mutation may occur (in a lazy pattern or on a when-guard), and only pessimize compilation when those features are used.

One problem with this approach is that this approximation of which pattern-matching may suffer from side-effects is incorrect. Reading a floating-point value may also generate side-effects (the allocation can switch control to another piece of code), and with OCaml 5.x we have to consider the case of arbitrary races from other domain -- so we would be back to pessimizing all the time.

Alternative proposal (sketch): changing the compilation strategy

My alternative proposal is to change the compilation strategy so that the arguments that we are matching over (the input row) coincides with the leaves of the static context. In other words, we carry extra inputs in intermediate steps of the pattern-matching compilation that corresponds to our static knowledge.

Consider our running example:

let f x =
  match x with
  | {a = false; b = _} -> 0
  | {a = _;     b = None} -> 1
  | {a = _;     b = _} when (x.b <- None; false) -> 2
  | {a = true;  b = Some y} -> y

After matching the second clause and b = None, we statically know that b is of the form Some y. My proposal is to add this variable y as a new argument to the input row at this point. This new variable must be present exactly at all the places of the pattern-matching compilation that have b = Some _ in their optimization context. (In particular, when backtracking through an exit, we may need to add y as an exit argument if it is part of the with-handler static context.)

Then when we reach the fourth row, we still have Some _ in our static optimization context. But instead of saving a branch on the value of b and reading its first field directly, we also optimize the read away by using the y variable that is now part of our input row.

On our example this would correspond to the following generated code:

  (let
    (f/393 =
       (function x : int
         (let (xa =a (field_int 0 x))
           (catch
             (catch
               (if xa (exit 7) 0)
              with (7)
               (let (xb =o (field_mut 1 x))
                 (catch
                   (if xb
                    (let y =a (field_imm 0 xb)
                     (exit 8 y))
                    1)
                  with (8 y)
                   (if (seq (setfield_ptr 1 x 0) 0) 2 (exit 6 y)))))
            with (6 y) y)))))

In this modified version, the branch in the second clause that adds b = Some y in the static context actually reads this y; it becomes part of the "input row" at this point. Then all parts of the generated code that still have b = Some _ in their static context (namely, the third and fourth clause) take this y as an extra input. When backtracking, it is passed around as an exit argument.

In summary: whenever we decide to gain some static knowledge of some part of the matched value, we also take a "snapshot" of this part that we carry around as long as we make this static assuption. "Using' this static knowledge is done by using this snapshot directly, instead of re-doing some reads of the inputs that could have been invalidated by a state update. This gives a sensible semantics, that is clearly safe in presence of mutation, without pessimizing compilation by adding new branches for safety.

(Note that this approach does not guarantee the absence of redundant reads: some control flow path may merge with other control flow paths with less static context information, and their shared code at this point may perform reads that are redundant for inputs coming from the more-informed exit point. Those reads are safe in presence of mutation.)

This is only a sketch, and I am not completely clear of the details. Do we want to systematically keep all the "leaves" of the static context as extra inputs? Maybe this is too many arguments in practice. Do we want to only keep the results of mutable reads, as we know that we can reconstruct immutable reads safely? Or maybe we could have some analysis of which of those leaves are going to be useful later, and only keep those. (Say we could have Node(left, _) in the static context, where only the left children of a binary tree has been named and added as an extra input.)

I'm not familiar enough with this part of the pattern-matching compilation strategy to tell which choices in this design space are feasible / not too invasive -- if any. My impression is that doing this nicely is more work than the code-pessimizing fix, but also more satisfying. This is, basically, research.

(Note: another approach to implement this would be to treat it as a post-pattern-matching optimization pass that removes redundant reads after some data-flow analysis. This is probably easier to implement as it can be kept independent from the pattern-matching machinery, but it is more heavy-weight -- we would need to compare code fragments for syntactic equality, etc. More importantly, it is less clear that the result is correct -- fixes the bug -- because there is no clear relation between the post-processing analysis and the potentially-unsafe decisions relying on static optimization contexts.)

Current plan

Both of these possible roads to a fix are delayed/blocked by the fact that no one knows this part of the pattern-matching compiler codebase well enough to implement or review them with confidence.

The first step for me would be to read and document the current implementation of static optimization contexts, to ensure that at least Luc, myself and other welcome volunteers understand the current codebase well enough to assess proposed changes to the implementation.

gasche · 2023-07-28T14:48:57Z

Today I worked on this again with @trefis, and we:

now understand why the current code does not prevent the bug
have yet another idea to solve this problem that may be simpler than my proposal above.

Why the bug can happen

One aspect of Luc's bugfix that has left us bewildered for seven years now is that there is already code in the compiler to handle this problem: the set_args_erase_mutable function is designed to "erase", in some situations, context information coming from mutable fields. Why is this protection mechanism not sufficient to prevent the bug, and why does Luc's bugfix add a different protection mechanism in addition to the first?

Contexts and jump summaries

When turning a pattern matrix into code, the pattern-matching compiler takes as input some static information on the match arguments called the "context". Along with the generated code, it returns a "jump summary", which gives a more precise version of this context for each raise to an exit point in the generated code -- that context has been refined by further splits occurring within the generated code.

Combining jump summaries

The pattern-matching compiler generates code that is formed of "switch" nodes that look somewhat like this:

match the head constructor of x with
| K1(_,_) -> <generated code here>
| K2(_,_,_) -> <generated code here>
| ...

We start from the current context, and refine it in each branch: the branch context remembers the constructor that was matched, and has a different arity based on the number of arguments. We then produce the generated code of the branch under this context, and receive a jump summary with the same branch-specific arity. At this point we must "combine" the jump summaries of all branches, re-applying the matched constructor to obtain a common arity corresponding to the input arguments.

When re-applying the head constructor, the pattern-matching compiler is careful to call the set_args_erase_mutable function, which erases sub-argument information corresponding to mutable fields.

Why the bug then?

type u = {a: bool; mutable b: int option}

let f x =
  match x with
  | {a = false; b = _} -> 0
  | {a = _;     b = None} -> 1
  | {a = _;     b = _} when (x.b <- None; false) -> 2
  | {a = true;  b = Some y} -> y

let _ = f {a=true; b=Some 5}

In our example, the problem comes from using static context on the b column when compiling the catch handler that corresponds to the very last clause. In this context we see a Some _ pattern for the b column, telling us (wrongly) that the value is a Some constructor.

At this point, the set_args_erase_mutable call that would save has not yet occurred, because we are still under/inside the record constructor. The dangerous information is at the "head" of a current column, no constructor has been re-applied on top of it.

gasche · 2023-07-28T14:50:44Z

The problem with the proposal above

The idea of my proposal above is to generate different code on exits depending on the context of its exit handler -- if the exit handler makes assumptions on certain mutable reads, include them as extra data in the exit.

The difficulty we encountered is that the context of the exit handler is only fully known once all exit points have been generated. We would have to backpatch the generated exit code at this point (eww!), or change the data structure / control flow of this part of the pattern-matching compiler to go through an intermediate representation where exit points are not yet fully known.

One possible way to do this is to change the return type of functions as follows:

(* before *)
.... -> Jumps.t * lambda

(* after *)
.... -> Jumps.t * (Final_jumps_info.t -> lambda)

but this sounds like a fairly invasive change.

An alternative proposal

Remember our example:

type u = {a: bool; mutable b: int option}

let f x =
  match x with
  | {a = false; b = _} -> 0
  | {a = _;     b = None} -> 1
  | {a = _;     b = _} when (x.b <- None; false) -> 2
  | {a = true;  b = Some y} -> y

let _ = f {a=true; b=Some 5}

And its currently-generated code:

  (let
    (f/393 =
       (function x : int
         (let (xa =a (field_int 0 x))
           (catch
             (catch
               (if xa (exit 7) 0)
              with (7)
               (let (xb/1 =o (field_mut 1 x))
                 (catch (if xb/1 (exit 8) 1)
                  with (8)
                   (if (seq (setfield_ptr 1 x 0) 0) 2 (exit 6)))))
            with (6)
             (let
               (xb/2 =o (field_mut 1 x)
                y =a (field_imm 0 xb/2))
               y)))))

My proposal was to amend the (exit 6) call to add xb/1 as additional argument, to "remember" the mutable load. This is useful because the catch handler is not in scope of xb/1 anymore.

However, we could have generated the code differently such that xb/1 is in scope:

  (let
    (f/393 =
       (function x : int
         (let (xa =a (field_int 0 x)
               xb =o (field_mut 1 x))
           (catch
             (catch
               (if xa (exit 7) 0)
              with (7)
                 (catch (if xb (exit 8) 1)
                  with (8)
                   (if (seq (setfield_ptr 1 x 0) 0) 2 (exit 6))))
            with (6)
             (let
               (y =a (field_imm 0 xb))
               y)))))

To do this the pattern-matching compiler should genrate the binding for the record fields at the point where it decomposes the record in the matrix, instead of generating the binding for each field at the point where the corresponding column is split.

This goes counter to the current implementation which is careful to delay those let-bindings as late as possible in the generation process. We are not sure how to implement it well and what would be the downsides (in general we could deduplicate some code but also add some unnecessary memory reads in some branches).

Is it enough to do it for all arguments, and let the "Let Alias" logic remove unused bindings?
Should we un-delay only the mutable reads to reduce the code pessimization impact?
Is there some Simplif-time cleanup (pushing aliases under conditionals, etc.) that we need to do, or that would recover the current code structure in most cases?

We believe that this idea suffices to fix the problem in general. Consider the right columns of the "context" at the exit handler. Each column corresponds to static knowledge on the arguments (scrutinees) of the pattern matrix, represented as a pattern, for example Some([_]):

For the sub-patterns below the head constructor, the information could be unstable if they correspond to a mutable field. But the current code is already careful to erase this information when pushing constructors on the right, this is the set_args_erase_mutable business.
For the constructor at the head of the pattern, the information could also be invalidated by mutation. Re-computing the field projection to access the argument may return a value that does not match this pattern -- this is the source of the bug. The change we propose is to guarantee that any argument that corresponds to a mutable field has a variable name bound to it -- by naming them eagerly -- which avoids the need to perform another mutable read on the field.

gasche · 2023-09-05T13:41:31Z

After working on this issue some more, I found a different example of unsoundness that is not related to context information, and requires a different sort of fix.

let r = ref (Some 3)

let _ =
  match Some r with
  | Some { contents = None } -> 0
  | _ when (r := None; false) -> 1
  | Some { contents = Some n } -> n
  | None -> 3
(* Segmentation fault (core dumped) *)

In this example, the context information when matching the line Some n does not claim that the input must have constructor Some at this point. This is because the constructor{ contents = _ } was pushed from the left to the right of the context when splitting on the line _ when ..., and set_args_erase_mutable erased any information from the context on its sub-value.

When matching the line Some n, the pattern-matching compiler does not have wrong context information, but it has wrong totality information: the type-checker has inferred that the whole match is total, and in particular the last clause must be total. The submatrix at this point contains a single row that matches the constructor Some _, plus a claim that it is total, so no check on the constructor is generated. One can check that passing the flag -safer-matching (which pessimizes totality information) fixes the segfault in this case -- we get a match failure as we could expect.

The [check_partial] heuristics are a coarse-grained approach degrade matching totality information in certain cases -- when clauses contain both reads of mutable fields and either guards or forcing of lazy patterns. It is not quite correct (it is not enough to prevent ocaml#7241), and is not sufficient conceptually for OCaml Multicore. In a Multicore world, other domains may race to update values concurrently, so we cannot assume that only a fixed set of matching constructions may result in side-effects. This heuristic is subsumed by the recent changes to: - make context information accurate by binding mutable fields eagerly - make totality information accurate by tracking the mutability of the current position and can now be retired.

The infamous matching bug ocaml#7241 corresponds to two distinct issues that arise when side-effects mutate the value being matched on: 1. Incorrect contexts: context information from the pre-mutation switches is retained. 2. Incorrect partiality information: type-checker-provided partiality information becomes incorrect. This commit fixes the first problem: incorrect contexts.

The [check_partial] heuristics are a coarse-grained approach degrade matching totality information in certain cases -- when clauses contain both reads of mutable fields and either guards or forcing of lazy patterns. It is not quite correct (it is not enough to prevent ocaml#7241), and is not sufficient conceptually for OCaml Multicore. In a Multicore world, other domains may race to update values concurrently, so we cannot assume that only a fixed set of matching constructions may result in side-effects. This heuristic is subsumed by the recent changes to: - make context information accurate by binding mutable fields eagerly - make totality information accurate by tracking the mutability of the current position and can now be retired.

gasche · 2023-09-16T13:03:36Z

Since last week I have a working fix for this issue, but there are several pieces which I am sending as separate PRs for easier review. (Each piece is faster to review, can be reviewed more carefully, and can be reviewed by different people.)

Here is a map of the current group of merged PRs, submitted PRs and upcoming branches.
(Last update: April 24th 2024)

Merged

More lambda/matching documentation #12442 : @trefis and myself did some refactoring and documentation of the pattern-matching compiler code (reviewed by @Octachron and @lthls)
Matching: improve the readability of the debug output #12532 and Matching: more debug #12553 improved the debug output of the pattern-matching compiler, which was very useful in understanding the bug and assessing potential fixes (reviewed by @trefis and @Octachron)
Test cases for 7241 #12548 is a set of tests for this issue, which help reviewing further PRs by looking at their test diffs (review by @maranget)
Matching: document and refactor mk_failaction_pos #12534 is a refactoring and documentation of mk_failaction_pos, a critical yet obscure function of the pattern-matching compiler
non-exhaustive fix for the Pattern Matching Bug #12555, which fixes the "incorrect context information" half of the bug (an alternative proposal was non-exhaustive fix for the Pattern Matching Bug, take two #12556, which corresponds to our initial work with @trefis)
matching: refactor compile_match* to clarify argument binding #13074, matching: minor simplification in do_for_multiple_match #13082, Matching: refactor to clarify the argument-passing structure #13084: minor code clarifications and refactoring
The Pattern-Matching Bug: special handling of final exits #13076, the final-exit branch, reworks the handling of the "final exit", the exit that jumps to Match_failure if nothing else matches

Submitted

(none right now, I am preparing fix-totality-2 below for submission)

Upcoming branches

matching-bug-fix-totality-2, builds on The Pattern-Matching Bug: special handling of final exits #13076 and Matching: refactor to clarify the argument-passing structure #13084 to implement a satisfying fix to the "incorrect totality information" second half of the bug. At this point we are done!
matching-bug-remove-check_partial, a branch on top of fix-totality-2 , simplifies the code by removing the previous heuristics on side-effects (the check_partial function and its dependencies) which are subsumed by the new code

ncik-roberts · 2023-12-08T14:21:34Z

@goldfirere mentioned to me that you were looking for a reviewer for some pieces of this. Let me know how I can be helpful here — what pieces are you seeking review on?

gasche · 2023-12-08T21:16:48Z

Thanks! #12534 is waiting for a review, and also #12555. The two PRs are independent.

(There was a decision to make between #12555 and #12556 which are competing approaches to solve the same problem, but I decided to go for #12555 and closed #12556. Feel free to have a look at both if you are interested in the nitty gritty details.)

ncik-roberts · 2024-01-29T19:25:14Z

(Apologies for the silence. I haven't forgotten about this — I hope to get started sometime in the next two weeks.)

gasche · 2024-01-29T21:31:23Z

Great news, thanks! (I was thinking of pinging again, but turns out this was also on your mind.)

The infamous matching bug ocaml#7241 corresponds to two distinct issues that arise when side-effects mutate the value being matched on: 1. Incorrect contexts: context information from the pre-mutation switches is retained. 2. Incorrect partiality information: type-checker-provided partiality information becomes incorrect. This commit fixes the first problem: incorrect contexts.

The [check_partial] heuristics are a coarse-grained approach degrade matching totality information in certain cases -- when clauses contain both reads of mutable fields and either guards or forcing of lazy patterns. It is not quite correct (it is not enough to prevent ocaml#7241), and is not sufficient conceptually for OCaml Multicore. In a Multicore world, other domains may race to update values concurrently, so we cannot assume that only a fixed set of matching constructions may result in side-effects. This heuristic is subsumed by the recent changes to: - make context information accurate by binding mutable fields eagerly - make totality information accurate by tracking the mutability of the current position and can now be retired.

vicuna added the typing label Mar 14, 2019

vicuna assigned maranget Mar 14, 2019

vicuna mentioned this issue Apr 25, 2016

Crash when pattern-matching lazy values modifies the scrutinee #5992

Closed

vicuna added the bug label Mar 20, 2019

trefis mentioned this issue Oct 14, 2019

Fix MPR7241: Pattern matching with mutable and lazy patterns is unsound #717

Open

github-actions bot added the Stale label Jun 1, 2020

gasche removed the Stale label Jun 1, 2020

github-actions bot added the Stale label Nov 1, 2021

Octachron removed the Stale label Nov 2, 2021

bluddy mentioned this issue Jan 24, 2022

Safety of some non-atomic loads generated by the compiler #10944

Closed

github-actions bot added the Stale label Nov 4, 2022

gasche removed the Stale label Nov 4, 2022

oxalica mentioned this issue Mar 15, 2023

Tracking issue: deref patterns rust-lang/rust#87121

Open

6 tasks

This was referenced Jul 28, 2023

parmatch: bugfix? in set_args_erase_mutable #12441

Merged

More lambda/matching documentation #12442

Merged

gasche added a commit to gasche/ocaml that referenced this issue Sep 3, 2023

document the fix to ocaml#7241

ad7da61

gasche mentioned this issue Sep 5, 2023

Matching: improve the readability of the debug output #12532

Merged

gasche mentioned this issue Sep 6, 2023

Matching: document and refactor mk_failaction_pos #12534

Merged

gasche added a commit to gasche/ocaml that referenced this issue Sep 8, 2023

document the fix to ocaml#7241

1f20bbf

This was referenced Sep 11, 2023

Test cases for 7241 #12548

Merged

Matching: more debug #12553

Merged

gasche mentioned this issue Sep 13, 2023

non-exhaustive fix for the Pattern Matching Bug #12555

Merged

gasche mentioned this issue Sep 14, 2023

non-exhaustive fix for the Pattern Matching Bug, take two #12556

Closed

gasche mentioned this issue Apr 6, 2024

The Pattern-Matching Bug: special handling of final exits #13076

Merged

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Pattern matching with mutable and lazy patterns is unsound #7241

Pattern matching with mutable and lazy patterns is unsound #7241

vicuna commented Apr 25, 2016

vicuna commented Apr 25, 2016

vicuna commented Apr 26, 2016 •

edited by damiendoligez

vicuna commented Jul 13, 2016

vicuna commented Jul 13, 2016

vicuna commented Jul 19, 2016 •

edited by trefis

vicuna commented Jul 19, 2016

vicuna commented Feb 16, 2017

github-actions bot commented Jun 1, 2020

yallop commented Jun 1, 2020

bluddy commented Jun 1, 2020

gasche commented Jun 1, 2020 •

edited

github-actions bot commented Nov 1, 2021

github-actions bot commented Nov 4, 2022

gadmm commented Jan 26, 2023

gasche commented Jun 16, 2023

gasche commented Jul 28, 2023

gasche commented Jul 28, 2023

gasche commented Sep 5, 2023

gasche commented Sep 16, 2023 •

edited

ncik-roberts commented Dec 8, 2023

gasche commented Dec 8, 2023 •

edited

ncik-roberts commented Jan 29, 2024

gasche commented Jan 29, 2024

Pattern matching with mutable and lazy patterns is unsound #7241

Pattern matching with mutable and lazy patterns is unsound #7241

Comments

vicuna commented Apr 25, 2016

Bug description

vicuna commented Apr 25, 2016

vicuna commented Apr 26, 2016 • edited by damiendoligez

vicuna commented Jul 13, 2016

vicuna commented Jul 13, 2016

vicuna commented Jul 19, 2016 • edited by trefis

vicuna commented Jul 19, 2016

vicuna commented Feb 16, 2017

github-actions bot commented Jun 1, 2020

yallop commented Jun 1, 2020

bluddy commented Jun 1, 2020

gasche commented Jun 1, 2020 • edited

github-actions bot commented Nov 1, 2021

github-actions bot commented Nov 4, 2022

gadmm commented Jan 26, 2023

gasche commented Jun 16, 2023

Background on pattern-matching contexts

The bug

Generated code

The backtracking nature of pattern-matching compilation

Existing fix proposal: removing mutable subtrees from the context

Alternative proposal (sketch): changing the compilation strategy

Current plan

gasche commented Jul 28, 2023

Why the bug can happen

Contexts and jump summaries

Combining jump summaries

Why the bug then?

gasche commented Jul 28, 2023

The problem with the proposal above

An alternative proposal

gasche commented Sep 5, 2023

gasche commented Sep 16, 2023 • edited

Merged

Submitted

Upcoming branches

ncik-roberts commented Dec 8, 2023

gasche commented Dec 8, 2023 • edited

ncik-roberts commented Jan 29, 2024

gasche commented Jan 29, 2024

vicuna commented Apr 26, 2016 •

edited by damiendoligez

vicuna commented Jul 19, 2016 •

edited by trefis

gasche commented Jun 1, 2020 •

edited

gasche commented Sep 16, 2023 •

edited

gasche commented Dec 8, 2023 •

edited