Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Crash when pattern-matching lazy values modifies the scrutinee #5992

Closed
vicuna opened this issue Apr 19, 2013 · 14 comments
Closed

Crash when pattern-matching lazy values modifies the scrutinee #5992

vicuna opened this issue Apr 19, 2013 · 14 comments
Assignees

Comments

@vicuna
Copy link

vicuna commented Apr 19, 2013

Original bug ID: 5992
Reporter: @yallop
Assigned to: @maranget
Status: closed (set by @xavierleroy on 2015-12-11T18:18:47Z)
Resolution: fixed
Priority: normal
Severity: major
Category: back end (clambda to assembly)
Related to: #7241
Monitored by: @gasche

Bug description

$ cat lazypatterns.ml 
type ('a, 'b) eq = Refl : ('a, 'a) eq

let cast (type a) (type b) (Refl : (a, b) eq) (x : a) = (x : b)

type ('a, 'b) either = L of 'a | R of 'b

type e = ((bool, bool) eq, (int, string) eq) either

let f : unit Lazy.t * unit Lazy.t * bool ref * e -> string = function
  | lazy (), _, {contents=false}, L x  -> "ok"
  | _, lazy (), {contents=true}, L x -> "ok"
  | _, _, _, R refl -> cast refl 0 ^ "not ok"

let s = ref false
let set_true = lazy (s := true)
let set_false = lazy (s := false)

let _ = f (set_true, set_false, s, L Refl)
$ ocaml lazypatterns.ml 
Segmentation fault (core dumped)

File attachments

@vicuna
Copy link
Author

vicuna commented Apr 19, 2013

Comment author: meyer

I can't reproduce crash on 4.00.1 32 bits Linux. Could you provide a gdb backtrace?

thanks

@vicuna
Copy link
Author

vicuna commented Apr 19, 2013

Comment author: @yallop

It appears to crash with ocamlc, but raise Match_failure with ocamlopt.

 Program received signal SIGSEGV, Segmentation fault.
 0x000000000040f56a in caml_ml_string_length ()
 (gdb) where
 #0  0x000000000040f56a in caml_ml_string_length ()
 #1  0x000000000041c4fe in caml_interprete ()
 #2  0x000000000041dcb1 in caml_main ()
 #3  0x000000000041b1fc in main ()

@vicuna
Copy link
Author

vicuna commented Apr 19, 2013

Comment author: @garrigue

Interesting.
But I don't think this is related to GADTs.

let f = function
 | lazy (), _, {contents=None} -> "None", 0
 | _, lazy (), {contents=Some x} -> "Some", x

let s = ref None
let set_true = lazy (s := Some 1)
let set_false = lazy (s := None)

let _ = f (set_true, set_false, s)

Lazy patterns are not supposed to have side-effects, but what should we do when they have?

@vicuna
Copy link
Author

vicuna commented Apr 19, 2013

Comment author: @lefessan

It is a known bug, lazy evaluation in pattern matching introduces the possibility to have side-effects that modify the values being pattern matched (it was actually discussed last Monday during the "bug meeting" that the local core team had).

We should probably change the way the pattern-matching is compiled in the presence of a combination of lazy values and mutable values, to have a less efficient but safer execution.

@vicuna
Copy link
Author

vicuna commented Apr 19, 2013

Comment author: @alainfrisch

When-guards can already change parts of the scrutinee. This is less powerful than lazy patterns, since this cannot be done within a single branch, but I guess there are some provisions to avoid bad surprises in that case (or not?). Could we use the same technique for lazy patterns?

@vicuna
Copy link
Author

vicuna commented Apr 21, 2013

Comment author: @lpw25

When-guards can already change parts of the scrutinee. This is less powerful than lazy patterns, since this cannot be done within a single branch, but I guess there are some provisions to avoid bad surprises in that case (or not?).

This is done by always assuming that patterns containing when-guards are non-exhaustive, so that they raise a match failure exception instead of seg fault. As in the following example:

  $ cat bad-when.ml 
  let g guard = function
    | (), _, {contents=false}, None -> "false"
    | _, _, _, _ when guard () -> "guard"
    | _, (), {contents=true}, None -> "true"
    | _, _, _, Some s -> s ^ "not ok"

  let s = ref true
  let _ = print_string (g (fun () -> s := false;false) ((), (), s, None))

  $ ocaml bad-when.ml 
  Exception: Match_failure ("bad-when.ml", 1, 14).

The same thing could probably be done for lazy patterns.

@vicuna
Copy link
Author

vicuna commented Apr 21, 2013

Comment author: @lpw25

The same thing could probably be done for lazy patterns.

I've attached a patch that does this. It certainly fixes Jeremy's example:

./ocaml ~/lazypatterns.ml
Exception: Match_failure ("/home/leo/lazypatterns.ml", 9, 61).

@vicuna
Copy link
Author

vicuna commented Apr 21, 2013

Comment author: @lefessan

I have two concerns with this solution:
(1) what is the semantics ? Currently, I cannot explain the semantics of "when" guards in the presence of side effects. For example:

## let x = ref 0;;
let _ =
    match x with
      { contents = 1 } -> 1
    | { contents = 0 } when (x := 1; false) -> 2
    | { contents = 1 } -> 2
    | _ -> 3
Warning 11: this match case is unused.
- : int = 3

For me, this is already a bug : the semantics should be that clauses are tested one line after the other one, and so, line "| { contents = 1 } -> 2" should be triggered, which is not the case. Using the same solution means increasing the problem instead of solving it...

(2) I think that, since pattern-matching is compiled on the typed tree, we should use the information that the matched value is immutable to enable optimal compilation only when it preserves the semantics, i.e. in the absence of either mutable values, OR in the absence of when/lazy constructs.

@vicuna
Copy link
Author

vicuna commented Apr 21, 2013

Comment author: @lpw25

(1) what is the semantics ?

A relaxed semantics: the ordering between effects and matches is unspecified and any program relying on it is essentially incorrect. Of course a more well-specified semantics could be implemented, but only at the expense of efficiency.

Using the same solution means increasing the problem instead of solving it...

These are really separate problems. One is a soundness bug due to using "incorrect" exhaustiveness information to guide an optimisation. The other is about whether matching in the presence of side-effects should have a well-specified semantics.

(2) I think that, since pattern-matching is compiled on the typed tree, we should use the information that the matched value is immutable to enable optimal compilation only when it preserves the semantics, i.e. in the absence of either mutable values, OR in the absence of when/lazy constructs.

I certainly agree that the patterns only need to be marked as non-exhaustive if they access mutable values.

@vicuna
Copy link
Author

vicuna commented Apr 22, 2013

Comment author: @yallop

I agree with Leo that there are two distinct issues here: the optimization based on the mistaken assumption of exhaustiveness, and the question of the order of evaluation for pattern matching. It's clear that we need to be more careful about assuming exhaustiveness when the match includes both lazy patterns and mutable data.

I agree with Fabrice that the current behaviour of both lazy patterns and guards is unsatisfactory with respect to ordering. People writing OCaml programs will generally assume that the semantics of matching is to consider the pattern cases top-to-bottom, left-to-right. Although the manual is careful to avoid actually saying anything about the order of matching, several OCaml books say that matching is sequential:

"When a match expression is evaluated, the expression expression to be matched is first evaluated, and its value is compared with the patterns in order."
Introduction to Objective Caml
Jason Hickey

"The expression expr is matched sequentially to the various patterns p1, ..., pn."
Developing Applications With OCaml
Emmanuel Chailloux, Pascal Manoury, Bruno Pagano

"Cases are explored in a top-down fashion: when a branch fails, the analysis resumes with the next possible branch. However, the analysis stops as soon as the branch is successful; then, its right hand side is evaluated and returned as result."
Using, Understanding, and Unraveling The OCaml Language From Practice to Theory and vice versa
Didier Rémy

@vicuna
Copy link
Author

vicuna commented Apr 22, 2013

Comment author: @alainfrisch

In addition to changing the compilation scheme in presence of both mutable fields in the pattern and the possibility of side-effects (when-guard or lazy patterns), it would be good to report this as a warning.

@vicuna
Copy link
Author

vicuna commented Apr 25, 2013

Comment author: @maranget

The bug is now fixed, in the sense that the proposed example
does not segfault any longer.

Luc

@vicuna
Copy link
Author

vicuna commented Apr 25, 2013

Comment author: @xavierleroy

@lefessan:

what is the semantics ? Currently, I cannot explain the semantics of "when" guards in the presence of side effects.

It is entirely unspecified when those side effects occur w.r.t. the matching operation itself.

For me, this is already a bug : the semantics should be that clauses are tested one line after the other one

I disagree. Pattern-matching is such an essential operation of the language that we want full liberty to compile in whatever testing order is efficient. And code such as your example should never be written in the first place.

@yallop:

Although the manual is careful to avoid actually saying anything about the order of matching, several OCaml books say that matching is sequential:

They are all wrong.

@vicuna
Copy link
Author

vicuna commented Apr 25, 2013

Comment author: @yallop

Thanks, Luc!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

2 participants