Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Missed Type-error leads to a segfault upon record access #6505

Closed
vicuna opened this issue Jul 31, 2014 · 13 comments
Closed

Missed Type-error leads to a segfault upon record access #6505

vicuna opened this issue Jul 31, 2014 · 13 comments
Assignees

Comments

@vicuna
Copy link

vicuna commented Jul 31, 2014

Original bug ID: 6505
Reporter: choeger
Assigned to: @garrigue
Status: closed (set by @xavierleroy on 2017-09-24T15:33:04Z)
Resolution: fixed
Priority: low
Severity: crash
Version: 4.01.0
Fixed in version: 4.04.0 +dev / +beta1 / +beta2
Category: typing
Related to: #5343
Parent of: #6496
Monitored by: @yallop

Bug description

This is a rather large and complex codebase, so forgive me that this bug report is not (yet) in a usable form. I hope for the active help from OCaml devs here to nail down the exact source and a minimal triggering example.

Anyway. In the code found here:

https://github.com/AMSUN-Berlin/ocsimrun/tree/typecheck_bug

I encountered something that seems to be a type-error that went unnoticed by OCaml 4.01
In sim.ml I put a record (sim_state) in my extensible state monad (e.g. here: https://github.com/AMSUN-Berlin/ocsimrun/blob/typecheck_bug/simulation/sim.ml#L193 ), and in my test-case, I get an option value (e.g. here: https://github.com/AMSUN-Berlin/ocsimrun/blob/typecheck_bug/tests/simTests.ml#L50 ). Obviously, I made a simple mistake in sim.ml, but the typechecker won't take notice of it.

In the consequence, this test program crashes with a segmentation fault.

Steps to reproduce

The branch is already somewhat stripped down. To compile it, you will need the dependencies (all installable via opam):

opam install pa_monad_custom, batteries, pa_comprehension, ounit

Then you can checkout the demonstration code:

git clone https://github.com/AMSUN-Berlin/ocsimrun.git

git checkout typecheck_bug

To build:

oasis setup && ./configure && make

And run the test:

./simTests.native

Additional information

As mentioned, I am aware that the code is quite large. Any help/hints to strip it down further is appreciated. Unfortunately, I don't have the time to dig it out all alone, but I felt such a bug is too important to be left unnoticed.

File attachments

@vicuna
Copy link
Author

vicuna commented Jul 31, 2014

Comment author: @damiendoligez

I followed your steps to reproduce, but "make" doesn't seem to build "simTest.native" or any other executable.

@vicuna
Copy link
Author

vicuna commented Jul 31, 2014

Comment author: @yallop

Here's a simpler example:

$ cat bang.ml
module M :
sig
type 'o is_an_object = < .. > as 'o
and 'o abs constraint 'o = 'o is_an_object
val abs : 'o is_an_object -> 'o abs
val unabs : 'o abs -> 'o
end =
struct
type 'o is_an_object = < .. > as 'o
and 'o abs = 'o constraint 'o = 'o is_an_object
let abs v = v
let unabs v = v
end

class c =
object
val _y = (None : 'o M.abs option )
method get_y = _y
method set_y y = {<_y = y>}
end

let c' = (new c)#set_y (Some (M.abs (object end)))
let _ = match c'#get_y with
Some y -> (M.unabs y)#bang
| None -> ()

$ ocaml bang.ml
Segmentation fault

@vicuna
Copy link
Author

vicuna commented Jul 31, 2014

Comment author: choeger

@dolligez: Sorry, you have to run
./configure --enable-tests

@vicuna
Copy link
Author

vicuna commented Aug 1, 2014

Comment author: @garrigue

In yallop's example, yhe problem here is apparent in the signature inferred for M:

module M :
sig
type 'o is_an_object = 'o constraint 'o = < .. >
and 'a abs constraint 'a = 'a is_an_object
val abs : ('a is_an_object as 'a) is_an_object -> 'a abs
val unabs : ('a is_an_object as 'a) abs -> 'a
end

The type ('a is_an_object as 'a) does not make sense.
It ends up hiding the fact it contains a free type variable, which allows in turn the class to leak it.

@vicuna
Copy link
Author

vicuna commented Aug 2, 2014

Comment author: @garrigue

Fixed in branch 4.02, at revision 15038.

Thank you Jeremy for providing a compact reproduction case.
Since opam's batteries don't compile with 4.02, I have only tried with bang.ml, which now gives:

Error: The definition of abs contains a cycle:
'a is_an_object as 'a

Actually, I'm rather surprised you could trigger this bug just by sheer chance, because it is a conjunction of a lot of things.
Basically, what happened is that, in order to allow polymorphic uses of abbreviations in mutually recursive type definitions, checking proper use of abbreviations is delayed until a second phase, one the types are defined. However, it only checked that each abbreviation is contractive (i.e. doesn't expand to itself).
In your case, what happens is that, a combination of constraints during type definition ends up creating a non-contractive recursive type inside the definition of abs. This was not checked, at it was assumed that checking for each abbreviation to be contractive would be sufficient.
Why is it bad? The first reason is it can make the typechecker loop. However, this didn't occur here.
What happened is that this non-contractive recursive type ended up hiding a type variable, very much as in #5343.
This hidden type variable could then be used to change the type of anything, as in another already known problem with constraints / GADTs.
The fix is to check for cycles in all type definitions. This is made efficient by assuming that only freshly defined types may initiate such a cycle.

@vicuna
Copy link
Author

vicuna commented Aug 3, 2014

Comment author: choeger

Does that mean, you cannot define a constraint now, where you use the a type variable on the rhs of object methods? I am relying on the ability to express class type constraints, where methods deliver a new object of the class that fulfills the constraint.

i.e. an iterator would have the type:

type 'a iterator = <next : 'a option ; ..> as 'a

@vicuna
Copy link
Author

vicuna commented Aug 4, 2014

Comment author: @garrigue

No, the algorithm just checked that you are not creating "black holes", I.e. cycles that do not contain any type constructor. A cycle going through an object type is fine.

@vicuna
Copy link
Author

vicuna commented Jun 13, 2016

Comment author: @garrigue

Looking at the output of the test case, the behavior with or without -principal is different, so I reopen this report. (Not dangerous, but disturbing)

@vicuna
Copy link
Author

vicuna commented Jun 19, 2016

Comment author: @stedolan

The problem still seems to be there. The recursion in 'abs' is inessential, and here's a slight variant without it:

$ ocaml -version
The OCaml toplevel, version 4.03.0
$ cat bang.ml
module M :
sig
type 'o is_an_object = < .. > as 'o
and ('k,'o) abs constraint 'k = 'o is_an_object
val abs : 'o is_an_object -> ('o,'o) abs
val unabs : ('o, 'o) abs -> 'o
end =
struct
type 'o is_an_object = < .. > as 'o
and ('k,'o) abs = 'o constraint 'k = 'o is_an_object
let abs v = v
let unabs v = v
end

class c =
object
val _y = (None : ('o,'o) M.abs option )
method get_y = _y
method set_y y = {<_y = y>}
end

let c' = (new c)#set_y (Some (M.abs (object end)))
let _ = match c'#get_y with
Some y -> (M.unabs y)#bang
| None -> ()
$ ocaml bang.ml
Segmentation fault

@vicuna
Copy link
Author

vicuna commented Jun 19, 2016

Comment author: @stedolan

Hmm, this seems to be a regression. My version does not segfault in 4.02.1, while in 4.03 even this shorter program crashes:

type 'o is_an_object = < .. > as 'o
and ('k,'l) abs = 'l constraint 'k = 'l is_an_object
let y : ('o, 'o) abs = object end
let _ = y#bang

Very oddly, in 4.02.1 this program does not crash when run with "ocaml bang.ml", but does crash when copy-pasted into the toplevel.

@vicuna
Copy link
Author

vicuna commented Jun 21, 2016

Comment author: @lpw25

Same issue for polymorphic variants:

type 'v is_a_variant = [> ] as 'v
and ('k,'l) abs = 'l constraint 'k = 'l is_a_variant
let x : ('a, 'a) abs = Foo 6 let () = print_endline (match x with Bar s -> s)

@vicuna
Copy link
Author

vicuna commented Jun 22, 2016

Comment author: @garrigue

New reports fixed in trunk by commit 3c14f2e.
The issue was orthogonal: unfy_var was missing one check.

Still working on the problem with -principal.

@vicuna
Copy link
Author

vicuna commented Jul 5, 2016

Comment author: @garrigue

Fixed in trunk by commit 848c20f.

In order to properly fix this in the -principal case to, had to make the handling of recursive types more restrictive in mutually recursive definitions.
They are now seen as non-contractive, meaning that the following definition is rejected:
type 'a u = < x : 'a>
and 'a t = 'a t u;;
while the non-recursive version is fine:
type 'a u = < x : 'a>
type 'a t = 'a t u;;

Error messages are not so good, but I see no easy way to avoid it.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

2 participants