Major GC crashes on fragmented heap with large number of block size and first_fit policy #7815

Closed
vicuna opened this issue Jun 27, 2018 · 4 comments

vicuna commented Jun 27, 2018

Original bug ID: 7815
Reporter: joris
Assigned to: @damiendoligez
Status: resolved (set by @xavierleroy on 2018-07-19T15:56:08Z)
Resolution: fixed
Priority: normal
Severity: crash
Platform: amd64
OS: linux
Version: 4.06.1
Target version: 4.07.1+dev/rc1
Fixed in version: 4.07.1+dev/rc1
Category: runtime system and C interface
Monitored by: @nojb @gasche @ygrek @jmeber

Bug description

Calling Gc.full_major (or simply Gc.major) after allocating thousands of blocks of different word sizes and freeing half of them, with the first_fit policy, makes the runtime crash with no meaningful trace.

Steps to reproduce

Install tcmalloc, then compile the attached file:

    opam install gperftools # install tcmalloc
    ocamlfind ocamlopt -g -linkpkg -package gperftools minimal.ml

and run it. On my machine it also crashes with jemalloc when rounds = 3 and nr_blocks = 20000:

    opam install jemalloc
    ocamlfind ocamlopt -g -linkpkg -package jemalloc_ctl minimal.ml

Additional information

Background:
This code is an unnatural example, obviously hand-crafted.

I came across this issue while trying to reproduce a bug in the minor GC and caml_fl_allocate which, under certain conditions, makes the minor GC run forever (or at least quadratically: half an hour or more, even on small heaps).

I was trying to fill the flp to see what would happen in that case. I don't know yet whether the two issues are related.

File attachments

vicuna commented Jun 27, 2018

Comment author: @stedolan

That's a nice reproduction case!

This bug is reproducible by running the bytecode interpreter under valgrind, which should make debugging easier. It seems to be independent of the choice of malloc, although it doesn't seem to actually segfault with the default glibc allocator.

In freelist.c, there's this loop:

    value buf [FLP_MAX];
    int j = 0;
    mlsize_t oldsz = sz;

    prev = flp[i];
    while (prev != flp[i+1]){
      cur = Next (prev);
      sz = Wosize_bp (cur);
      if (sz > prevsz){
        buf[j++] = prev;
        prevsz = sz;
        if (sz >= oldsz){
          CAMLassert (sz == oldsz);
          break;
        }
      }
      prev = cur;
    }

This example causes 'buf[j++] = prev' (line 345) to run more than FLP_MAX times, overflowing the buffer.

vicuna commented Jul 11, 2018

Comment author: @damiendoligez

This is embarrassingly easy to fix; see #1896.

@joris: do you want to be credited in the changelog with your real name?
@stedolan: would you like to review the fix?

vicuna commented Jul 11, 2018

Comment author: joris

Awesome!

As you wish; I don't know what the usual practice is. In any case, my name is Joris Giovannangeli.

vicuna commented Jul 19, 2018

Comment author: @xavierleroy

Commits 802ebbf (trunk) and 1bea41f (4.07 branch)

@vicuna vicuna closed this as completed Jul 19, 2018
@vicuna vicuna added the stdlib label Mar 14, 2019
@vicuna vicuna added this to the 4.07.1 milestone Mar 14, 2019
@vicuna vicuna added the bug label Mar 20, 2019