Leak in fscanf #6473

vicuna · 2014-06-27T14:31:32Z

Original bug ID: 6473
Reporter: @yallop
Assigned to: @pierreweis
Status: closed (set by @pierreweis on 2018-06-06T09:06:55Z)
Resolution: fixed
Priority: low
Severity: minor
Category: standard library
Monitored by: @gasche

Bug description

(Reported by Jean-Vincent Loddo on caml-list).

The 'memo' table in the Scanf module associates a lookahead buffer with each input channel:

https://github.com/ocaml/ocaml/blob/49d3f7b9f/stdlib/scanf.ml#L393

as explained by a comment in the Scanf code:

https://github.com/ocaml/ocaml/blob/49d3f7b9f/stdlib/scanf.ml#L268-L320

Entries are added to the table for each input channel used for scanning, but there's no mechanism for removing entries, so the table only increases in size, retaining references to previously-used channels.

vicuna · 2014-06-29T18:22:40Z

Comment author: @pierreweis

Thank you for the bug report.

This problem was due to a short sighted implementation of Scanning.close_in. In the next version, calling Scanning.close_in ic will release the data associated with ic in the memo table.

vicuna · 2015-02-06T19:22:17Z

Comment author: @damiendoligez

Removing entries when the buffers get closed is a good first step, but not a complete solution because the user can just drop all references to the buffer (which then gets garbage-collected) without closing it.

A complete solution will have to use weak pointers.

vicuna · 2015-12-02T23:19:19Z

Comment author: @alainfrisch

fscanf/kfscanf are now marked as deprecated. Was the leak only related to them? If so, one could probably mark this ticket as resolved for 4.03.

vicuna · 2015-12-03T06:42:20Z

Comment author: @gasche

No, bscanf also has the problem as soon as you open several scanning buffers per i/o channel (which is exactly what is done by code converting fscanf calls following the deprecation warning advice; sharing scanning buffers would require extensive reworking of the user code).

vicuna · 2015-12-16T17:59:28Z

Comment author: @pierreweis

The memory leak is only related to fscanf/kfscanf and deprecating them removes the issue and closes the ticket.

To be clear, bscanf has no memory leaks and poses no semantics problems.

For sure, opening several scanning buffers for a given Pervasives.in_channel value is a bad idea and could create confusion. However, this bad idea only creates confusion in what you expect to read: what you get is correct.

The same problem already exists when reading with low level primitives: if you open more than one Pervasives.in_channel reading from the same file and then randomly read from any of those channels, you may have a hard time to understand what happens, but semantics is still ok.

vicuna · 2015-12-17T14:09:23Z

Comment author: @alainfrisch

@gasche, @weis: can you please try to reach a consensus on whether this is resolved?

vicuna · 2018-06-06T09:06:55Z

Comment author: @pierreweis

Deprecation of faulty functions solved the problem in version 4.03.

vicuna closed this as completed Jun 6, 2018

vicuna added the stdlib label Mar 14, 2019

vicuna assigned pierreweis Mar 14, 2019

vicuna added the bug label Mar 20, 2019

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Leak in fscanf #6473

Leak in fscanf #6473

vicuna commented Jun 27, 2014

vicuna commented Jun 29, 2014

vicuna commented Feb 6, 2015

vicuna commented Dec 2, 2015

vicuna commented Dec 3, 2015

vicuna commented Dec 16, 2015

vicuna commented Dec 17, 2015

vicuna commented Jun 6, 2018

Leak in fscanf #6473

Leak in fscanf #6473

Comments

vicuna commented Jun 27, 2014

Bug description

vicuna commented Jun 29, 2014

vicuna commented Feb 6, 2015

vicuna commented Dec 2, 2015

vicuna commented Dec 3, 2015

vicuna commented Dec 16, 2015

vicuna commented Dec 17, 2015

vicuna commented Jun 6, 2018