Mantis Bug Tracker

View Issue Details Jump to Notes ] Issue History ] Print ]
IDProjectCategoryView StatusDate SubmittedLast Update
0004747OCamlOCaml standard librarypublic2009-03-16 10:352012-09-17 13:51
Reportershinwell 
Assigned Togasche 
PriorityhighSeverityminorReproducibilityalways
StatusassignedResolutionopen 
PlatformOSOS Version
Product Version3.11.0 
Target Version4.00.2+devFixed in Version 
Summary0004747: Hashtbl.resize is not tail recursive
DescriptionThe function in Hashtbl.resize that copies the linked list attached to one bucket isn't tail recursive, which means that if you have the (admittedly undesirable) situation of a very long chain of buckets it can cause a stack overflow.

Degenerate example:

let ht = Hashtbl.create 100
let () = for x = 1 to 2500000 do Hashtbl.add ht 0 () done
TagsNo tags attached.
Attached Filespatch file icon hashtbl.patch [^] (1,130 bytes) 2012-06-20 14:31 [Show Content]

- Relationships

-  Notes
(0007590)
varobert (reporter)
2012-06-20 14:34

I encountered this bug while using Coq extraction feature, because of a misuse of Hashtbl.add instead of Hashtbl.replace that has been reported upstream.

I attached a proposed patch.

Beware that other functions build results in a non-tail-rec fashion over buckets:
remove_bucket
find_in_bucket
replace_bucket

Therefore, this patch may only delay stack overflows some more, as long as it can shrink buckets enough. A more thorough patch would probably incur both performance and efficiency loss.

This patch will slow down the resizing process too, since it builds the reversed bucket first.
(0007730)
gasche (developer)
2012-07-12 14:28

Valentin Robert has given me a full patch. However, due to the non-critical status¹ and potential for performance regressions, there is no need to do that before the release. I'm changing the "target version" to reflect this.

¹: If the bucket becomes so long that the overflows becomes a problem, it means that there is a problem with the way hashtables are used. If it comes from hashing conflicts, you will get terrible performances due to the linked-list bucket implementation (linear cost for finding a given element), and a difference implementation must be preferred. If it comes from an over-use of the multiple binding semantics ("adding" a lot of different values on the same key), you should probably switch to a different data structure. One might argue that having a Stack Overflow is a good signal for a design defect that could otherwise go unnoticed. I'm still in favor of fixing this bug, but it can wait after the release.

- Issue History
Date Modified Username Field Change
2009-03-16 10:35 shinwell New Issue
2009-04-29 15:41 doligez Status new => acknowledged
2012-06-20 14:31 varobert File Added: hashtbl.patch
2012-06-20 14:34 varobert Note Added: 0007590
2012-07-04 16:55 doligez Priority normal => high
2012-07-04 16:55 doligez Status acknowledged => confirmed
2012-07-04 17:18 doligez Target Version => 4.00.0+dev
2012-07-04 17:33 gasche Assigned To => gasche
2012-07-04 17:33 gasche Status confirmed => assigned
2012-07-12 14:28 gasche Note Added: 0007730
2012-07-12 14:28 gasche Target Version 4.00.0+dev => 4.01.0+dev
2012-07-27 12:49 frisch Category OCaml general => OCaml standard library
2012-07-31 13:36 doligez Target Version 4.01.0+dev => 4.00.1+dev
2012-09-17 13:51 doligez Target Version 4.00.1+dev => 4.00.2+dev


Copyright © 2000 - 2011 MantisBT Group
Powered by Mantis Bugtracker