Original bug ID: 6989
Reporter: @mjambon
Status: closed (set by @xavierleroy on 2017-02-16T14:18:33Z)
Resolution: fixed
Priority: normal
Severity: minor
Version: 4.02.3
Target version: 4.03.0+dev / +beta1
Fixed in version: 4.03.0+dev / +beta1
Category: otherlibs
Monitored by: @gasche
Bug description
The current implementation of the str library uses a hardcoded limit of 32 capturing groups.
In practice this limit can be reached when regexps are generated, possibly aggravated by the lack of non-capturing groups (feature request #3969). This caused a bug in mikmatch_str that was reported by a user: mjambon/mikmatch#9
Of course it would be ideal to support an unlimited number of groups, but I'm not sure it's worth the effort given that people with advanced needs will just use PCRE.
Steps to reproduce
File groups.ml:
#load "str.cma";;
let re =
  Str.regexp
    "\\(\\)\\(\\)\\(\\)\\(\\)\\(\\)\\(\\)\\(\\)\\(\\)\\(\\)\\(\\)\
     \\(\\)\\(\\)\\(\\)\\(\\)\\(\\)\\(\\)\\(\\)\\(\\)\\(\\)\\(\\)\
     \\(\\)\\(\\)\\(\\)\\(\\)\\(\\)\\(\\)\\(\\)\\(\\)\\(\\)\\(\\)\
     \\(x\\)\\(y\\)"
let s =
  let input = "xy" in
  if Str.string_match re input 0 then (
    if Str.matched_group 31 input = "x" then
      print_endline "x OK";
    if Str.matched_group 32 input = "y" then
      print_endline "y OK";
  )
  else
    assert false
This gives us:
$ ocaml groups.ml
x OK
Exception: Invalid_argument "Str.matched_group".
It should have printed:
x OK
y OK
Generated regexps can grow really fast in size, so increasing N from 32 to 100 may only benefit a few applications. If we wanted to go big, I'd suggest something like 10000, but it doesn't seem wise to statically allocate so much space.
We could play a trick that serves us well in other parts of the OCaml runtime system: use a statically-allocated array if the number of groups is small, and allocate the array dynamically otherwise. I'll look into this soon.
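To make the trick concrete, here is a minimal C sketch of the general pattern: a fixed-size buffer serves the common case, and the heap is used only when the group count exceeds it. This is not the actual str implementation. The names `SMALL_GROUPS`, `group_span`, `group_table` and the helper functions are all hypothetical, and the threshold of 32 is only borrowed from the limit discussed above.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical threshold: group counts up to this use the fixed buffer. */
#define SMALL_GROUPS 32

/* Start/end offsets of one captured group; -1 means "did not match". */
struct group_span {
  int start;
  int end;
};

/* Working storage for the groups of one match (illustrative layout). */
struct group_table {
  struct group_span *spans;              /* points at small[] or at heap memory */
  int count;                             /* number of usable entries in spans */
  struct group_span small[SMALL_GROUPS]; /* fixed-size storage for the common case */
};

/* Prepare storage for `count` groups.
   Returns 0 on success, -1 on a negative count or allocation failure. */
static int group_table_init(struct group_table *t, int count)
{
  assert(t != NULL);
  if (count < 0) return -1;
  if (count <= SMALL_GROUPS) {
    t->spans = t->small;                 /* no allocation needed */
  } else {
    t->spans = malloc((size_t)count * sizeof *t->spans);
    if (t->spans == NULL) return -1;
  }
  t->count = count;
  for (int i = 0; i < count; i++) {
    t->spans[i].start = -1;
    t->spans[i].end = -1;
  }
  return 0;
}

/* True when the table had to fall back to heap storage. */
static int group_table_on_heap(const struct group_table *t)
{
  return t->spans != t->small;
}

/* Release heap storage, if any; the fixed buffer needs no cleanup. */
static void group_table_free(struct group_table *t)
{
  if (t->spans != t->small) free(t->spans);
  t->spans = NULL;
  t->count = 0;
}
```

With this layout, a regexp with 8 groups never touches the allocator, while one with 10000 groups pays for a single `malloc` and a matching `free`. In this sketch the fixed buffer is embedded in the struct, so it is static only if the struct itself is declared with static storage.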