Support for extended regular expressions #5259

Closed
vicuna opened this issue Apr 28, 2011 · 1 comment

Comments

@vicuna

vicuna commented Apr 28, 2011

Original bug ID: 5259
Reporter: gerd
Status: acknowledged (set by @damiendoligez on 2011-04-29T09:22:49Z)
Resolution: open
Priority: normal
Severity: feature
Platform: all
OS: all
OS Version: all
Version: 3.12.0
Category: otherlibs
Monitored by: @ygrek

Bug description

The idea is to have a new function Str.eregexp that parses the "extended" syntax. The main difference looks tiny, but it has a large impact on the readability of code: essentially, ( ) | no longer need a backslash. The reference standard here is POSIX extended regular expressions, which also defines the {bound} construct.
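To illustrate the difference, here is a minimal sketch of a translation from the extended operators to the syntax Str accepts today. The name ere_to_str_syntax is hypothetical and not part of Str. The sketch only swaps the escaping of ( ) | and copies bracket expressions verbatim; it does not handle {bound} or [[:name:]] classes, which Str does not support.

```ocaml
(* Hypothetical helper, not part of Str: rewrite the grouping and
   alternation operators of an extended-syntax pattern into Str's syntax.
   Assumption: only ( ) | differ; {bound} and [[:name:]] are not handled. *)
let ere_to_str_syntax (pat : string) : string =
  let n = String.length pat in
  let buf = Buffer.create (n * 2) in
  (* Copy the bracket expression starting at index i (pat.[i] = '[')
     verbatim and return the index just past it. *)
  let copy_bracket i =
    let j = ref (i + 1) in
    if !j < n && pat.[!j] = '^' then incr j;
    if !j < n && pat.[!j] = ']' then incr j; (* a leading ']' is literal *)
    while !j < n && pat.[!j] <> ']' do incr j done;
    let stop = if !j < n then !j + 1 else n in
    Buffer.add_string buf (String.sub pat i (stop - i));
    stop
  in
  let rec go i =
    if i < n then
      match pat.[i] with
      | '(' | ')' | '|' as c ->
          (* Operator in extended syntax: Str wants it backslashed. *)
          Buffer.add_char buf '\\'; Buffer.add_char buf c; go (i + 1)
      | '\\' when i + 1 < n ->
          (match pat.[i + 1] with
           | '(' | ')' | '|' as c ->
               (* Escaped literal in extended syntax: bare literal in Str. *)
               Buffer.add_char buf c
           | c ->
               (* Any other escape is passed through unchanged. *)
               Buffer.add_char buf '\\'; Buffer.add_char buf c);
          go (i + 2)
      | '[' -> go (copy_bracket i)
      | c -> Buffer.add_char buf c; go (i + 1)
  in
  go 0;
  Buffer.contents buf
```

With the str library linked, the result could be passed to the existing compiler, e.g. Str.regexp (ere_to_str_syntax "(foo|bar)+"), which is the same as writing Str.regexp "\\(foo\\|bar\\)+" by hand.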

The extended syntax is nowadays used in most other programming languages, so using OCaml feels like a step backwards.

When introducing this, it might also be a good idea to define or reserve a generic way of extending the syntax later, e.g. for additional assertions and non-capturing groups, as suggested by other bug reporters. Adding more backslash sequences does not look very elegant. The following seem more readable:

  • Character classes like [[:name:]]
  • Group with modifiers: (?switches:regexp), e.g.
    (?i:regexp) for a case-insensitive non-capturing group
  • Named option or assertion (*NAME)

Additional information

One would also need Str.equote.
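As a rough sketch of what such a function might look like, the following escapes every character that is special in POSIX extended syntax. The name equote mirrors the proposal, but this definition is an assumption and not an existing Str function; the set of escaped characters is taken from the POSIX list of ERE special characters.

```ocaml
(* Hypothetical counterpart of Str.quote for the extended syntax:
   returns a pattern string that matches exactly s.
   Assumption: the special characters are . [ \ ( ) * + ? { | ^ $ *)
let equote (s : string) : string =
  let buf = Buffer.create (2 * String.length s) in
  String.iter
    (fun c ->
      (match c with
       | '.' | '[' | '\\' | '(' | ')' | '*' | '+' | '?' | '{' | '|'
       | '^' | '$' ->
           (* Prefix each special character with a backslash. *)
           Buffer.add_char buf '\\'
       | _ -> ());
      Buffer.add_char buf c)
    s;
  Buffer.contents buf
```

For example, equote "a.b(c)" yields the pattern text a\.b\(c\), whereas the existing Str.quote leaves the parentheses alone because they are literals in Str's syntax.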

A good reference comparing regexp flavors by syntax: http://www.regular-expressions.info/refflavors.html

@nojb
Contributor

nojb commented Mar 15, 2019

The str library has many shortcomings (global state, clunky API, etc.) and it is highly unlikely that it will be developed further (in fact, it will be split off from the compiler distribution when that is feasible).
