Date: 2000-12-07 (08:04)
From: Markus Mottl <mottl@m...>
Subject: features of PCRE-OCaml

It seems that many people have not yet heard of PCRE-OCaml (the
OCaml-interface to the PCRE-library) and have asked for more information
on its advantages compared to the Str-library (or to Perl).

Here is a list of features as taken from the README:

  * The PCRE-library by Philip Hazel has been under development for
    quite some time now and is fairly advanced and stable. It implements
    just about all of the convenient functionality of regular expressions
    as one can find them in PERL. The higher-level functions written
    in OCaml (split, replace) are likewise compatible with the
    corresponding PERL-functions (to the extent that OCaml allows). Most
    people find
    the syntax of PERL-style regular expressions more straightforward
    than the Emacs-style one used in the "Str"-module.

  * In contrast to PERL, the library creates DFAs (deterministic finite
    automata) instead of NFAs (nondeterministic finite automata). DFAs
    generally allow much faster pattern matching, because they never
    need to backtrack. Especially patterns with many alternations can
    see a great speedup.

  * It is reentrant and thus thread-safe. This is not the case with
    the "Str"-module of OCaml, which builds on the GNU "regex"-library.
    Using reentrant libraries also means more convenience for
    programmers: they do not have to reason about the states the
    library might be in.

  * The high-level functions for replacement and substitution, all of
    which are implemented in OCaml, are much faster than those of the
    "Str"-module. In fact, when compiled to native code, they even seem
    to be significantly faster than those of PERL (PERL is written in C).

    Somebody reported to me that he had tested OCaml with PCRE-OCaml
    against PERL and Python on several hundred MB of data that had to be
    matched/manipulated. If his claims are accurate, the OCaml-version
    (native code) was overall 15 times faster than PERL and 45 times
    faster than Python, which is probably also due to the high quality
    of the OCaml-compiler.

  * You can rely on the data returned being unique. In other words:
    if the result of a function is a string, you can safely use
    destructive updates on it without having to fear side effects.

  * The interface to the library makes use of labels and default
    arguments to give you a high degree of programming comfort.
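
To give an impression of the PERL-like split/replace functions and of
the labels and default arguments mentioned above, here is a minimal
sketch. It assumes the interface found in current PCRE-OCaml releases
(Pcre.regexp, Pcre.split, Pcre.replace and Pcre.pmatch with the optional
labeled arguments ~rex, ~pat, ~templ and ~flags); the sample strings are
made up for illustration, and label names may differ in older versions.

```ocaml
(* Requires the pcre-ocaml library (findlib package "pcre"). *)

(* Compile patterns once and reuse them. ~flags is optional: when it is
   omitted, the default compilation options are used. *)
let comma = Pcre.regexp "\\s*,\\s*"
let hello_ci = Pcre.regexp ~flags:[`CASELESS] "hello"

(* PERL-like split: fields separated by commas with optional whitespace. *)
let fields = Pcre.split ~rex:comma "a, b ,c"

(* PERL-like global substitution: $1 and $2 in the template refer to the
   captured groups. Here the pattern is passed as a string via ~pat. *)
let swapped =
  Pcre.replace ~pat:"(\\d+)-(\\d+)" ~templ:"$2-$1" "10-20 and 30-40"

(* Boolean match test with the precompiled, case-insensitive pattern. *)
let greeted = Pcre.pmatch ~rex:hello_ci "Well, HELLO there"

let () =
  List.iter print_endline fields;
  print_endline swapped;
  Printf.printf "%b\n" greeted
```

Passing a precompiled pattern with ~rex avoids recompiling it on every
call, while ~pat is convenient for one-off matches.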

I hope this answers most questions!

Best regards,
Markus Mottl

Markus Mottl, mottl@miss.wu-wien.ac.at, http://miss.wu-wien.ac.at/~mottl