<?xml version="1.0" encoding="ISO-8859-1"?>

<!DOCTYPE message PUBLIC
  "-//MLarc//DTD MLarc output files//EN"
  "../../mlarc.dtd"[
  <!ATTLIST message
    listname CDATA #REQUIRED
    title CDATA #REQUIRED
  >
]>

  <?xml-stylesheet href="../../mlarc.xsl" type="text/xsl"?>


<message 
  url="2002/12/e51b101f8c02ee62799b0ef474bb2852"
  from="Markus Mottl &lt;markus@o...&gt;"
  author="Markus Mottl"
  date="2002-12-29T18:41:33"
  subject="Re: [Caml-list] PCRE"
  prev="2002/12/3edc01c9d06cc487b15cb8d221e733b7"
  next="2002/12/796b85eddae20a3dcf1ca2b3f4c5d4a1"
  prev-in-thread="2002/12/3edc01c9d06cc487b15cb8d221e733b7"
  prev-thread="2002/12/b386e4555a6a2f261a376b788e000fca"
  next-thread="2002/12/796b85eddae20a3dcf1ca2b3f4c5d4a1"
  root="../../"
  period="month"
  listname="caml-list"
  title="Archives of the Caml mailing list">

<thread subject="[Caml-list] PCRE">
<msg 
  url="2002/12/3edc01c9d06cc487b15cb8d221e733b7"
  from="Oleg &lt;oleg_inconnu@m...&gt;"
  author="Oleg"
  date="2002-12-29T15:42:57"
  subject="[Caml-list] PCRE">
<msg 
  url="2002/12/e51b101f8c02ee62799b0ef474bb2852"
  from="Markus Mottl &lt;markus@o...&gt;"
  author="Markus Mottl"
  date="2002-12-29T18:41:33"
  subject="Re: [Caml-list] PCRE">
</msg>
</msg>
</thread>

<contents>
On Sun, 29 Dec 2002, Oleg wrote:
&gt; I'm new to PCRE. Can anyone explain to me why the output of
&gt; 
&gt; # open Pcre;;
&gt; # version;;
&gt; - : string = "3.4 22-Aug-2000"
&gt; # full_split ~pat:"^(\\w+)(,(\\w+))*$" "a,b,c,d";;
&gt; - : Pcre.split_result list =
&gt; [Delim "a,b,c,d"; Group (1, "a"); Group (2, ",d"); Group (3, "d")]
&gt; 
&gt; does not contain Group(3, "b") and Group(3, "c") ?

Note that using "split" instead of "full_split" produces this list:

  - : string list = [""; "a"; ",d"; "d"]

This is exactly how Perl behaves.

"full_split" works the same way, but it also gives you access to the
matched substrings. A grouped subpattern captures only one substring
per match: if the group is repeated, it keeps only the last one.
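
As a minimal sketch of one way to get every field (assuming pcre-ocaml
is installed and linked; the name "fields" is only illustrative), you
could split on the separator instead of relying on a repeated group:

```ocaml
(* Sketch, assuming the pcre-ocaml library is available.
   A repeated group keeps only its last iteration, so split on the
   separator to obtain every field. "fields" is an illustrative name. *)
let fields = Pcre.split ~pat:"," "a,b,c,d"

(* Print one field per line. *)
let () = List.iter print_endline fields
```

This prints a, b, c and d on separate lines.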

&gt; Similarly, I expected 
&gt; 
&gt; # full_split ~pat:"S(a\\d)+" "Sa1a2";;
&gt; - : Pcre.split_result list = [Delim "Sa1a2"; Group (1, "a2")]
&gt; 
&gt; to produce [Delim "Sa1a2"; Group(1, "a1"); Group (1, "a2")]
&gt; 
&gt; The above uses the latest pcre-ocaml-4.31.0.

The same applies here: PCRE-OCaml behaves correctly with respect to
Perl semantics. The group (a\d) is repeated, so it retains only its
last match, "a2".
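
If every repetition is needed, one option is to match the inner pattern
by itself and collect all of its matches. The following is only a
sketch, assuming pcre-ocaml is linked ("repeats" is an illustrative
name); note that it no longer requires the leading "S":

```ocaml
(* Sketch, assuming the pcre-ocaml library is available.
   Match the inner pattern on its own and take the full match
   (substring 0) of every successive match.
   Pcre.exec_all raises Not_found if the pattern never matches. *)
let repeats =
  Pcre.exec_all ~pat:"a\\d" "Sa1a2"
  |> Array.to_list
  |> List.map (fun subs -> Pcre.get_substring subs 0)

(* Print one match per line. *)
let () = List.iter print_endline repeats
```

This prints a1 and a2 on separate lines.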

Regards,
Markus Mottl

-- 
Markus Mottl                                             markus@oefai.at
Austrian Research Institute
for Artificial Intelligence                  http://www.oefai.at/~markus

</contents>

</message>

