[Caml-list] pcre
Date: 2002-12-08 (13:52)
From: Markus Mottl <markus@o...>
Subject: Re: [Caml-list] pcre
On Sun, 08 Dec 2002, onlyclimb wrote:
> let s ="abcdbcd" ;;
> let t  = pcre_exec ~pat:"bc"  s ;;
> t = [| 1; 3; 0 |]
> what does the last zero mean?

The trailing zero is extra workspace that the matching engine needs;
it is not part of the match. This wasn't mentioned in the documentation
(added now). You'll rarely need this function anyway. For your purposes
it is better to use "exec" and "exec_all", or "extract" and "extract_all".

On Sun, 08 Dec 2002, onlyclimb wrote:
> Does the int array returned by pcre_exec contain the offsets of all
> the matches or the first match from pos?

It contains the offsets of the first, whole match followed by the
offsets of matched subpatterns (introduced with parentheses in the
pattern string).

> However, when I tried it, it seemed to return the offsets of the first
> match. Then why does it return an int array and not a tuple of int*int
> that refers to (from, to)?

Because there can be arbitrarily many subpatterns. E.g. try this:

  let t = pcre_exec ~pat:"a(bc)"  s ;;

  t = [|0; 3; 1; 3; 0; 0|]

The whole match ranges from character 0 to 3 (exclusive), the first
subgroup from 1 to 3. The remaining zeroes belong to the extra workspace.
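
The layout described above can be sketched with the standard library
alone. This is only an illustration, not code from the pcre library: the
names decode_offsets and substrings_of are hypothetical, and the sketch
assumes (as the two results shown here suggest) that the array holds
three entries per group, the first two thirds being (start, stop) pairs
and the last third being workspace.

```ocaml
(* Decode a raw offset vector into (start, stop) pairs, one per group.
   Assumption: the array has 3 * n entries for n groups (whole match
   included); the first 2 * n are offsets, the last n are workspace. *)
let decode_offsets (ovector : int array) : (int * int) list =
  let n_groups = Array.length ovector / 3 in
  List.init n_groups (fun i -> (ovector.(2 * i), ovector.(2 * i + 1)))

(* Cut the matched text for each group out of the subject string.
   The stop offset is exclusive, so the length is stop - start. *)
let substrings_of (subject : string) (ovector : int array) : string list =
  List.map
    (fun (start, stop) -> String.sub subject start (stop - start))
    (decode_offsets ovector)

let () =
  let s = "abcdbcd" in
  (* Offsets taken from the "a(bc)" example above. *)
  List.iter print_endline (substrings_of s [| 0; 3; 1; 3; 0; 0 |])
```

The sketch does not handle groups that took no part in the match; it
only covers the case where every group has valid offsets.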

Using "extract" will make things clearer. Then the result is:

  [|"abc"; "bc"|]

With "exec" you can extract strings of matched (sub)patterns more
efficiently if you do not want to access all of them.
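
To illustrate why extracting on demand can be cheaper, here is a rough
model using only the standard library. It is a sketch under stated
assumptions, not the library's implementation: the record type
match_result and the functions group and all_groups are hypothetical
stand-ins, and the offset layout is assumed to be the one described
earlier in this message.

```ocaml
(* Hypothetical stand-in for a match result: the subject string plus the
   raw offset vector. This is NOT the pcre library's actual type. *)
type match_result = { subject : string; ovector : int array }

(* Extract only group [n] (0 = whole match). No other group is copied. *)
let group (m : match_result) (n : int) : string =
  if n < 0 || n >= Array.length m.ovector / 3 then
    invalid_arg "group: no such group";
  let start = m.ovector.(2 * n) and stop = m.ovector.(2 * n + 1) in
  String.sub m.subject start (stop - start)

(* Extract every group eagerly, allocating one string per group. *)
let all_groups (m : match_result) : string array =
  Array.init (Array.length m.ovector / 3) (group m)

let () =
  (* Offsets taken from the "a(bc)" example in this message. *)
  let m = { subject = "abcdbcd"; ovector = [| 0; 3; 1; 3; 0; 0 |] } in
  print_endline (group m 1);
  Array.iter print_endline (all_groups m)
```

Calling group for a single index allocates one string, while all_groups
allocates one per group, which is the trade-off the paragraph above
refers to.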

Markus Mottl

Markus Mottl
Austrian Research Institute for Artificial Intelligence