Next Previous Contents

9. Perl Compatible Regular Expression Module

This module provides an interface to the PCRE library. It may be loaded using require("pcre").

9.1 pcre_compile

Synopsis

Compile a PCRE regular expression

Usage

PCRE_Type pcre_compile (String_Type pattern [, Int_Type options])

Description

The pcre_compile function compiles a PCRE style regular expression and returns the result. The optional options argument may be used to provide addition information affecting the compilation of the pattern. Specifically, it is a bit-mapped value formed from the logical-or of zero or more of the following symbolic constants:

    PCRE_ANCHORED     Force the match to be at the start of a string
    PCRE_CASELESS     Matches are to be case-insensitive
    PCRE_DOLLAR_ENDONLY (See PCRE docs for more information)
    PCRE_DOTALL       The dot pattern matches all characters
    PCRE_EXTENDED     Ignore whitespace in the pattern
    PCRE_EXTRA        (See PCRE docs for features this activates)
    PCRE_MULTILINE    Treat the subject string as multi-lines
    PCRE_UNGREEDY     Make the matches not greedy
    PCRE_UTF8         Regard the pattern and subject strings as UTF-8
Many of these flags may be set within the pattern itself. See the PCRE library documentation for more information about the precise details of these flags and the supported regular expressions.

Upon success, this function returns a PCRE_Type object representing the compiled patterned. If compilation fails, a ParseError exception will be thrown.

See Also

pcre_exec, pcre_nth_match, pcre_nth_substr, pcre_matches

9.2 pcre_exec

Synopsis

Match a string against a compiled PCRE pattern

Usage

Int_Type pcre_exec(p, str [,pos [,options]]);

   PCRE_Type p;
   String_Type str;
   Int_Type pos, options;

Description

The pcre_exec function applies a pre-compiled pattern p to a string str and returns the result of the match. The optional third argument pos may be used to specify the point, as an offset from the start of the string, where matching is to start. The fourth argument, if present, may be used to provide additional information about how matching is to take place. Its value may be specified as a logical-or of zero or more of the following flags:

   PCRE_NOTBOL
        The first character in the string is not at the beginning of a line.
   PCRE_NOTEOL
        The last character in the string is not at the end of a line.
   PCRE_NOTEMPTY
        An empty string is not a valid match.
See the PCRE library documentation for more information about the meaning of these flags.

Upon success, this function returns a positive integer equal to 1 plus the number of so-called captured substrings. It returns 0 if the pattern fails to match the string.

See Also

pcre_compile, pcre_nth_match, pcre_nth_substr, pcre_matches

9.3 pcre_matches

Synopsis

Match a PCRE to a string and return the matches

Usage

String_Type[] = pcre_matches (regexp, str [,pcre_exec_options])

Description

This function combines the pcre_exec and pcre_nth_substr functions into simple to use function that matches the specified regular expression regexp to the string string and returns matched substrings as an array. If regexp is a string, the function will first compile the pattern using the pcre_compile function. The third parameter is optional and, if provided, will be passed as the options parameter to pcre_exec.

If no match is found, the function will return NULL.

Qualifiers

The following qualifiers are supported:

   options=int        Options to pass to pcre_compile
   offset=int         Start matching offset in bytes for pcre_exec

Example

After the execution of:

    str = "Error in file foo.c, line 127, column 10";
    pattern = "file ([^,]+), line (\\d+)";
    matches = pcre_matches (pattern, str);
the value of the matches variable will be the array:
    [ "file foo.c, line 127",
      "foo.c",
      127
    ]

See Also

pcre_compile, pcre_exec, pcre_nth_substr, pcre_nth_match

9.4 pcre_nth_match

Synopsis

Return the location of the nth match of a PCRE

Usage

Int_Type[2] pcre_nth_match (PCRE_Type p, Int_Type nth)

Description

The pcre_nth_match function returns an integer array whose values specify the locations of the beginning and end of the nth captured substrings of the most recent call to pcre_exec with the compiled pattern. A value of nth equal to 0 represents the substring representing the entire match of the pattern.

If the nth match did not take place, the function returns NULL.

Example

After the execution of:

    str = "Error in file foo.c, line 127, column 10";
    pattern = "file ([^,]+), line (\\d+)";
    p = pcre_compile (pattern);
    if (pcre_exec (p, str))
      {
         match_pos = pcre_nth_match (p, 0);
         file_pos = pcre_nth_match (p, 1);
         line_pos = pcre_nth_match (p, 2);
      }
match_pos will be set to [9,29], file_pos to [14,19,] and line_pos to [26,29]. These integer arrays may be used to extract the substrings matched by the pattern, e.g.,
     file = substr (str, file_pos[0]+1, file_pos[1]-file_pos[0]);
     line = str[[line_pos[0]:line_pos[1]-1]];
Alternatively, the function pcre_nth_substr may be used to get the matched substrings:
     file = pcre_nth_substr (p, str, 0);

See Also

pcre_compile, pcre_exec, pcre_nth_substr, pcre_matches

9.5 pcre_nth_substr

Synopsis

Extract the nth substring from a PCRE match

Usage

String_Type pcre_nth_substr (PCRE_Type p, String_Type str, Int_Type nth)

Description

This function may be used to extract the nth captured substring resulting from the most recent use of the compiled pattern p by the pcre_exec function. Unlike pcre_nth_match, this function returns the specified captured substring itself and not the position of the substring. For this reason, the subject string of the pattern is a required argument.

See Also

pcre_matches, pcre_compile, pcre_exec, pcre_nth_match

9.6 slang_to_pcre

Synopsis

Convert a S-Lang regular expression to a PCRE one

Usage

String_Type slang_to_pcre (String_Type pattern)

Description

This function may be used to convert a slang regular expression to a PCRE compatible one. The converted pattern is returned.

See Also

pcre_compile, string_match


Next Previous Contents