diff options
Diffstat (limited to 'srclib/pcre/doc/pcrecompat.3')
-rw-r--r-- | srclib/pcre/doc/pcrecompat.3 | 121 |
1 files changed, 0 insertions, 121 deletions
diff --git a/srclib/pcre/doc/pcrecompat.3 b/srclib/pcre/doc/pcrecompat.3 deleted file mode 100644 index 6a853e072a..0000000000 --- a/srclib/pcre/doc/pcrecompat.3 +++ /dev/null @@ -1,121 +0,0 @@ -.TH PCRE 3 -.SH NAME -PCRE - Perl-compatible regular expressions -.SH "DIFFERENCES BETWEEN PCRE AND PERL" -.rs -.sp -This document describes the differences in the ways that PCRE and Perl handle -regular expressions. The differences described here are with respect to Perl -5.8. -.P -1. PCRE does not have full UTF-8 support. Details of what it does have are -given in the -.\" HTML <a href="pcre.html#utf8support"> -.\" </a> -section on UTF-8 support -.\" -in the main -.\" HREF -\fBpcre\fP -.\" -page. -.P -2. PCRE does not allow repeat quantifiers on lookahead assertions. Perl permits -them, but they do not mean what you might think. For example, (?!a){3} does -not assert that the next three characters are not "a". It just asserts that the -next character is not "a" three times. -.P -3. Capturing subpatterns that occur inside negative lookahead assertions are -counted, but their entries in the offsets vector are never set. Perl sets its -numerical variables from any such patterns that are matched before the -assertion fails to match something (thereby succeeding), but only if the -negative lookahead assertion contains just one branch. -.P -4. Though binary zero characters are supported in the subject string, they are -not allowed in a pattern string because it is passed as a normal C string, -terminated by zero. The escape sequence \e0 can be used in the pattern to -represent a binary zero. -.P -5. The following Perl escape sequences are not supported: \el, \eu, \eL, -\eU, and \eN. In fact these are implemented by Perl's general string-handling -and are not part of its pattern matching engine. If any of these are -encountered by PCRE, an error is generated. -.P -6. The Perl escape sequences \ep, \eP, and \eX are supported only if PCRE is -built with Unicode character property support. The properties that can be -tested with \ep and \eP are limited to the general category properties such as -Lu and Nd. -.P -7. PCRE does support the \eQ...\eE escape for quoting substrings. Characters in -between are treated as literals. This is slightly different from Perl in that $ -and @ are also handled as literals inside the quotes. In Perl, they cause -variable interpolation (but of course PCRE does not have variables). Note the -following examples: -.sp - Pattern PCRE matches Perl matches -.sp -.\" JOIN - \eQabc$xyz\eE abc$xyz abc followed by the - contents of $xyz - \eQabc\e$xyz\eE abc\e$xyz abc\e$xyz - \eQabc\eE\e$\eQxyz\eE abc$xyz abc$xyz -.sp -The \eQ...\eE sequence is recognized both inside and outside character classes. -.P -8. Fairly obviously, PCRE does not support the (?{code}) and (?p{code}) -constructions. However, there is support for recursive patterns using the -non-Perl items (?R), (?number), and (?P>name). Also, the PCRE "callout" feature -allows an external function to be called during pattern matching. See the -.\" HREF -\fBpcrecallout\fP -.\" -documentation for details. -.P -9. There are some differences that are concerned with the settings of captured -strings when part of a pattern is repeated. For example, matching "aba" against -the pattern /^(a(b)?)+$/ in Perl leaves $2 unset, but in PCRE it is set to "b". -.P -10. PCRE provides some extensions to the Perl regular expression facilities: -.sp -(a) Although lookbehind assertions must match fixed length strings, each -alternative branch of a lookbehind assertion can match a different length of -string. Perl requires them all to have the same length. -.sp -(b) If PCRE_DOLLAR_ENDONLY is set and PCRE_MULTILINE is not set, the $ -meta-character matches only at the very end of the string. -.sp -(c) If PCRE_EXTRA is set, a backslash followed by a letter with no special -meaning is faulted. -.sp -(d) If PCRE_UNGREEDY is set, the greediness of the repetition quantifiers is -inverted, that is, by default they are not greedy, but if followed by a -question mark they are. -.sp -(e) PCRE_ANCHORED can be used at matching time to force a pattern to be tried -only at the first matching position in the subject string. -.sp -(f) The PCRE_NOTBOL, PCRE_NOTEOL, PCRE_NOTEMPTY, and PCRE_NO_AUTO_CAPTURE -options for \fBpcre_exec()\fP have no Perl equivalents. -.sp -(g) The (?R), (?number), and (?P>name) constructs allows for recursive pattern -matching (Perl can do this using the (?p{code}) construct, which PCRE cannot -support.) -.sp -(h) PCRE supports named capturing substrings, using the Python syntax. -.sp -(i) PCRE supports the possessive quantifier "++" syntax, taken from Sun's Java -package. -.sp -(j) The (R) condition, for testing recursion, is a PCRE extension. -.sp -(k) The callout facility is PCRE-specific. -.sp -(l) The partial matching facility is PCRE-specific. -.sp -(m) Patterns compiled by PCRE can be saved and re-used at a later time, even on -different hosts that have the other endianness. -.P -.in 0 -Last updated: 09 September 2004 -.br -Copyright (c) 1997-2004 University of Cambridge. |