Several tips about regular expressions
1. Handling "greedy" quantifiers
By default, the quantifiers are "greedy", that is, they
match as much as possible (up to the maximum number of
permitted times), without causing the rest of the pattern to
fail. The classic example of where this causes problems is
in trying to match comments in C programs. These appear
between the sequences /* and */, and within the sequence,
individual * and / characters may appear. An attempt to
match C comments by applying the pattern
/\*.*\*/
to the string
/* first comment */ not comment /* second comment */
fails, because it matches the entire string due to the
greediness of the .* item.
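The failure described above can be reproduced with a short check. This is only a minimal sketch: the variable names are made up for illustration, and `#` is used as the pattern delimiter so that the slashes need no escaping.

```php
<?php
// The string from the text: two comments with ordinary code between them.
$subject = '/* first comment */ not comment /* second comment */';

// Greedy: .* first runs to the end of the string, then backtracks only as
// far as needed to find a closing */, which is the LAST one in the string.
preg_match('#/\*.*\*/#', $subject, $greedy);

// The single match spans the whole string instead of the first comment.
var_dump($greedy[0] === $subject);
```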
However, if a quantifier is followed by a question mark,
then it ceases to be greedy, and instead matches the minimum
number of times possible, so the pattern
/\*.*?\*/
does the right thing with the C comments.
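Summary:
The ? suffix and the /U modifier both make a quantifier
non-greedy, but when they are used together they cancel each
other out.
For example: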
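<?php
// The subject contains one opening /* and two closing */ sequences.
$a = "asdf/*asdfaldsfasdf*/asfdasldf;kfldsj*/asfddsaf";
// Lazy .*? stops at the first */ and matches "/*asdfaldsfasdf*/".
$pattern = '/\/\*.*?\*\//';
// Greedy .* with the U modifier gives the same match:
//$pattern = '/\/\*.*\*\//U';
// .*? together with U is greedy again and runs to the last */:
//$pattern = '/\/\*.*?\*\//U';
preg_match($pattern, $a, $match);
print_r($match);
?>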
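The same comparison can be run on the C comment string from the quoted text, with all three variants evaluated side by side. This is a sketch under a few assumptions: the variable names are illustrative, `#` is chosen as the delimiter, and preg_match_all is used so that both comments are collected.

```php
<?php
$subject = '/* first comment */ not comment /* second comment */';

// Lazy quantifier: each match stops at the nearest closing */.
preg_match_all('#/\*.*?\*/#', $subject, $lazy);
print_r($lazy[0]); // the two comments, matched separately

// The U modifier inverts greediness for the whole pattern:
// a plain .* becomes lazy ...
preg_match('#/\*.*\*/#U', $subject, $ungreedy);
// ... and .*? becomes greedy again.
preg_match('#/\*.*?\*/#U', $subject, $both);

var_dump($ungreedy[0] === $lazy[0][0]); // same as the first lazy match
var_dump($both[0] === $subject);        // whole string, as in the greedy case
```

Since the two markers invert each other, it is probably clearer to pick one of them per pattern rather than mix them.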
2. Assertions
\w+(?=;)
matches a word followed by a semicolon, but does not include
the semicolon in the match, and
foo(?!bar)
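matches any occurrence of "foo" that is not followed by
"bar". Note that the apparently similar pattern
(?!foo)bar
does not find an occurrence of "bar" that is preceded by
something other than "foo"; it finds any occurrence of "bar"
whatsoever, because the assertion (?!foo) is always true
when the next three characters are "bar". A lookbehind
assertion is needed to achieve this effect.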
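Summary:
(?!...) only looks ahead, so it belongs after the text it
qualifies, as in bar(?!foo); (?!foo)bar is pointless.
(?<!...) only looks behind, so it belongs before the text it
qualifies, as in (?<!foo)bar; foo(?<!bar) is pointless.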
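The assertion patterns in this section can be checked with a few calls. This is a minimal sketch: the sample strings and variable names are invented for illustration and do not come from the text.

```php
<?php
// Positive lookahead: the semicolon must follow but is not part of the match.
preg_match('/\w+(?=;)/', 'return value;', $m1);

// Negative lookahead: only the "foo" that is not followed by "bar" matches.
preg_match_all('/foo(?!bar)/', 'foobar foobaz', $m2, PREG_OFFSET_CAPTURE);

// (?!foo)bar still matches the "bar" in "foobar": at the position where
// "bar" begins, the next three characters are "bar", never "foo".
$lookahead = preg_match('/(?!foo)bar/', 'foobar');

// A negative lookbehind is what actually rejects a "bar" preceded by "foo".
$lookbehind = preg_match('/(?<!foo)bar/', 'foobar');
$plain = preg_match('/(?<!foo)bar/', 'bazbar');

// foo(?<!bar) is equally pointless: right after "foo" has been matched,
// the three preceding characters are "foo", so the assertion always holds.
$pointless = preg_match('/foo(?<!bar)/', 'barfoo');

print_r([$m1[0], $m2[0], $lookahead, $lookbehind, $plain, $pointless]);
```

Here preg_match returns 1 when the pattern matches and 0 when it does not, so the integer results can be compared directly.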