I want to match "AB", but if "A" is not followed by "B", match only "A".
I used this Perl regex:
A(*ACCEPT)??B
The string "AB" matches as expected, but for "AC" the match does not return "A". Why?
I know there are alternatives, but I want to understand how (*ACCEPT) behaves with a quantifier.
Am I misunderstanding something? Thanks for your help!
You pointed to the docs that say:
(*ACCEPT) is the only backtracking verb that is allowed to be quantified because an ungreedy quantification with a minimum of zero acts only when a backtrack happens. Consider, for example,
(A(*ACCEPT)??B)C
where A, B, and C may be complex expressions. After matching "A", the matcher processes "BC"; if that fails, causing a backtrack, (*ACCEPT) is triggered and the match succeeds. In both cases, all but C is captured. Whereas (*COMMIT) (see below) means "fail on backtrack", a repeated (*ACCEPT) of this type means "succeed on backtrack".
However, in practice (*ACCEPT) doesn't seem to react to backtracking, and your example shows it.
So, "AC" can't be matched with A(*ACCEPT)??B because:
1. A in the pattern matches "A" in the string.
2. (*ACCEPT)?? is skipped first because it is lazily quantified.
3. B can't match "C" in the string, and the match fails.
You expected backtracking to occur at that point, but (*ACCEPT)?? does not trigger backtracking.
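For the behaviour asked about in the question (match "AB", otherwise only "A"), an optional group is enough. The following is only a minimal Perl sketch: it assumes this is the kind of alternative the question alludes to, uses literal A and B in place of the real sub-patterns, and the names $ab_re and match_ab are made up for illustration.

```perl
use strict;
use warnings;

# Assumption: A and B stand for arbitrary sub-patterns; plain literals are used here.
# The group around B is optional, so the match falls back to "A" alone.
my $ab_re = qr/A(?:B)?/;

# Hypothetical helper: returns the matched text, or undef when nothing matches.
sub match_ab {
    my ($subject) = @_;
    return $subject =~ $ab_re ? $& : undef;
}

for my $subject (qw(AB AC)) {
    my $m = match_ab($subject);
    printf "%s => %s\n", $subject, defined $m ? $m : '(no match)';
}
# prints:
# AB => AB
# AC => A
```

The greedy optional group tries B first and gives it up when B cannot match, which is the ordinary backtracking that the (*ACCEPT)?? attempt was hoping for.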
A more helpful (*ACCEPT) usage example:
The only use case for (*ACCEPT) that I'm aware of is when the branches of an alternation are distributed into a later expression that is not required for all of the branches. For instance, suppose you want to match any of these patterns: BAZ, BIZ, BO.
You could simply write BAZ|BIZ|BO, but if B and Z stand for complicated sub-patterns, you'll probably look for ways to factor the B and Z patterns. A first pass might give you B(?:AZ|IZ|O), but that solution doesn't factor the Z. Another option would be B(?:A|I)Z|BO, but it forces you to repeat the B. This pattern allows you to factor both the B and the Z:
B(?:A|I|O(*ACCEPT))Z
If the engine follows the O branch, it never matches BOZ because it returns BO as soon as (*ACCEPT) is encountered, which is what we wanted.
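The factored pattern can be tried out with a short Perl script. This is only a sketch: B, A, I, O and Z are kept as literal letters standing in for the real sub-patterns, and the names $baz_re and matched_text are made up for illustration.

```perl
use strict;
use warnings;

# Unquantified (*ACCEPT) inside the O branch: once O has matched,
# the overall match ends successfully without requiring Z.
my $baz_re = qr/B(?:A|I|O(*ACCEPT))Z/;

# Hypothetical helper: returns the matched text, or undef when nothing matches.
sub matched_text {
    my ($subject) = @_;
    return $subject =~ $baz_re ? $& : undef;
}

for my $subject (qw(BAZ BIZ BO BOZ BA)) {
    my $m = matched_text($subject);
    printf "%-3s => %s\n", $subject, defined $m ? $m : '(no match)';
}
# prints:
# BAZ => BAZ
# BIZ => BIZ
# BO  => BO
# BOZ => BO
# BA  => (no match)
```

"BA" fails because the A branch still requires Z; only the O branch is cut short by (*ACCEPT).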