Note: this article is translated from stackoverflow.com. Original post: regex - Zero-Length regexes and infinite matches?

regex - Zero-length regexes and infinite matches?

Posted on 2020-03-28 23:20:49

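While trying to elaborate on an answer to another question, I wanted to understand the behavior/meaning of zero-length regexes.

I often use www.regexr.com as a playground to test/debug/understand what is happening in a regular expression.

So we have the most trivial scenario:

The regex is a*

The input string is dgwawa (in fact, the string is irrelevant here)

Why does the tester report that this regex will match infinitely, given that it can match zero occurrences of the preceding character?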

Why can't the result be 6 matches, one for each character position (since at every character, regardless of whether it is an a or not, there is a match, since zero matches is a match)?

How does it get into matching infinitely? Does it not check/progress one character at a time?

I wonder how/where it gets itself into an infinite loop.


Asked by: Veverke
Views: 151
Wiktor Stribiżew 2020-01-31 17:33

You selected the JavaScript regex flavor in the regexr.com online regex tester. When you loop with RegExp#exec, the JavaScript regex engine does not advance the index (lastIndex) automatically after a pattern matches an empty string, so the next call matches at the same position again.

That is why, when you need to emulate the behavior observed in .NET Regex.Matches, PHP preg_match_all, Python re.finditer, etc., you need to advance the index manually to test each position.

See the regex101.com test

var re = /a*/g; 
var str = 'dgwawa';
var m;
 
while ((m = re.exec(str)) !== null) {
    if (m.index === re.lastIndex) {   // zero-length match: lastIndex did not advance,
        re.lastIndex++;               // so step past this position manually
    }                                 // (this block is the important part)
    document.body.innerHTML += "'" + m[0] + "'<br/>";
}

If you remove that if block, you get an infinite loop.
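To see both behaviors without a browser and without actually hanging, here is a minimal sketch that can be run in Node.js. The helper names `collectMatches` and `stuckIndexes` are hypothetical and not part of the original answer: the first repeats the loop above but collects the matches into an array, and the second calls `exec` a fixed number of times with no manual advance, to show that the match index never moves.

```javascript
// Hypothetical helper: collect every match of a global regex,
// stepping past zero-length matches so the loop always terminates.
function collectMatches(pattern, str) {
  const re = new RegExp(pattern, "g"); // fresh object, lastIndex starts at 0
  const found = [];
  let m;
  while ((m = re.exec(str)) !== null) {
    if (m.index === re.lastIndex) {
      re.lastIndex++; // empty match: move on by hand
    }
    found.push({ index: m.index, text: m[0] });
  }
  return found;
}

// Hypothetical helper: call exec a bounded number of times WITHOUT the
// manual step and record where each match starts.
function stuckIndexes(pattern, str, attempts) {
  const re = new RegExp(pattern, "g");
  const indexes = [];
  for (let i = 0; i < attempts; i++) {
    const m = re.exec(str);
    indexes.push(m === null ? null : m.index);
  }
  return indexes;
}

// Seven matches: '', '', '', 'a', '', 'a', '' (indexes 0 through 6)
console.log(collectMatches("a*", "dgwawa"));

// Every attempt matches the empty string at index 0 again
console.log(stuckIndexes("a*", "dgwawa", 3));
```

Note that the result is seven matches rather than six, because the empty string also matches at the position just past the last character.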

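Two things are worth mentioning here:

  • Always use an online regex tester that matches your programming language
  • Avoid unanchored patterns that can match an empty string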
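As an illustration of the second point, the sketch below swaps `a*` for `a+`, which cannot match the empty string, so a plain `exec` loop terminates with no `lastIndex` adjustment. The helper name `plainLoop` is hypothetical, and the guard on the `g` flag is an added assumption: without that flag, `exec` always restarts from index 0 and the loop would never end.

```javascript
// Hypothetical helper: a plain exec loop with no lastIndex adjustment.
// Safe only when the pattern cannot match the empty string.
function plainLoop(re, str) {
  if (!re.global) {
    // Without the g flag, exec ignores lastIndex and would loop forever.
    throw new Error("plainLoop expects a regex with the g flag");
  }
  re.lastIndex = 0; // start from the beginning of the string
  const found = [];
  let m;
  while ((m = re.exec(str)) !== null) {
    found.push(m[0]);
  }
  return found;
}

// Two matches, 'a' and 'a'; each one consumes a character,
// so lastIndex always moves forward.
console.log(plainLoop(/a+/g, "dgwawa"));
```

Whether `a+` is an acceptable replacement depends on what the pattern is meant to capture. If empty matches are actually wanted, the manual advance shown earlier is still needed.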