This article is reproduced from serverfault.com.

Regex: capture anything within parentheses, including nested parentheses

Posted on 2020-11-30 11:52:53

I know this might look like a duplicate question, but I searched and tried several things without finding a matching solution, so hopefully you can help me.

I'm trying to parse some text output, displayed as "key(value) key(value)", into a hash. This works, except when a value itself contains parentheses: then the capture is incomplete and stops at the inner closing parenthesis.

Regex used: (\S+?)\((.+?)\)

Here is an example with the text input: Regex101

The first capture group is the key; the second capture group should be the value. As you can see, for the SCYEXIT key with value 'mqconnectlog.so(LogExit)', the capture stops at the inner closing parenthesis: 'mqconnectlog.so(LogExit'

I also tried some variations that had the same result:

(\S+?)\(([^)]+)\)
(\S+?)\(([^)]+(?=\)))\)
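
To make the behavior easy to reproduce, here is a minimal Python sketch that runs the three patterns above against one line. The sample line is hypothetical (the original Regex101 input is not reproduced here); only the SCYEXIT value is taken from the question. The names sample, patterns and results are illustrative.

```python
import re

# Hypothetical input line; only the SCYEXIT value comes from the question.
sample = "CHANNEL(TEST.CHL) SCYEXIT(mqconnectlog.so(LogExit)) SCYDATA(abc)"

# The three patterns tried in the question.
patterns = [
    r"(\S+?)\((.+?)\)",
    r"(\S+?)\(([^)]+)\)",
    r"(\S+?)\(([^)]+(?=\)))\)",
]

# Each pattern stops the value at the first ')' it meets,
# so a nested value loses its final closing parenthesis.
results = [dict(re.findall(p, sample)) for p in patterns]

for pattern, result in zip(patterns, results):
    print(pattern, "->", result["SCYEXIT"])
# Every line prints the truncated value: mqconnectlog.so(LogExit
```

None of the patterns keeps track of how many parentheses are open, which is why all three cut the value at the same place.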

I think the biggest problem is that both capture groups need to be lazy, because there are multiple 'key(value)' pairs on the same line. Otherwise the match would capture too many characters and swallow part of the next 'key(value)' pair on that line.

Is there any way to solve this?

Questioner
Stijn De Schutter
Shawn 2020-11-30 20:48:17

You can use a recursive regular expression, assuming the parentheses are always balanced: (\S+?)(\(((?:(?>[^()]+)|(?2))*)\)) is taken from perlre. Here (?2) re-applies the second capture group, so each nested parenthesized section is matched as a balanced unit. See it in action at Regex101. The first capture group is the key, the second is the value including the outer parentheses, and the third is the value inside the parentheses.
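
For engines without recursion, the same balanced-parentheses idea can be sketched by hand. Python's standard re module does not support recursive subpatterns such as (?2), so the example below swaps in a plain depth-counting scan; it is not the recursive regex itself. The function name parse_pairs and the sample line are hypothetical, and the sketch assumes every token has the form key(value) with balanced parentheses.

```python
def parse_pairs(text):
    """Parse 'key(value) key(value)' text into a dict.

    Values may contain nested parentheses. Assumes every token is
    key(value) and that parentheses are balanced.
    """
    pairs = {}
    i, n = 0, len(text)
    while i < n:
        # Skip whitespace between pairs.
        if text[i].isspace():
            i += 1
            continue
        # The key runs up to the first opening parenthesis.
        open_pos = text.find("(", i)
        if open_pos == -1:
            break  # no further key(value) pairs
        key = text[i:open_pos]
        # Walk forward, tracking nesting depth, to the matching ')'.
        depth = 0
        j = open_pos
        while j < n:
            if text[j] == "(":
                depth += 1
            elif text[j] == ")":
                depth -= 1
                if depth == 0:
                    break
            j += 1
        if depth != 0:
            raise ValueError(f"unbalanced parentheses in value for {key!r}")
        # Store the value without its outer parentheses.
        pairs[key] = text[open_pos + 1:j]
        i = j + 1
    return pairs


# Hypothetical input line; only the SCYEXIT value comes from the question.
sample = "CHANNEL(TEST.CHL) SCYEXIT(mqconnectlog.so(LogExit)) SCYDATA(abc)"
parsed = parse_pairs(sample)
print(parsed["SCYEXIT"])  # mqconnectlog.so(LogExit)
```

The stored value corresponds to the third capture group of the recursive pattern: the text inside the outer parentheses, with any inner parentheses kept intact.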