Note: This article is reproduced from serverfault.com.

sed to match pattern across a newline

Posted on 2016-02-08 20:28:18

Here's my input:

<array>
    <string>extra1</string>
    <string>extra2</string>
    <string>Yellow
5</string>

Note: there's a space and newline between "Yellow" and "5"

I am piping this to sed:

| sed -n 's#.*<string>\(.*\)</string>#\1#p'

and I am getting the output:

extra1
extra2

I know that, because sed strips the newline from the end of each input line, the newline is not there to be matched - so that accounts for the result. I have read articles on adding the next line from the buffer, but I can't work out what I need to use in the pattern match to get this to work.

The output I want is:

extra1
extra2
Yellow 5

(In case it makes a difference, I am using a Mac, so I need this to work with - I think - the FreeBSD variant of sed.)

Of course, if another tool is better for what I want to achieve I am open to suggestions! Thanks!

Questioner: Lorccan

Answer by Walter A, 2016-02-09 06:25:09

Join the lines and tear them apart:

tr -d "\n" < file | grep -o "<string>[^<]*</string>" | sed 's/<string>\(.*\)<\/string>/\1/'
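To see how this behaves on the sample from the question, here is a small sketch that wraps the pipeline in a shell function and feeds it the input via printf. The function names (sample_input, extract_strings, extract_strings_sed) are made up for illustration. The second function is not part of the answer above: it is a sed-only alternative that uses N to append the next input line until the closing tag shows up, which may be closer to what the question was originally after. It uses only POSIX sed commands and should work with BSD sed, but treat that as an assumption to check on your own machine.

```shell
#!/bin/sh

# Hypothetical helper: reproduces the sample input from the question.
# "Yellow " (with a trailing space) is followed by a newline, then "5".
sample_input() {
    printf '<array>\n    <string>extra1</string>\n    <string>extra2</string>\n    <string>Yellow \n5</string>\n'
}

# The pipeline from the answer, reading stdin instead of a file:
# 1. tr deletes every newline, so the whole input becomes one line;
# 2. grep -o prints each <string>...</string> match on its own line;
# 3. sed strips the surrounding tags.
extract_strings() {
    tr -d '\n' | grep -o '<string>[^<]*</string>' | sed 's/<string>\(.*\)<\/string>/\1/'
}

# Alternative sketch (an assumption, not from the answer): sed only.
# On a line with an opening tag, keep appending the next line with N
# until the closing tag is present, then drop the embedded newlines
# and print the text between the tags.
extract_strings_sed() {
    sed -n '
/<string>/{
  :a
  /<\/string>/!{
    N
    ba
  }
  s/\n//g
  s#.*<string>\(.*\)</string>#\1#p
}'
}

sample_input | extract_strings
sample_input | extract_strings_sed
```

Both calls should print extra1, extra2 and Yellow 5, one per line. Note that the tr approach removes every newline in the input, so it suits small files where no tag content needs its line breaks preserved.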