How can I get a list of the words that have six or more consonants in a row using the grep command?

Posted on 2020-11-20 12:52:22

I want to find a list of words that contain six or more consonants in a row from a number of text files.

I'm pretty new to the Unix terminal, but this is what I have tried:

cat *.txt | grep -Eo "\w+" | grep -i "[^AEOUIaeoui]{6}"

I use the cat command here because grep would otherwise include the file names in the output passed to the next pipe. I use the second command to get a list of all the words in the text files.

The problem is the last command: I want it to match six consonants in a row, and they don't need to be the same consonant. I know one way of solving the problem, but that would create a command longer than this entire post.

Questioner
doelie247
Wiktor Stribiżew 2020-11-20 20:57:29

You can use

grep -hEio '[[:alpha:]]*[b-df-hj-np-tv-z]{6}[[:alpha:]]*' *.txt

Regex details

  • [[:alpha:]]* - zero or more letters
  • [b-df-hj-np-tv-z]{6} - six consecutive English consonant letters (the v-z range includes y, so y counts as a consonant here)
  • [[:alpha:]]* - zero or more letters.

The grep options make the search case insensitive (-i), print only the matched text (-o), and suppress the file names (-h). The -E option enables POSIX ERE syntax; without it, you would need to escape {6} as \{6\}. Because the surrounding [[:alpha:]]* can absorb further consonants, words with more than six consonants in a row are matched as well.
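
A quick way to check the command is to run it against a small throwaway file. The sketch below assumes a grep that supports -o (GNU and BSD grep both do); the temporary directory, the file name sample.txt, and the sample sentences are made up for the demonstration and are not part of the original question.

```shell
# Hypothetical sample data, created only for this demonstration.
tmpdir=$(mktemp -d)
printf '%s\n' 'The Latchstring and the catchphrase' 'a plain sentence' > "$tmpdir/sample.txt"

# Same command as in the answer, pointed at the sample directory.
matches=$(grep -hEio '[[:alpha:]]*[b-df-hj-np-tv-z]{6}[[:alpha:]]*' "$tmpdir"/*.txt)
printf '%s\n' "$matches"

# Equivalent form without -E: in POSIX BRE the braces must be escaped.
matches_bre=$(grep -hio '[[:alpha:]]*[b-df-hj-np-tv-z]\{6\}[[:alpha:]]*' "$tmpdir"/*.txt)

# Clean up the throwaway directory.
rm -r "$tmpdir"
```

With this input, only Latchstring (tchstr) and catchphrase (tchphr) contain six consonants in a row, so those should be the only two words printed, each on its own line and in its original case.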