Note: This article is reproduced from serverfault.com.

grep regex

GREP Regex not working properly, but my regex is correct

Posted on 2020-02-18 22:38:00

Hopefully this is a simple mistake on my part, as I am fairly new to regex in general. Basically, I am trying to extract the name of a website from a text file.

myfile.txt example:

Hello please enjoy your stay at%sbananas.com%sfor the rest of the day. Bye now!

I am trying to extract only the word bananas from this. My regex is as follows:

/(?<=m%s)(.*?)(?=\.com)/

Using regexr online it works just fine, but with grep I just can't figure out how to get it to work properly: it doesn't return any results. I have tried several variants of the following:

grep "/(?<=m%s)(.*?)(?=\.com)/" myfile.txt
grep -E "/(?<=m%s)(.*?)(?=\.com)/" myfile.txt
grep '/(?<=m%s)(.*?)(?=\.com)/' myfile.txt
grep "(?<=m%s)(.*?)(?=\.com)" myfile.txt
grep '(?<=m%s)(.*?)(?=\.com)' myfile.txt

Nothing seems to work. I would love it if someone could point me in the right direction.

Questioner

Jason Waltz

steffen 2020-02-19 07:46:49

The problem with regular expressions in grep and other Unix tools is that they usually support one, two or three different kinds of regular expressions, and a pattern written for one kind does not necessarily work in another. These are:

Basic regular expressions (BRE)
Extended regular expressions (ERE or EREG)
Perl compatible regular expressions (PCRE or PREG)
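
To make the three flavors concrete, here is a minimal sketch that runs the sample line from the question through each of them. It assumes GNU grep built with PCRE support (-P is not available in every grep), and the temporary file and the patterns are illustrative choices rather than anything from the original post.

```shell
# Assumption: GNU grep with -P support. The scratch file is hypothetical.
tmp=$(mktemp)
printf '%s\n' 'Hello please enjoy your stay at%sbananas.com%sfor the rest of the day. Bye now!' > "$tmp"

# BRE (the default mode): interval braces must be escaped as \{ \}.
bre=$(grep -o '[a-z]\{7\}\.com' "$tmp")

# ERE (-E): the same interval is written with plain braces.
ere=$(grep -Eo '[a-z]{7}\.com' "$tmp")

# ERE syntax without -E: in BRE, { and } are ordinary characters,
# so grep looks for a literal "{7}" and finds nothing.
mixed=$(grep -o '[a-z]{7}\.com' "$tmp")

# PCRE (-P): lookbehind and lookahead are only understood in this mode.
pcre=$(grep -Po '(?<=%s)[a-z]+(?=\.com)' "$tmp")

printf 'BRE:  %s\n' "$bre"      # bananas.com
printf 'ERE:  %s\n' "$ere"      # bananas.com
printf 'mix:  %s\n' "$mixed"    # (empty)
printf 'PCRE: %s\n' "$pcre"     # bananas
rm -f "$tmp"
```

The "mix" case fails silently in the same way as the commands in the question: grep does not reject the pattern, it just interprets it under different rules and matches nothing.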

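Your pattern is in PCRE syntax (lookbehind and lookahead do not exist in BRE or ERE), so you need to tell grep to treat it as PCRE by using -P. The surrounding / delimiters are not part of the pattern and have to be left out. Note that I also removed the m between = and % (I don't know what that was supposed to do).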

grep -Po "(?<=%s)(.*?)(?=\.com)" myfile.txt

With -o, you tell grep to print only the matching part of each line. My grep man page declares PCRE support in grep as experimental, so there may be cases where you get a segmentation fault or where the evaluation takes an unusually long time.
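
As a quick check, the following sketch recreates myfile.txt from the question in a temporary directory and compares the original pattern with the corrected one. It assumes GNU grep with PCRE support. The variable names are made up for illustration.

```shell
# Assumption: GNU grep with -P support. myfile.txt is recreated in a temp directory.
dir=$(mktemp -d)
printf '%s\n' 'Hello please enjoy your stay at%sbananas.com%sfor the rest of the day. Bye now!' > "$dir/myfile.txt"

# Original pattern with its / delimiters: grep matches the slashes literally,
# and the file contains no "/", so nothing is printed even with -P.
with_slashes=$(grep -Po '/(?<=m%s)(.*?)(?=\.com)/' "$dir/myfile.txt")

# Delimiters removed but m kept: the only "m%s" in the line is the end of
# ".com%s", and no further ".com" follows it, so there is still no match.
with_m=$(grep -Po '(?<=m%s)(.*?)(?=\.com)' "$dir/myfile.txt")

# Pattern from the answer: text after "%s" and before ".com".
fixed=$(grep -Po '(?<=%s)(.*?)(?=\.com)' "$dir/myfile.txt")

printf 'with slashes: [%s]\n' "$with_slashes"   # []
printf 'with m:       [%s]\n' "$with_m"         # []
printf 'fixed:        [%s]\n' "$fixed"          # [bananas]
rm -rf "$dir"
```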

Yunnosch 2020-02-18 23:34:24

And what about the difference made by the deleted m in the lookbehind condition? (?<=m%s) != (?<=%s)

steffen 2020-02-18 23:35:50

I actually don't know what that m was supposed to do.

Yunnosch 2020-02-18 23:37:15

But you deleted it. Note that I totally agree with deleting it, because I do not see how a regex including it could ever match. But the fact that it is there is enough explanation for it not working and makes the whole question mostly off-topic as "not reproducible/typo". Just mention it as necessarily deleted, even if it is only a typo.

Yunnosch 2020-02-19 00:00:52

Note, there is an alternative using \K. I thought that would make it possible to write the regex without PCRE support, but it seems I was wrong. It does work (tested on regex101.com), but is also PCRE, according to riptutorial.com/regex/topic/1338/match-reset---k. I am not going to make a separate answer for this. Feel free to add this info to your answer to make it more complete.

steffen 2020-02-19 00:06:27

@Yunnosch Yes, I know (see stackoverflow.com/a/52796549/845034). But I don't get the point here. There are always more solutions to a problem. I guess the OP just wanted to know why his regex did not work.
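
For reference, the \K variant mentioned in the comments could look like the sketch below. It still needs -P, as noted above. The sed line is not from the thread: it is one possible way to get the same result without PCRE, assuming a POSIX sed.

```shell
# Sample line from the question.
line='Hello please enjoy your stay at%sbananas.com%sfor the rest of the day. Bye now!'

# \K discards everything matched so far ("%s") from the reported match.
# Assumption: GNU grep with -P support.
k_reset=$(printf '%s\n' "$line" | grep -Po '%s\K.*?(?=\.com)')

# PCRE-free alternative (an assumption, not part of the thread): capture the
# text between "%s" and ".com" with a BRE group and print only that group.
sed_out=$(printf '%s\n' "$line" | sed -n 's/.*%s\(.*\)\.com.*/\1/p')

printf 'grep \\K: %s\n' "$k_reset"   # bananas
printf 'sed:     %s\n' "$sed_out"    # bananas
```

The sed version relies on the line containing a single ".com". With several candidates per line, the greedy .* would pick a different span than the lazy PCRE pattern.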
