

grep can't be fed "raw" strings from the command line, because some characters must be escaped before grep stops treating them as literals. For example:

$ grep '(hello|bye)' # WON'T MATCH 'hello'

I was using printf to auto-escape strings:

$ printf '%q' '(some|group)'

This produces a bash-escaped version of the string, and using backticks, this can easily be passed to a grep call:

$ grep `printf '%q' '(a|b|c)'`


However, it's clearly not meant for this: some characters in the output are not escaped, and some are unnecessarily so. For example:

$ printf '%q' '(^#)'

The ^ character should not be escaped when passed to grep.

Is there a CLI tool that takes a raw string and returns an escaped version that can be used directly as a pattern with grep? If not, how can I achieve this in pure bash?


If you are attempting to get grep to use Extended Regular Expression syntax, the way to do that is to use grep -E (aka egrep). You should also know about grep -F (aka fgrep), which treats the pattern as a fixed string, and, in newer versions of GNU grep, grep -P.
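As a rough illustration of the difference between the three modes, here is a small sketch that runs the same pattern through plain grep, grep -E and grep -F. The sample input and variable names are made up for the example.

```shell
#!/usr/bin/env bash
# Three sample lines: two plain words and one line containing the raw pattern text.
input=$'hello\nbye\n(hello|bye)'

# Basic syntax: ( | ) are ordinary characters, so only the literal text matches.
bre=$(printf '%s\n' "$input" | grep '(hello|bye)')

# Extended syntax: ( | ) mean grouping and alternation.
ere=$(printf '%s\n' "$input" | grep -E '(hello|bye)')

# Fixed string: nothing in the pattern is special at all.
fixed=$(printf '%s\n' "$input" | grep -F '(hello|bye)')

printf 'grep:\n%s\n\ngrep -E:\n%s\n\ngrep -F:\n%s\n' "$bre" "$ere" "$fixed"
```

If the goal is simply to search for a raw string, grep -F avoids the escaping problem entirely.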

Background: The original grep had a fairly small set of regex operators; it was Ken Thompson's original regular expression implementation. A new version with an extended repertoire was developed later, and for compatibility reasons, got a different name. With GNU grep, there is only one binary, which understands the traditional, basic RE syntax if invoked as grep, and ERE if invoked as egrep. Some constructs from egrep are available in grep by using a backslash escape to introduce special meaning.

Subsequently, the Perl programming language extended the formalism even further, and this is the regex dialect most newcomers erroneously expect grep to support as well. With grep -P, it does, but that option is not available on all platforms.


So, in grep, the following characters have a special meaning: ^ $ [ ] * . and the backslash (\) itself.
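For the original question (turning a raw string into something plain grep matches literally), one possible pure-bash sketch is to backslash-escape just those characters. The function name bre_quote is hypothetical. The sketch leaves ] alone on the assumption that a closing bracket is only special after an unescaped [, and it escapes the backslash itself.

```shell
#!/usr/bin/env bash
# Hypothetical helper: escape a raw string for use as a basic (non -E) grep pattern.
bre_quote() {
  local s=$1 out='' ch i
  for ((i = 0; i < ${#s}; i++)); do
    ch=${s:i:1}
    case $ch in
      # Characters that are special in basic grep patterns, plus the backslash.
      '^' | '$' | '[' | '*' | '.' | '\') out+="\\$ch" ;;
      *) out+=$ch ;;
    esac
  done
  printf '%s\n' "$out"
}

bre_quote '(^#)'
bre_quote 'a.b*c'
# -e keeps a pattern that starts with "-" from being read as an option.
printf 'a.b\naxb\n' | grep -e "$(bre_quote 'a.b')"
```

When the whole pattern is meant literally, grep -F does the same job without any escaping. A helper like this is mainly useful when a literal fragment has to be embedded in a larger pattern.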

In egrep, the following characters also have a special meaning: ()|+?{}. (The braces for repetition were not in the original egrep.) The grouping parentheses also enable backreferences with \1, \2, etc.

In many versions of grep, you can get the egrep behavior by putting a backslash before the egrep specials. There are also special sequences like \< and \> (word boundaries).
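A short sketch of both points, assuming GNU grep (the backslashed alternation and the word-boundary sequences are extensions that other grep implementations may not accept). The sample input is invented for the example.

```shell
#!/usr/bin/env bash
input=$'hello\ngoodbye\nbye now'

# Backslash-escaped group and alternation in plain grep: matches all three lines,
# because "goodbye" contains "bye".
alt=$(printf '%s\n' "$input" | grep '\(hello\|bye\)')

# Word boundaries: "bye" must stand alone as a word, so "goodbye" is excluded.
word=$(printf '%s\n' "$input" | grep '\<bye\>')

printf 'alternation:\n%s\n\nword match:\n%s\n' "$alt" "$word"
```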

In Perl, a huge number of additional escapes like \w \s \d were introduced. In Perl 5, the regex facility was substantially extended, with non-greedy matching *? +? etc, non-grouping parentheses (?:...), lookaheads, lookbehinds, etc.

... Having said that, if you really do want to convert egrep regular expressions to grep regular expressions without invoking any external process, try ${regex/pattern/substitution} for each of the egrep special characters; but recognize that this does not handle character classes, negated character classes, or backslash escapes correctly.
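A minimal sketch of that substitution approach, with the limitations just mentioned: it blindly prefixes every egrep special with a backslash, so character classes and already-escaped characters come out wrong. The function name ere_to_bre is hypothetical, and the result assumes a grep (such as GNU grep) that accepts the backslashed forms.

```shell
#!/usr/bin/env bash
# Hypothetical helper: naive egrep-to-grep pattern conversion using only
# bash parameter substitution, with no external processes.
ere_to_bre() {
  local re=$1 ch
  for ch in '(' ')' '|' '+' '?' '{' '}'; do
    # Quoting "$ch" makes the search literal; // replaces every occurrence.
    re=${re//"$ch"/\\$ch}
  done
  printf '%s\n' "$re"
}

ere_to_bre '(hello|bye)+'
```

The converted pattern can then be passed on, for example grep "$(ere_to_bre '(hello|bye)')", although calling grep -E with the original pattern is the simpler route.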

