Problem description
I have a file with the content below.
<td> ${ dontReplaceMe } ReplaceMe ${dontReplaceMeEither} </td>
I want to match 'ReplaceMe' if it is in the td tag, but NOT if it is in the ${ ... } expression.
Can I do this with regex?
Currently have:
sed '/\${.*?ReplaceMe.*?}/!s/ReplaceMe/REPLACED/g' data.txt
This is not possible in the general case.
Regular expressions in the formal sense can only describe Type-3 Chomsky languages (regular languages).
Your sample, however, is a Type-2 Chomsky language (context-free language).
Pretty much as soon as arbitrarily deep nesting (brackets) is involved, you're dealing with context-free languages, which are not covered by regular expressions.
There is basically no way to express "within a matching pair of x and y" in a regular expression, as this would require the regular expression to have some kind of stack, which it doesn't have (it is functionally equivalent to a finite-state automaton).
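To illustrate what that missing "stack" buys you, here is a minimal Python sketch (not part of the original answer) that scans the text once and keeps a nesting counter. The function name `replace_outside_braces` and its parameters are made up for illustration. It assumes every `${` is eventually closed by a `}`, and it ignores the `<td>` restriction from the question.

```python
def replace_outside_braces(text, target="ReplaceMe", replacement="REPLACED"):
    """Replace `target` only where it is not inside any ${ ... } block."""
    out = []
    depth = 0  # number of currently open brace blocks (the "stack" a regex lacks)
    i = 0
    while i < len(text):
        if text.startswith("${", i):
            # Opening of a ${ ... } block, possibly nested inside another one.
            depth += 1
            out.append("${")
            i += 2
        elif depth > 0 and text[i] == "{":
            # A plain brace inside a block must be balanced as well.
            depth += 1
            out.append("{")
            i += 1
        elif depth > 0 and text[i] == "}":
            depth -= 1
            out.append("}")
            i += 1
        elif depth == 0 and text.startswith(target, i):
            # Only replace when no block is open.
            out.append(replacement)
            i += len(target)
        else:
            out.append(text[i])
            i += 1
    return "".join(out)


print(replace_outside_braces(
    "<td> ${ dontReplaceMe } ReplaceMe ${dontReplaceMeEither} </td>"))
```

Because the counter tracks depth, a `ReplaceMe` sitting inside an outer `${ ... }` is left alone even when inner blocks open and close before it.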
Challenged by brandizzi to find a regex that might match at least the trivial cases, I actually came up with this (painfully hacky) regex pattern:
perl -pe 's/(?<=<td>)((?:(?:\{.*?\})*[^{]*?)*)(ReplaceMe)(.*)(?=<\/td>)/$1REPLACED$3/g'
It does proper (sic!) matching for these cases:
<td> ${ dontReplaceMe } ReplaceMe ${dontReplaceMeEither} </td>
<td> ReplaceMe ${dontReplaceMeEither} </td>
<td> ${ dontReplaceMe } ReplaceMe </td>
<td> ReplaceMe </td>
And it fails with this one (nesting is Chomsky Type-2, remember? ;) ):
<td>${ ${ dontReplaceMe } ReplaceMe ${dontReplaceMeEither} }</td>
And it can't replace multiple matches either:
<td> ReplaceMe ReplaceMe </td>
<td> ReplaceMe ${dontReplaceMeEither} ReplaceMe </td>
Getting the leading $ covered was the tricky part.
That, and keeping Reginald/Reggy from crashing constantly while writing this beast.
AGAIN: EXPERIMENTAL, DO NOT EVER USE THIS IN PRODUCTION CODE!
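For the flat (non-nested) case, a different technique than the one-liner above can handle multiple matches: match either a whole `${ ... }` block or the bare word, and only rewrite the latter in a callback. The following is a rough Python sketch of that idea, with the hypothetical names `PATTERN` and `replace_flat`. It assumes blocks never nest and never contain a `}` of their own, and it does not check for the surrounding `<td>` tag.

```python
import re

# Alternative 1 (captured): a flat ${ ... } block, consumed whole and kept as-is.
# Alternative 2: the bare target word outside of such a block.
PATTERN = re.compile(r"(\$\{[^}]*\})|ReplaceMe")


def replace_flat(text, replacement="REPLACED"):
    """Replace every ReplaceMe that is not inside a flat ${ ... } block."""
    def keep_or_replace(match):
        block = match.group(1)
        # Group 1 is set only when a ${ ... } block was matched.
        return block if block is not None else replacement

    return PATTERN.sub(keep_or_replace, text)


print(replace_flat("<td> ReplaceMe ${dontReplaceMeEither} ReplaceMe </td>"))
```

This still gets the nested example wrong, since `[^}]*` stops at the first closing brace, so the Type-2 limitation applies here too.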