Problem description
There are strings appearing in a line with other text, delimited by an opening and a closing quote like the ones below. I am trying to find a regex that matches each word/phrase, using the comma as the internal delimiter (or the whole quoted content if there is no comma, as in the case of a single word/phrase). For example, for these phrases:
‘verdichten’
‘verdichten, verstopfen’
‘dunkel, finster, wolkig’
‘fort sein, verloren sein, verloren’
‘von den Nymph ergriffen, verzückt, verrückt’
‘der sich halten kann, halten kann’
The result I want is:
[[verdichten]]
[[verdichten]], [[verstopfen]]
[[dunkel]], [[finster]], [[wolkig]]
[[fort sein]], [[verloren sein]], [[verloren]]
[[von den Nymph ergriffen]], [[verzückt]], [[verrückt]]
[[der sich halten kann]], [[halten kann]]
It should work in Notepad++ or EmEditor.
I can match with (‘)(.+?)(’) but I cannot find a way to replace as described.
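To illustrate why this pattern alone is not enough, here is a minimal Python sketch (Python is used only for demonstration, the question targets Notepad++/EmEditor; the names NAIVE and sample are hypothetical). It shows that the whole comma-separated list lands in a single group, so one replacement can only wrap the list as a whole.

```python
import re

# The pattern from the question: opening quote, lazy content, closing quote.
NAIVE = re.compile(r"(‘)(.+?)(’)")

# Hypothetical sample line with text around the quoted list.
sample = "text ‘dunkel, finster, wolkig’ more text"

match = NAIVE.search(sample)
# Group 2 holds the entire list, commas included.
quoted_content = match.group(2)

# Wrapping group 2 therefore brackets the list as one unit,
# not each item separately.
wrapped_whole = NAIVE.sub(r"[[\2]]", sample)

print(quoted_content)
print(wrapped_whole)
```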
Recommended answer
One option is to use the \G anchor and 2 capturing groups:
(?:‘|\G(?!^))([^,\r\n’]+)(?=[^\r\n’]*’)(?:(,\h*)|’)
Parts
- (?:  Non-capturing group
  - ‘  Match the opening quote ‘
  - |  Or
  - \G(?!^)  Assert the position at the end of the previous match, not at the start
- )  Close the non-capturing group
- ([^,\r\n’]+)  Capture in group 1 one or more characters other than a comma, ’ or a newline
- (?=[^\r\n’]*’)  Positive lookahead: assert that a closing ’ follows later on the same line
- (?:  Non-capturing group
  - (,\h*)  Capture a comma and 0+ horizontal whitespace chars in group 2
  - |  Or
  - ’  Match the closing quote ’
- )  Close the non-capturing group
In the replacement use:
[[$1]]$2
Output
[[verdichten]]
[[verdichten]], [[verstopfen]]
[[dunkel]], [[finster]], [[wolkig]]
[[fort sein]], [[verloren sein]], [[verloren]]
[[von den Nymph ergriffen]], [[verzückt]], [[verrückt]]
[[der sich halten kann]], [[halten kann]]
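For anyone who wants to reproduce this outside an editor, below is a rough Python sketch. Note that it is not the same technique: Python's standard re module supports neither \G nor \h, so this version matches each whole quoted span and splits it on commas inside a replacement callback. The names QUOTED and wrap_items are hypothetical, and unlike the regex above (which keeps the original separator via group 2), this sketch normalizes the separator to a comma plus one space.

```python
import re

# Match one quoted span: opening ‘, content without ’ or newlines, closing ’.
QUOTED = re.compile(r"‘([^’\r\n]+)’")


def wrap_items(text: str) -> str:
    """Wrap every comma-separated item inside ‘…’ in [[ ]] and drop the quotes."""

    def replace_span(match: re.Match) -> str:
        # Split the quoted content on commas and trim surrounding whitespace.
        items = [part.strip() for part in match.group(1).split(",")]
        # Skip empty items (e.g. from a trailing comma) and rejoin with ", ".
        return ", ".join(f"[[{item}]]" for item in items if item)

    return QUOTED.sub(replace_span, text)


if __name__ == "__main__":
    for line in ("‘verdichten’", "‘fort sein, verloren sein, verloren’"):
        print(wrap_items(line))
```

Text outside the quotes is left untouched, so the function can be applied to whole lines that contain other content.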