


Basically, I want to replace certain words in sentences (e.g. the word "tree" with the word "pizza"). Restriction: when the word to be replaced is between double quotes, the replacement should not be performed.


The tree is green. -> REPLACE tree WITH pizza
"The" tree is "green". -> REPLACE tree WITH pizza
"The tree" is green. -> DONT REPLACE
"The tree is" green. -> DONT REPLACE
The ""tree is green. -> REPLACE tree WITH pizza


Is it possible to do this with regular expressions? I would count the number of double quotes before the word and check whether it is odd or even. But is this possible using preg_replace in PHP?




preg_replace("/tree/", "pizza", $sentence)


But the problem here is to implement the logic with the double quotes. I tried things like:

preg_replace('/[^"]tree/', 'pizza', $sentence)


But this does not work, because it only checks whether a double quote is directly in front of the word, and there are examples above where this check fails. It is important that I solve this problem with regex only.



Regular expressions are not the right tool for every job. You can use them for this to a certain extent, but covering every case, such as nested quotes, gets increasingly complicated.


$text = preg_replace('/\btree\b(?![^"]*"(?:(?:[^"]*"){2})*[^"]*$)/i', 'pizza', $text);

See the working demo.


\b               the boundary between a word char (\w) and not a word char
 tree            'tree'
\b               the boundary between a word char (\w) and not a word char
(?!              look ahead to see if there is not:
 [^"]*           any character except: '"' (0 or more times)
  "              '"'
 (?:             group, but do not capture (0 or more times)
  (?:            group, but do not capture (2 times):
   [^"]*         any character except: '"' (0 or more times)
    "            '"'
  ){2}           end of grouping
 )*              end of grouping
 [^"]*           any character except: '"' (0 or more times)
 $               before an optional \n, and the end of the string
)                end of look-ahead
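
As a quick check, the pattern can be run against the five sample sentences from the question. This is only a sketch: the variable names are made up for illustration, and it assumes the quotes in the input are balanced.

```php
<?php
// Pattern copied from the answer above; sample sentences copied from the question.
$pattern = '/\btree\b(?![^"]*"(?:(?:[^"]*"){2})*[^"]*$)/i';

$sentences = [
    'The tree is green.',
    '"The" tree is "green".',
    '"The tree" is green.',
    '"The tree is" green.',
    'The ""tree is green.',
];

$results = [];
foreach ($sentences as $sentence) {
    // "tree" is replaced only when an even number of quotes follows it,
    // which (for balanced input) means it sits outside any quoted span.
    $results[] = preg_replace($pattern, 'pizza', $sentence);
}

foreach ($results as $i => $result) {
    echo $sentences[$i], ' => ', $result, "\n";
}
```

The first, second and fifth sentences come back with "pizza"; the third and fourth are returned unchanged.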

Another option is to use controlled backtracking, since you are able to do this in PHP:

$text = preg_replace('/"[^"]*"(*SKIP)(*FAIL)|\btree\b/i', 'pizza', $text);

See the working demo.


The idea is to skip content in quotations. I first match a quotation mark, followed by any characters except ", followed by a closing quotation mark. The (*SKIP) and (*FAIL) backtracking control verbs then make that subpattern fail and force the regular expression engine not to retry the substring with the other alternative.
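
A small sketch of the skip-and-fail pattern on the sentences from the question (variable names are illustrative only). The last line is an extra input that is not in the question, added to show how this pattern treats an unmatched quote: the lone quote never forms a quoted span, so the word before it is still replaced.

```php
<?php
// Pattern copied from the answer above.
$pattern = '/"[^"]*"(*SKIP)(*FAIL)|\btree\b/i';

$sentences = [
    'The tree is green.',
    '"The" tree is "green".',
    '"The tree" is green.',
    '"The tree is" green.',
    'The ""tree is green.',
];

$results = [];
foreach ($sentences as $sentence) {
    // Quoted spans are consumed and discarded by the first alternative,
    // so only occurrences outside of them reach the second alternative.
    $results[] = preg_replace($pattern, 'pizza', $sentence);
}

foreach ($results as $i => $result) {
    echo $sentences[$i], ' => ', $result, "\n";
}

// Hypothetical extra input with an unmatched quote (not from the question).
$unbalanced = preg_replace($pattern, 'pizza', 'tree "oops');
echo $unbalanced, "\n";
```

On that unbalanced input the look-ahead pattern behaves differently: it sees an odd number of quotes after the word and leaves it alone. Which behaviour is preferable depends on how malformed input should be treated.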

