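I store HTML content in a database table. In that HTML content, I want to replace "SOME WORDS" with a link tag. However, any "SOME WORDS" that is already inside a link tag should be left alone.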

For example:
Content

<p>Lorem ipsum dolor SOME WORDS, consectetur adipiscing elit. <a href="http://example.com">SOME WORDS</a> elementum pharetra velit at cursus. Quisque blandit, nibh at eleifend ullamcorper</p>

The output should be
<p>Lorem ipsum dolor <a href="http://someurl">SOME WORDS</a>, consectetur adipiscing elit. <a href="http://example.com">SOME WORDS</a> elementum pharetra velit at cursus. Quisque blandit, nibh at eleifend ullamcorper</p>

As you can see, the replacement should skip text that is already a link.

Some guidance to get me started in the right direction would be much appreciated.

Best answer

Here is how you can solve it with DOMDocument instead of regular expressions:

$contents = <<<EOS
<p>Lorem ipsum dolor SOME WORDS, consectetur adipiscing elit. <a href="http://example.com">SOME WORDS</a> elementum pharetra velit at cursus. Quisque blandit, nibh at eleifend ullamcorper</p>
EOS;

$doc = new DOMDocument;
libxml_use_internal_errors(true);
$doc->loadHTML($contents);
libxml_clear_errors();

$xp = new DOMXPath($doc);

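// find all text nodes that are not inside an anchor at any depth
// (checking only parentNode would miss cases like <a><b>SOME WORDS</b></a>)
foreach ($xp->query('//text()[not(ancestor::a)]') as $node) {
        // plain string replacement; substitute whatever replacement text you need
        $node->nodeValue = str_replace(
            'SOME WORDS',
            'SOME OTHER WORDS',
            $node->nodeValue
        );
}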
// DOMDocument creates a full document and puts your fragment inside a body tag
// So we enumerate the children and save their HTML representation
$body = $doc->getElementsByTagName('body')->item(0);
foreach ($body->childNodes as $node) {
        echo $doc->saveHTML($node);
}
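
The snippet above swaps one string for another. To get the linked output the question asks for, the same traversal can build `<a>` elements instead. The following is a minimal sketch and not part of the original answer: the function name `linkifyPhrase` is hypothetical, the URL `http://someurl` is taken from the question's expected output, and the XPath uses `ancestor::a` so that text nested deeper inside an existing link is skipped too.

```php
<?php
// Hypothetical helper (not from the original answer): wraps each occurrence of
// $phrase in <a href="$url">, skipping text that already sits inside a link.
function linkifyPhrase(string $html, string $phrase, string $url): string
{
    if ($html === '' || $phrase === '') {
        return $html;
    }

    $doc = new DOMDocument;
    libxml_use_internal_errors(true);
    $doc->loadHTML($html);
    libxml_clear_errors();

    $xp = new DOMXPath($doc);

    // Snapshot the matching text nodes first, because the loop edits the tree.
    $textNodes = iterator_to_array($xp->query('//text()[not(ancestor::a)]'), false);

    foreach ($textNodes as $node) {
        $parts = explode($phrase, $node->nodeValue);
        if (count($parts) < 2) {
            continue; // phrase not present in this text node
        }

        // Rebuild the text as: part, <a>phrase</a>, part, <a>phrase</a>, ...
        $fragment = $doc->createDocumentFragment();
        foreach ($parts as $i => $part) {
            if ($i > 0) {
                $link = $doc->createElement('a');
                $link->setAttribute('href', $url);
                $link->appendChild($doc->createTextNode($phrase));
                $fragment->appendChild($link);
            }
            if ($part !== '') {
                $fragment->appendChild($doc->createTextNode($part));
            }
        }

        // Put the rebuilt content in place of the original text node.
        $node->parentNode->insertBefore($fragment, $node);
        $node->parentNode->removeChild($node);
    }

    // As in the answer above, return only the children of the generated <body>.
    $body = $doc->getElementsByTagName('body')->item(0);
    if ($body === null) {
        return $html;
    }
    $out = '';
    foreach ($body->childNodes as $child) {
        $out .= $doc->saveHTML($child);
    }
    return $out;
}

$contents = '<p>Lorem ipsum dolor SOME WORDS, consectetur adipiscing elit. '
    . '<a href="http://example.com">SOME WORDS</a> elementum pharetra velit at cursus.</p>';

echo linkifyPhrase($contents, 'SOME WORDS', 'http://someurl');
```

Matching in this sketch is exact and case-sensitive, and a phrase split across elements (for example `SOME <b>WORDS</b>`) is not matched, since each text node is examined on its own.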
