Problem description
I'm trying to remove all the empty <p> tags that CKEditor inserts into a description box, but they all seem to vary. The possibilities seem to be:
<p></p>
<p>(WHITESPACE)</p>
<p>&nbsp;</p>
<p><br /></p>
<p>(NEWLINE)&nbsp;</p>
<p>(NEWLINE)<br /><br />(NEWLINE)&nbsp;</p>
With these possibilities, there could be any amount of whitespace, &nbsp; and <br /> tags between the paragraph tags, and there could be some of each kind in one paragraph.
I'm also not sure about the <br /> tag; from what I've seen it could be <br />, <br/> or <br>.
I've searched SO for a similar answer, but all the answers I've seen seem to cater for just one of these cases, not all at once. In simple terms, what I'm asking is: is there a regular expression I can use to remove all <p> tags from some HTML that don't have any alphanumeric text or symbols/punctuation in them?
Recommended answer
Well, in conflict with my suggestion not to parse HTML with regexes, I wrote up a regex to do just that:
"#<p>(\s|&nbsp;|</?\s?br\s?/?>)*</?p>#"
This will correctly match:
<p></p>
<p> </p> <!-- ([space]) -->
<p>	</p> <!-- (That's a [tab] character in there) -->
<p>&nbsp;</p>
<p><br /></p>
<p>
</p>
<p>
<br /><br />
</p>
What it does:
# # --> regex delimiter (start)
# <p> --> match the opening <p> tag
# ( --> group open
# \s --> match any whitespace character (newline, space, tab)
# | --> or
# &nbsp; --> match the &nbsp; entity
# | --> or
# </?\s?br\s?/?> --> match the <br> tag (also <br/> and <br />)
# )* --> group close; match any number of the elements in the group
# </?p> --> match the closing </p> tag ("/" optional)
# # --> regex delimiter (end)
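To actually strip the empty paragraphs, the pattern can be passed to preg_replace with an empty replacement string. The following is only a minimal sketch: the helper name removeEmptyParagraphs and the sample HTML are made up for illustration, and the pattern is assumed to contain the &nbsp; alternative as described in the breakdown above.

```php
<?php
// Hypothetical helper (not from the original answer): removes every <p>...</p>
// that contains only whitespace, &nbsp; entities and/or <br> tags.
function removeEmptyParagraphs(string $html): string
{
    // Same pattern as above, using "#" as the delimiter so "/" needs no escaping.
    $pattern = '#<p>(\s|&nbsp;|</?\s?br\s?/?>)*</?p>#';

    // preg_replace returns null on error; fall back to the untouched input then.
    return preg_replace($pattern, '', $html) ?? $html;
}

// Made-up sample input covering several of the cases from the question.
$html = "<p>Keep me</p><p></p><p>&nbsp;</p><p>\n<br /><br/>\n&nbsp;</p><p><br></p>";

echo removeEmptyParagraphs($html), PHP_EOL; // prints: <p>Keep me</p>
```

Note that the pattern only matches a bare <p> with no attributes, so a paragraph such as <p class="x"></p> would be left in place, and because the closing "/" is optional, a <p> immediately followed by another <p> would also be removed.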