Problem description
Using PHP and preg_match_all I'm trying to get all the HTML content between the following tags (and the tags also):
<p>paragraph text</p>
don't take this
<ul><li>item 1</li><li>item 2</li></ul>
don't take this
<table><tr><td>table content</td></tr></table>
I can get one of them:
preg_match_all("(<p>(.*)</p>)siU", $content, $matches, PREG_SET_ORDER);
Is there a way to get all the
<p></p> <ul></ul> <table></table>
content with a single preg_match_all? I need them to come out in the order they were found so I can echo the content and it will make sense.
So if I did a preg_match_all on the above content then iterated through the $matches array it would echo:
<p>paragraph text</p>
<ul><li>item 1</li><li>item 2</li></ul>
<table><tr><td>table content</td></tr></table>
Recommended answer
Use | to match one of a group of strings: p|ul|table
Use a backreference to match the appropriate closing tag: \\1, because (p|ul|table) is the first capturing group. The outermost parentheses are the pattern delimiters, not a group, so they are not counted.
Putting that all together:
preg_match_all("(<(p|ul|table)>(.*)</\\1>)siU", $content, $matches, PREG_SET_ORDER);
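As a rough, self-contained sketch of the alternation-plus-backreference idea, the snippet below runs a pattern of this shape against the sample content from the question. It deliberately uses ~ as the delimiter (an illustrative choice, not taken from the answer) so that the only parentheses in the pattern are capturing groups: group 1 is the tag name and group 2 is the inner content. The variable name $blocks is hypothetical.

```php
<?php
// Sample input copied from the question.
$content = <<<HTML
<p>paragraph text</p>
don't take this
<ul><li>item 1</li><li>item 2</li></ul>
don't take this
<table><tr><td>table content</td></tr></table>
HTML;

// Flags: s = dot matches newlines, i = case-insensitive, U = ungreedy.
// \1 refers back to whichever tag name (p, ul or table) group 1 matched.
preg_match_all('~<(p|ul|table)>(.*)</\1>~siU', $content, $matches, PREG_SET_ORDER);

// With PREG_SET_ORDER each element of $matches is one match, in the order
// found; index 0 of each element holds the full matched text.
$blocks = array_column($matches, 0);

foreach ($blocks as $block) {
    echo $block, "\n";
}
```

Because the matches come back in the order they occur in the input, echoing them one after another reproduces the three wanted blocks and skips the text in between.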
This only works if the input HTML follows a very strict structure. The tags cannot contain spaces or attributes. It also breaks when one of the matched tags is nested inside another of the same kind (a table inside a table, for example), because the lazy match stops at the first closing tag. Consider using an HTML parser to do the job properly.
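For comparison, here is a minimal sketch of the parser route using PHP's bundled DOM extension (DOMDocument and DOMXPath). It is one possible approach under stated assumptions, not something the answer prescribes: the sample input is the question's content with a class attribute added to the paragraph to show that attributes are tolerated, and the names $blocks and $names are hypothetical.

```php
<?php
// The question's sample, with a class attribute added for illustration.
$content = <<<HTML
<p class="intro">paragraph text</p>
don't take this
<ul><li>item 1</li><li>item 2</li></ul>
don't take this
<table><tr><td>table content</td></tr></table>
HTML;

// Collect parser warnings internally instead of emitting them.
libxml_use_internal_errors(true);
$doc = new DOMDocument();
$doc->loadHTML($content);
libxml_clear_errors();

// A single location path, so nodes are visited in document order.
// Note: a <p>, <ul> or <table> nested inside another match is returned too.
$xpath = new DOMXPath($doc);
$blocks = [];
$names = [];
foreach ($xpath->query('//*[self::p or self::ul or self::table]') as $node) {
    $names[] = $node->nodeName;
    // Serialize the element including its own tags. The serializer may
    // normalize whitespace, so the output need not be byte-identical.
    $blocks[] = $doc->saveHTML($node);
}

foreach ($blocks as $block) {
    echo $block, "\n";
}
```

The trade-off is a little more setup in exchange for tolerance of attributes, whitespace inside tags, and nesting, which the regular expression cannot handle.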