Problem description
I asked here a while ago about matching the text inside a pair of <code>..</code> tags in a string, and it had been working great until somebody wrapped some other HTML inside the <code> tags.
This is how I'm doing it so far:
preg_match_all("!<code>([^<]*)</code>!", $string, $return_array);
Could anybody improve this regular expression to solve my problem? :)
Thanks in advance!
This is one case where I have to agree with the dreaded "regex are evil" meme. For straightforward extraction purposes, regular expressions are often suitable. But if you want to process malformed and/or nested HTML, they're not an option without significant fuss.
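To make the distinction concrete, here is a minimal sketch of the straightforward case. The sample `$string` is made up for illustration; the first pattern is the one from the question, and the second is a lazy `.*?` variant that tolerates inner tags.

```php
<?php
// Hypothetical sample input; the inner <b> tag is what defeats [^<]*.
$string = 'a <code>plain</code> b <code>with <b>bold</b> text</code> c';

// Pattern from the question: [^<]* stops at the first "<",
// so a block that contains other markup never matches.
preg_match_all('!<code>([^<]*)</code>!', $string, $strict);

// Lazy variant: .*? consumes as little as possible up to the next </code>;
// the "s" modifier lets "." match newlines too.
preg_match_all('!<code>(.*?)</code>!s', $string, $lazy);

print_r($strict[1]); // only the block without inner markup
print_r($lazy[1]);   // both blocks, inner markup included
```

The lazy pattern copes with inner markup, but it would still trip over `<code>` tags that carry attributes or are nested inside one another, which is where the fuss begins.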
Hence I'd recommend using phpQuery or QueryPath for such occasions. It's also pretty simple:
print qp($html)->find("code")->text();
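If installing QueryPath is not an option, a similar result can probably be had with PHP's bundled DOM extension. This is a swapped-in technique, not the QueryPath call above; the sketch assumes the `dom` extension is enabled and uses a made-up `$html` sample.

```php
<?php
// Hypothetical sample input with markup nested inside <code>.
$html = '<p>a <code>plain</code> b <code>with <b>bold</b> text</code></p>';

$doc = new DOMDocument();
// Collect libxml parse errors internally instead of emitting PHP warnings,
// since real-world HTML is often malformed.
libxml_use_internal_errors(true);
$doc->loadHTML($html);
libxml_clear_errors();

$texts = [];
foreach ($doc->getElementsByTagName('code') as $node) {
    // textContent concatenates all descendant text and drops the inner tags.
    $texts[] = $node->textContent;
}

print_r($texts);
```

Each entry holds the plain text of one `<code>` element; if `<code>` elements were nested, the inner one would show up both inside its parent's text and as an entry of its own.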