Problem description
Does anyone who knows regular expressions better than I do know how to split up HTML code so that all tags and all words are separated? For example:
<p>Some content <a href="www.test.com">A link</a></p>
should be separated like this:
array = { [0]=>"<p>",
[1]=>"Some",
[2]=>"content",
[3]=>"<a href='www.test.com'>",
[4]=>"A",
[5]=>"Link",
[6]=>"</a>",
[7]=>"</p>" }
I've been using preg_split so far and have managed to split the string either by whitespace or by tags, but when I split by tags all the content between them ends up in one array element, and I need that to be split too.
Can anyone help me out?
preg_split shouldn't be used in that case. Try preg_match_all:
$text = '<p>Some content <a href="www.test.com">A link</a></p>';
preg_match_all('/<[^>]++>|[^<>\s]++/', $text, $tokens);
print_r($tokens);
Output:
Array
(
    [0] => Array
        (
            [0] => <p>
            [1] => Some
            [2] => content
            [3] => <a href="www.test.com">
            [4] => A
            [5] => link
            [6] => </a>
            [7] => </p>
        )

)
I assume you forgot to include the 'A' in 'A link' in your example.
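Since print_r shows the matches nested one level down (under index 0), it may be handy to wrap the call in a small function that returns the flat token list directly. The following is only a sketch: the function name tokenize_html is made up for illustration and is not part of PHP or of the answer above; the pattern is the same one used with preg_match_all.

```php
<?php
// Hypothetical helper (name chosen for illustration): returns a flat
// list of tokens, where each token is either a whole tag or one word.
function tokenize_html(string $html): array
{
    // Alternative 1: '<', one or more non-'>' characters, then '>' (a tag).
    // Alternative 2: a run of characters that are not '<', '>' or whitespace (a word).
    // '++' is a possessive quantifier: once matched, the run is never backtracked into.
    preg_match_all('/<[^>]++>|[^<>\s]++/', $html, $matches);

    // $matches[0] holds the full-pattern matches, in document order.
    return $matches[0];
}

$tokens = tokenize_html('<p>Some content <a href="www.test.com">A link</a></p>');
print_r($tokens);
```

Whitespace is never matched by either alternative, so it is simply skipped between tokens rather than returned.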
Be aware that if your HTML contains < or > characters that are not meant as the start or end of a tag, the regex will mess things up badly! (hence the warnings)
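To make that caveat concrete, here is a small sketch using the same pattern on a made-up input string (the sample text is an assumption, chosen only to show the effect). A bare < in the text is treated as the opening of a tag, so everything up to the next > ends up in a single token; if the character is written as the entity &lt; instead, the words are split as expected.

```php
<?php
// Same pattern as in the preg_match_all call above.
$pattern = '/<[^>]++>|[^<>\s]++/';

// A bare '<' in the text is taken as the start of a tag, so everything
// up to the next '>' is swallowed into one token.
preg_match_all($pattern, '<p>if a < b then</p>', $broken);
print_r($broken[0]);   // '<p>', 'if', 'a', '< b then</p>'

// With the character written as an entity, the words are split as expected.
preg_match_all($pattern, '<p>if a &lt; b then</p>', $escaped);
print_r($escaped[0]);  // '<p>', 'if', 'a', '&lt;', 'b', 'then', '</p>'
```

If the input cannot be trusted to escape such characters, a real HTML parser is likely the safer route.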