Problem description

Does anyone with more knowledge than me about regular expressions know how to split up HTML code so that all tags and all words are separated, i.e.

<p>Some content <a href="www.test.com">A link</a></p>

is separated like this:

array = { [0]=>"<p>",
          [1]=>"Some",
          [2]=>"content",
          [3]=>"<a href='www.test.com'>",
          [4]=>"A",
          [5]=>"Link",
          [6]=>"</a>",
          [7]=>"</p>" }

I've been using preg_split so far and have managed either to split the string by whitespace or to split it by tags, but then all the content ends up in one array element when I need that to be split too.

Anyone help me out?

Solution

preg_split shouldn't be used in that case. Try preg_match_all:

$text = '<p>Some content <a href="www.test.com">A link</a></p>';
preg_match_all('/<[^>]++>|[^<>\s]++/', $text, $tokens);
print_r($tokens);

output:

Array
(
    [0] => Array
        (
            [0] => <p>
            [1] => Some
            [2] => content
            [3] => <a href="www.test.com">
            [4] => A
            [5] => link
            [6] => </a>
            [7] => </p>
        )

)

I assume you forgot to include the 'A' in 'A link' in your example.
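To make the pattern easier to read, here is a minimal sketch of the same regex written with the x (extended) flag, so each alternative can carry a comment. The wrapper function name tokenize_html is hypothetical and not part of the original answer; it simply returns the full matches from preg_match_all.

```php
<?php
// Hypothetical helper (name chosen for illustration only): wraps the
// pattern from the answer and returns just the list of tokens.
function tokenize_html(string $html): array
{
    // Same pattern as above, spread out with the x flag so that
    // whitespace is ignored and "#" starts a comment.
    $pattern = '/
        <[^>]++>      # a tag: "<", one or more characters that are not ">", then ">"
      |               # or
        [^<>\s]++     # a word: one or more characters that are not "<", ">" or whitespace
    /x';

    preg_match_all($pattern, $html, $matches);

    return $matches[0]; // index 0 holds the full matches
}

print_r(tokenize_html('<p>Some content <a href="www.test.com">A link</a></p>'));
```

The possessive quantifier (++) only stops the engine from backtracking into a run it has already consumed; it does not change which tokens are matched here.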

Be aware that if your HTML contains < or > characters that are not meant as the start or end of a tag, this regex will mess things up badly (hence the warnings).
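A short sketch of that failure mode, using the same pattern on a made-up input string (the inputs are illustrative assumptions, not from the original question). It also shows that the tokens come out as expected when the literal characters are written as the entities &lt; and &gt;.

```php
<?php
$pattern = '/<[^>]++>|[^<>\s]++/';

// A literal "<" and ">" inside the text are mistaken for one long tag.
preg_match_all($pattern, '<p>1 < 2 and 3 > 2</p>', $broken);
print_r($broken[0]);
// tokens: "<p>", "1", "< 2 and 3 >", "2", "</p>"

// With the characters escaped as entities, each word is a separate token.
preg_match_all($pattern, '<p>1 &lt; 2 and 3 &gt; 2</p>', $escaped);
print_r($escaped[0]);
// tokens: "<p>", "1", "&lt;", "2", "and", "3", "&gt;", "2", "</p>"
```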

08-18 12:12