How to wrap part of a text with <span> (or any other HTML tag) without the new HTML being escaped?

Problem description

I am matching a specific string in an element's text and want to wrap the matching text in a span, so that I can select it and apply modifications later on. However, the HTML entities are being escaped. Is there a way to wrap the string with HTML tags without it being escaped?

I tried using the unescapeEntities() method, but it doesn't work in this case. wrap() didn't work either. For reference on those methods, see https://jsoup.org/apidocs/org/jsoup/parser/Parser.html

Current code:

for (Element div : doc.select("div")) {
    for (String input : listOfStrings) {
        if (div.ownText().contains(input)) {
            div.text(div.ownText().replaceFirst(input, "<span class=\"select-me\">" + input + "</span>"));
        }
    }
}

Desired output

<div>some text <span class="select-me">matched string</span></div>

Actual output

<div>some text &lt;span class=&quot;select-me&quot;&gt;matched string&lt;/span&gt;</div>

Recommended answer

Based on your question and comments, it looks like you only want to modify the direct text nodes of the selected element, without modifying the text nodes of any inner elements. So in the case of

<div>a b <span>b c</span></div> 

if we want to modify b, we modify only the one placed directly in <div>, not the one in <span>.

<div>a b <span>b c</span></div> 
       ^       ^----don't modify because it is in <span>, not *directly* in <div>
       |
     modify

Text is not an element node like <div> or <span>; in the DOM it is represented as a TextNode. So if we have a structure like <div> a <span>b</span> c </div>, its DOM representation is

Element: <div>
├ Text: " a "
├ Element: <span>
│ └ Text: "b"
└ Text: " c "
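To see that tree for yourself, here is a minimal sketch that lists the direct children of the <div>. It is an assumption-laden illustration: it uses the JDK's built-in org.w3c.dom API instead of jsoup so that it runs without extra dependencies, and the class and method names (ChildNodeListing, describeChildren) are made up for this example. In jsoup you would walk Element.childNodes() and check for TextNode in the same spirit.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical helper class, not part of jsoup or of the original answer.
class ChildNodeListing {

    // Describes each direct child of the root element as a text or element node.
    static List<String> describeChildren(String xml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
            NodeList children = doc.getDocumentElement().getChildNodes();
            List<String> lines = new ArrayList<>();
            for (int i = 0; i < children.getLength(); i++) {
                Node child = children.item(i);
                if (child.getNodeType() == Node.TEXT_NODE) {
                    lines.add("Text: \"" + child.getNodeValue() + "\"");
                } else if (child.getNodeType() == Node.ELEMENT_NODE) {
                    lines.add("Element: <" + child.getNodeName() + ">");
                }
            }
            return lines;
        } catch (Exception e) {
            // Parsing problems are rethrown unchecked to keep the sketch short.
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        // Prints one line per direct child: two text nodes around one <span> element.
        for (String line : describeChildren("<div> a <span>b</span> c </div>")) {
            System.out.println(line);
        }
    }
}
```

Only the direct children are listed, so the "b" inside <span> does not show up as a text node of <div>. That is exactly the distinction the rest of this answer relies on.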

If we want to wrap a portion of some text in <span> (or any other tag), we are effectively splitting a single TextNode

├ Text: "foo bar baz"

into a series of nodes:

├ Text: "foo "
├ Element: <span>
│ └ Text: "bar"
└ Text: " baz"

To build a solution on that idea, the TextNode API gives us a very limited set of tools, but among the available methods we can use

  • splitText(index), which modifies the original TextNode, leaving the "left" side of the split in it, and returns a new TextNode that holds the remaining (right) side. For example, if TextNode node1 holds "foo bar", then after TextNode node2 = node1.splitText(3); node1 will hold "foo" while node2 will hold " bar" and will be placed as the immediate sibling after node1.
  • wrap(htmlElement) (inherited from the Node superclass), which wraps the TextNode in an element node representing htmlElement. For instance, node.wrap("<span class='myClass'>") will result in <span class='myClass'>text from node</span>.
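The split-then-wrap idea can be sketched on its own before it is turned into a reusable method. The example below is only an illustration under stated assumptions: it uses the JDK's org.w3c.dom API (whose Text.splitText(offset) also keeps the left part in the original node and returns the right part as the next sibling) rather than jsoup, it emulates wrap() by hand with replaceChild and appendChild, and the names SplitAndWrapSketch, wrapFirstMatch and describe are hypothetical.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.w3c.dom.Text;
import org.xml.sax.InputSource;

// Hypothetical helper class; it models the technique with the JDK DOM, not jsoup.
class SplitAndWrapSketch {

    // Wraps the first match of target in the root's first text node in a <span>
    // and returns a bracketed listing of the root's direct children.
    static String wrapFirstMatch(String xml, String target) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
            Element root = doc.getDocumentElement();
            Node first = root.getFirstChild();
            if (first instanceof Text && !target.isEmpty()) {
                Text left = (Text) first;
                int start = left.getData().indexOf(target);
                if (start >= 0) {
                    // First split: left keeps the text before the match,
                    // middle starts with the match.
                    Text middle = left.splitText(start);
                    // Second split: middle keeps only the match,
                    // the remainder becomes a new sibling after it.
                    middle.splitText(target.length());
                    // Manual "wrap": put a <span> where the match was,
                    // then move the match inside it.
                    Element span = doc.createElement("span");
                    root.replaceChild(span, middle);
                    span.appendChild(middle);
                }
            }
            return describe(root);
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    // Renders each direct child in brackets so the node boundaries are visible.
    static String describe(Element root) {
        StringBuilder sb = new StringBuilder();
        NodeList children = root.getChildNodes();
        for (int i = 0; i < children.getLength(); i++) {
            Node child = children.item(i);
            if (child instanceof Element) {
                sb.append("[<").append(child.getNodeName()).append(">")
                  .append(child.getTextContent())
                  .append("</").append(child.getNodeName()).append(">]");
            } else {
                sb.append("[").append(child.getNodeValue()).append("]");
            }
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // prints [foo ][<span>bar</span>][ baz]
        System.out.println(wrapFirstMatch("<div>foo bar baz</div>", "bar"));
    }
}
```

One text node becomes three siblings: the text before the match, the wrapping element, and the text after it. Offset handling at the very start or end of the text may differ between the two APIs, so treat this as a model of the idea rather than a drop-in replacement for the jsoup calls.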

With the above "tools" we can create a method like

// Requires: import org.jsoup.nodes.TextNode;
static void wrapTextWithElement(TextNode textNode, String strToWrap, String wrapperHTML) {
    // an empty search string would match forever, so there is nothing to wrap
    if (strToWrap.isEmpty()) {
        return;
    }

    // getWholeText() keeps the original whitespace (text() normalizes it),
    // so the index found here matches the offsets splitText() works with
    while (textNode.getWholeText().contains(strToWrap)) {
        // separate the part before strToWrap;
        // the returned node starts with the text we want
        TextNode rightNodeFromSplit = textNode.splitText(textNode.getWholeText().indexOf(strToWrap));

        // if there is more text after the searched string we need to
        // separate it and handle it in the next iteration
        if (rightNodeFromSplit.getWholeText().length() > strToWrap.length()) {
            textNode = rightNodeFromSplit.splitText(strToWrap.length());
            // after separating the remaining part, rightNodeFromSplit holds
            // only the part we were looking for, so let's wrap it
            rightNodeFromSplit.wrap(wrapperHTML);
        } else { // here we know the node holds only the text to wrap
            rightNodeFromSplit.wrap(wrapperHTML);
            return; // textNode didn't change, but everything is already handled
        }
    }
}

which we can use like this:

Document doc = Jsoup.parse("<div>b a b <span>b c</span> d b</div> ");
System.out.println("BEFORE CHANGES:");
System.out.println(doc);

Element id1 = doc.select("div").first();
for (TextNode textNode : id1.textNodes()) {
    wrapTextWithElement(textNode, "b", "<span class='x'>");
}

System.out.println();
System.out.println("AFTER CHANGES");
System.out.println(doc);

Result:

BEFORE CHANGES:
<html>
 <head></head>
 <body>
  <div>
   b a b 
   <span>b c</span> d b
  </div> 
 </body>
</html>

AFTER CHANGES
<html>
 <head></head>
 <body>
  <div>
   <span class="x">b</span> a 
   <span class="x">b</span> 
   <span>b c</span> d 
   <span class="x">b</span>
  </div> 
 </body>
</html>

