I have a question: how do I remove the tags from the elements I have selected?

The site is www.yellowbook.com

My code is:

for (int i = 1; i < 21; i++) {
    String shopNameTemp = "";
    String shopAddressTempA = "";
    String shopAddressTempB = "";
    String shopAddressTempC = "";
    String shopAddressTempD = "";
    String shopTelTemp = "";
    String divName = "divInAreaSummary_" + String.valueOf(i);

    // Close the attribute selector with "]" so the query is well-formed.
    Elements node = doc.select("li[id=" + divName + "]");

    shopNameTemp = node.first().select("a[class=fn]").toString();
    shopAddressTempA = node.first().select("span[class=street-address]").toString();
    shopAddressTempB = node.first().select("span[class=locality]").toString();
    shopAddressTempC = node.first().select("span[class=region]").toString();
    shopAddressTempD = node.first().select("span[class=postal-code]").toString();
    shopTelTemp = node.first().select("div[class=call phone-number]").toString();
    System.out.println("Name  " + shopNameTemp);
    System.out.println("Address" + shopAddressTempA + shopAddressTempB + shopAddressTempC + shopAddressTempD);
    System.out.println("Tel   " + shopTelTemp);

}


My output is:

Please input your category and location and Province...

auto repair,Seattle,WA


Name <a class="fn" data-classid="690" href="/profile/76-station-mlk_1861635669.html" onclick="OmAdViewLeadClick('adsource: companyname', false, '8330', ';7;;;;evar33=inArea|evar34=16', 'auto repairing');" title="View more information about 76 Station MLK">76 Station MLK</a>

Address   <span itemprop="streetAddress" class="street-address">15 Avenue Nw</span><span itemprop="addressLocality" class="locality">Seattle</span><span itemprop="addressRegion" class="region">WA</span><span itemprop="postalCode" class="postal-code">98102-9810</span>
Tel   <div class="call phone-number">
(206) 826-3263
</div>


How can I get this instead:


Name 76 Station MLK

Address 15 Avenue Nw Seattle WA 98102-9810

Tel (206) 826-3263


P.S. I tried using remove, but that removes the content while the tags are still there.

Best answer

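Instead of toString(), use the text() method, which extracts only the text and leaves out the tags. It is available on both Element and Elements, so it can be called directly on the result of select().

For example: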

shopNameTemp = node.first().select("a[class=fn]").text();
shopAddressTempA = node.first().select("span[class=street-address]").text();
shopAddressTempB = node.first().select("span[class=locality]").text();
shopAddressTempC = node.first().select("span[class=region]").text();
shopAddressTempD = node.first().select("span[class=postal-code]").text();
shopTelTemp = node.first().select("div[class=call phone-number]").text();


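When you print these to the console, you should get the plain text. Note that you may have to add spaces manually between shopAddressTempA, shopAddressTempB, and so on (for example with + " " +); otherwise the parts are printed run together with no spaces between them.

I tested this, and the output was: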

Name  76 Station MLK
Address 2801 Martin Luther King Jr Way S Seattle WA 98144-6003
Tel   (206) 722-4995
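
The note about adding spaces between the address parts can also be handled in one place rather than with repeated + " " + concatenations. Below is a minimal sketch, assuming the address strings have already been extracted with text(). The class name AddressFormatter and the method joinParts are hypothetical names chosen for illustration and are not part of jsoup. It uses only the standard library and skips blank parts, so a missing field does not leave a double space.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper (not part of jsoup): joins already-extracted text parts.
class AddressFormatter {

    // Returns the non-blank parts, trimmed, with a single space between them.
    static String joinParts(String... parts) {
        List<String> kept = new ArrayList<>();
        for (String part : parts) {
            if (part != null && !part.trim().isEmpty()) {
                kept.add(part.trim());
            }
        }
        return String.join(" ", kept);
    }

    public static void main(String[] args) {
        // Sample values copied from the question's output.
        String address = joinParts("15 Avenue Nw", "Seattle", "WA", "98102-9810");
        System.out.println("Address " + address);
    }
}
```

In the loop from the question, the call would look like joinParts(shopAddressTempA, shopAddressTempB, shopAddressTempC, shopAddressTempD).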

Regarding java - how to strip the target tags in jsoup when running a crawler, a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/20836443/
