I have a question: how do I strip the tags from the elements I select?
The site is www.yellowbook.com
My code is:
for (int i = 1; i < 21; i++) {
    String shopNameTemp = "";
    String shopAddressTempA = "";
    String shopAddressTempB = "";
    String shopAddressTempC = "";
    String shopAddressTempD = "";
    String shopTelTemp = "";
    String divName = "divInAreaSummary_" + String.valueOf(i);
    // Select the i-th listing, e.g. li[id=divInAreaSummary_1]
    Elements node = doc.select("li[id=" + divName + "]");
    shopNameTemp = node.first().select("a[class=fn]").toString();
    shopAddressTempA = node.first().select("span[class=street-address]").toString();
    shopAddressTempB = node.first().select("span[class=locality]").toString();
    shopAddressTempC = node.first().select("span[class=region]").toString();
    shopAddressTempD = node.first().select("span[class=postal-code]").toString();
    shopTelTemp = node.first().select("div[class=call phone-number]").toString();
    System.out.println("Name " + shopNameTemp);
    System.out.println("Address" + shopAddressTempA + shopAddressTempB + shopAddressTempC + shopAddressTempD);
    System.out.println("Tel " + shopTelTemp);
}
My output is:
Please input your category and location and Province...
auto repair,Seattle,WA
Name <a class="fn" data-classid="690" href="/profile/76-station-mlk_1861635669.html" onclick="OmAdViewLeadClick('adsource: companyname', false, '8330', ';7;;;;evar33=inArea|evar34=16', 'auto repairing');" title="View more information about 76 Station MLK">76 Station MLK</a>
Address <span itemprop="streetAddress" class="street-address">15 Avenue Nw</span><span itemprop="addressLocality" class="locality">Seattle</span><span itemprop="addressRegion" class="region">WA</span><span itemprop="postalCode" class="postal-code">98102-9810</span>
Tel <div class="call phone-number">
(206) 826-3263
</div>
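How can I get this instead:
Name 76 Station MLK
Address 15 Avenue Nw Seattle WA 98102-9810
Tel (206) 826-3263
P.S. I tried using remove, but that removes the content while the tags are still there.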
Best answer
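Instead of toString(), use the text() method, which extracts only the text and leaves out the tags. It is available on a single Element and also on the Elements collection that select() returns.
For example: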
shopNameTemp = node.first().select("a[class=fn]").text();
shopAddressTempA = node.first().select("span[class=street-address]").text();
shopAddressTempB = node.first().select("span[class=locality]").text();
shopAddressTempC = node.first().select("span[class=region]").text();
shopAddressTempD = node.first().select("span[class=postal-code]").text();
shopTelTemp = node.first().select("div[class=call phone-number]").text();
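When you print these to the console, you should get the correct text. Note that you may have to add spaces manually (for example + " " +) between shopAddressTempA, shopAddressTempB, and so on; otherwise the parts are printed with nothing between them. I tested this and the output was: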
Name 76 Station MLK
Address 2801 Martin Luther King Jr Way S Seattle WA 98144-6003
Tel (206) 722-4995
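
The spacing note above can be sketched in plain Java. This is only a minimal sketch, assuming the four address parts have already been extracted as plain strings with text(); the class name AddressJoiner and the helper joinNonEmpty are hypothetical names that are not part of jsoup or of the original code, and the sample values are copied from the output above.

```java
import java.util.ArrayList;
import java.util.List;

public class AddressJoiner {
    // Hypothetical helper (not part of jsoup): joins the parts with a single
    // space, skipping any part that is null or blank so that a missing field
    // does not leave a double space in the result.
    public static String joinNonEmpty(String... parts) {
        List<String> kept = new ArrayList<>();
        for (String part : parts) {
            if (part != null && !part.trim().isEmpty()) {
                kept.add(part.trim());
            }
        }
        return String.join(" ", kept);
    }

    public static void main(String[] args) {
        // Stand-ins for shopAddressTempA..D after calling text() on each selection.
        String address = joinNonEmpty(
                "2801 Martin Luther King Jr Way S", "Seattle", "WA", "98144-6003");
        // prints "Address 2801 Martin Luther King Jr Way S Seattle WA 98144-6003"
        System.out.println("Address " + address);
    }
}
```

Skipping blank parts is a design choice rather than a requirement: a listing with no postal code, say, would otherwise end with a trailing space.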
Source: a similar question on Stack Overflow about java, how to solve the problem of removing the target tags in jsoup when implementing a crawler: https://stackoverflow.com/questions/20836443/