Parsing values from a complex table using JSoup
Problem description
I have a table with the following HTML:
<TABLE class=data-table cellSpacing=0 cellPadding=0>
<TBODY>
<TR>
<TD colSpan=4><A id=accounting name=accounting></A>
<H3>Accounting</H3></TD></TR>
<TR>
<TH class=data-tablehd align=left>FORM NO.</TH>
<TH class=data-tablehd align=left>TITLE</TH>
<TH class=data-tablehd align=right>Microsoft</TH>
<TH class=data-tablehd align=right>Acrobat</TH></TR>
<TR>
<TD><A id=1008ft name=1008ft>SF 1008-FT</A></TD>
<TD>Work for Others Funding Transfer Between Projects for an Agreement</TD>
<TD align=right><A
href="https://someurl1"
target=top>MS Word</A></TD>
<TD align=right><A
href="https://someurl2"
target=top>PDF </A></TD></TR>
...
I need to parse the data out of each <TR> row.
The resulting data should look like:
SF 1008-FT, Work for Others ... an Agreement, https://someurl1, https://someurl2
I tried the following code:
URL formURL = new URL("http://urlToParse");
Document doc = Jsoup.parse(formURL, 3000);
Element table = doc.select("TABLE[class = data-table]").first();
Iterator<Element> ite = table.select("td[colSpan=4]").iterator();
while(ite.next() != null) {
System.out.println(ite.next().text());
}
However, this only returns the "back to Top" text and some different headings located throughout the table.
Can someone help me write the correct JSoup code to parse the information I need?
Recommended answer
I found the solution by making a few small modifications to code from a similar thread. The code that provides the solution is given below:
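for (Element table : doc.select("table")) {
    for (Element row : table.select("tr")) {
        Elements tds = row.select("td");
        // Heading rows contain <th> cells or a single colspan cell;
        // skip them so that tds.get(...) cannot go out of bounds.
        if (tds.size() < 4) {
            continue;
        }
        String formNumber = tds.get(0).text();
        String title = tds.get(1).text();
        // attr() reads the href of the first matching link, or "" if there is none.
        String link1 = tds.get(2).select("a[href]").attr("href");
        String link2 = tds.get(3).select("a[href]").attr("href");
        System.out.println(formNumber + ", " + title + ", " + link1 + ", " + link2);
    }
}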
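For anyone who wants to try the approach without fetching the page, here is a minimal, self-contained sketch that parses an abbreviated copy of the table from the question out of a string. The class name FormTableParser, the method parseRows, and the SAMPLE_HTML constant are illustrative names, not part of the original answer. The closing </TBODY> and </TABLE> tags were added so the fragment stands alone, and the sketch assumes every data row has exactly four <td> cells. The selector td[colSpan=4] used in the question matches only the section-heading cells, which is consistent with the output described there, so this sketch selects the rows instead.

```java
import java.util.ArrayList;
import java.util.List;

import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;
import org.jsoup.select.Elements;

class FormTableParser {

    // Abbreviated copy of the table from the question.
    // Assumption: closing tags added so the fragment is complete.
    static final String SAMPLE_HTML =
          "<table class=data-table cellspacing=0 cellpadding=0><tbody>"
        + "<tr><td colspan=4><a id=accounting name=accounting></a><h3>Accounting</h3></td></tr>"
        + "<tr><th>FORM NO.</th><th>TITLE</th><th>Microsoft</th><th>Acrobat</th></tr>"
        + "<tr><td><a id=1008ft name=1008ft>SF 1008-FT</a></td>"
        + "<td>Work for Others Funding Transfer Between Projects for an Agreement</td>"
        + "<td align=right><a href=\"https://someurl1\" target=top>MS Word</a></td>"
        + "<td align=right><a href=\"https://someurl2\" target=top>PDF </a></td></tr>"
        + "</tbody></table>";

    // Returns one comma-separated line per data row of the table.
    static List<String> parseRows(String html) {
        Document doc = Jsoup.parse(html);
        List<String> result = new ArrayList<>();
        for (Element row : doc.select("table.data-table tr")) {
            Elements tds = row.select("td");
            // Section headings have one <td>, column headers have none.
            if (tds.size() != 4) {
                continue;
            }
            String formNumber = tds.get(0).text();
            String title = tds.get(1).text();
            String wordLink = tds.get(2).select("a[href]").attr("href");
            String pdfLink = tds.get(3).select("a[href]").attr("href");
            result.add(String.join(", ", formNumber, title, wordLink, pdfLink));
        }
        return result;
    }

    public static void main(String[] args) {
        for (String line : parseRows(SAMPLE_HTML)) {
            System.out.println(line);
        }
    }
}
```

With the jsoup library on the classpath, running main on the sample should print the single data row in the format requested in the question. A side note on the loop in the question: it calls ite.next() twice per pass, so it consumes two elements each time and eventually throws NoSuchElementException; a for-each loop or a hasNext() check avoids that.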