I have HTML like this:
<table cellspacing='0' border='0' width='100%'>
<col align='left' />
<tr>
<td align='left'><font color='#FF0000'>Programming</font></td>
</tr>
</table>
<table cellspacing='0' border='0' width='100%'>
<col align='left' />
<col align='right' />
<tr>
<td align='left'><font color='#000000'>A1000</font></td>
<td align='right'><font color='#008000'>D.Rogers</font></td>
</tr>
</table>
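It is stored locally. I am trying to figure out how to scrape "Programming", "A1000", and "D.Rogers" from it. How can I do this with Java and Jsoup?
Best answer
Based on the example in the post: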
String localHtml=" <table cellspacing=\'0\' border=\'0\' width=\'100%\'>\n"+
" <col align=\'left\' />\n"+
" <tr>\n"+
" <td align=\'left\'><font color=\'#FF0000\'>Programming</font></td>\n"+
" </tr>\n"+
" </table>\n"+
" <table cellspacing=\'0\' border=\'0\' width=\'100%\'>\n"+
" <col align=\'left\' />\n"+
" <col align=\'right\' />\n"+
" <tr>\n"+
" <td align=\'left\'><font color=\'#000000\'>A1000</font></td>\n"+
" <td align=\'right\'><font color=\'#008000\'>D.Rogers</font></td>\n"+
" </tr>\n"+
" </table>";
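// Requires: import org.jsoup.Jsoup; import org.jsoup.nodes.Document;
Document doc = Jsoup.parse(localHtml);
// Each attribute selector matches the <font> element with the given color;
// text() returns the combined text of the matched elements.
System.out.println(doc.select("font[color=#FF0000]").text());
System.out.println(doc.select("font[color=#000000]").text());
System.out.println(doc.select("font[color=#008000]").text());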
Output:
Programming
A1000
D.Rogers
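Since the question says the HTML is stored locally, the same idea can be applied to a file instead of a hard-coded string. The following is only a minimal sketch: the file name `page.html`, the class name `LocalHtmlScraper`, and the helper `fontTexts` are hypothetical names chosen for illustration, and UTF-8 is an assumed encoding. It uses `Jsoup.parse(File, String)` to read the file and selects the `font` elements by their position inside table cells rather than by color.

```java
import java.io.File;
import java.io.IOException;
import java.util.List;

import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;

public class LocalHtmlScraper {

    // Hypothetical helper: returns the text of every <font> element that is
    // a direct child of a <td>, in document order.
    static List<String> fontTexts(Document doc) {
        return doc.select("td > font").eachText();
    }

    public static void main(String[] args) throws IOException {
        // Assumption: the file path is passed as the first argument,
        // falling back to a hypothetical "page.html" in the working directory.
        File input = new File(args.length > 0 ? args[0] : "page.html");

        // Assumption: the file is encoded as UTF-8.
        Document doc = Jsoup.parse(input, "UTF-8");

        for (String text : fontTexts(doc)) {
            System.out.println(text);
        }
    }
}
```

Selecting by structure (`td > font`) rather than by the `color` attribute is a design choice: it should keep working if the colors change, but it will also pick up any other `font` elements placed directly inside table cells, so the selector may need narrowing for a larger page.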
Regarding "java - scraping data from a locally stored HTML file", a similar question was found on Stack Overflow: https://stackoverflow.com/questions/36303495/