I want to access a table contained in an HTML file. Here is my code:
import java.io.*;
import com.gargoylesoftware.htmlunit.html.HtmlPage;
import com.gargoylesoftware.htmlunit.html.HtmlTable;
import com.gargoylesoftware.htmlunit.html.*;
import com.gargoylesoftware.htmlunit.WebClient;

public class test {
    public static void main(String[] args) throws Exception {
        WebClient client = new WebClient();
        HtmlPage currentPage = client.getPage("http://www.mysite.com");
        client.waitForBackgroundJavaScript(10000);
        final HtmlDivision div = (HtmlDivision) currentPage.getByXPath("//div[@id='table-matches-time']");
        String textSource = div.toString();
        //String textSource = currentPage.asXml();
        FileWriter fstream = new FileWriter("index.txt");
        BufferedWriter out = new BufferedWriter(fstream);
        out.write(textSource);
        out.close();
        client.closeAllWindows();
    }
}
The table has the following format:
<div id="table-matches-time" class="">
<table class=" table-main">
But I get this error:
Exception in thread "main" java.lang.ClassCastException: java.util.ArrayList cannot be cast to com.gargoylesoftware.htmlunit.html.HtmlDivision
at test.main(test.java:20)
How can I read this table?
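The error message itself points at the cause: the result of getByXPath is a java.util.ArrayList of matching nodes, not a single node, so casting the whole result to HtmlDivision fails at runtime. The sketch below reproduces the same kind of failure with plain JDK classes only. The class CastDemo and the method findAll are hypothetical stand-ins invented for illustration, not part of HtmlUnit, and a String plays the role of the matched element.

```java
import java.util.ArrayList;
import java.util.List;

public class CastDemo {
    // Hypothetical stand-in for an XPath query: it always returns a list
    // of matches, even when only a single node matches.
    static List<?> findAll(String match) {
        List<Object> matches = new ArrayList<>();
        matches.add(match);
        return matches;
    }

    // Casting the whole list to the element type throws ClassCastException.
    static boolean castWholeListFails() {
        Object result = findAll("div#table-matches-time");
        try {
            String element = (String) result;
            return false;
        } catch (ClassCastException e) {
            return true;
        }
    }

    // Picking one element out of the list first, then casting, works.
    static String firstMatch() {
        return (String) findAll("div#table-matches-time").get(0);
    }

    public static void main(String[] args) {
        System.out.println("cast of whole list fails: " + castWholeListFails());
        System.out.println("first match: " + firstMatch());
    }
}
```

In HtmlUnit terms, this suggests calling get(0) on the list returned by getByXPath before casting, or using getFirstByXPath, which returns a single node.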
Best answer
This works (and gives me a CSV file ;)):
import java.io.*;
import com.gargoylesoftware.htmlunit.html.HtmlPage;
import com.gargoylesoftware.htmlunit.html.HtmlTable;
import com.gargoylesoftware.htmlunit.html.HtmlTableRow;
import com.gargoylesoftware.htmlunit.html.*;
import com.gargoylesoftware.htmlunit.WebClient;

public class test {
    public static void main(String[] args) throws Exception {
        WebClient client = new WebClient();
        HtmlPage currentPage = client.getPage("http://www.mysite.com");
        // Give background JavaScript up to 10 seconds to finish building the page.
        client.waitForBackgroundJavaScript(10000);
        FileWriter fstream = new FileWriter("index.txt");
        BufferedWriter out = new BufferedWriter(fstream);
        // getByXPath returns a List of matching nodes, so take one element
        // with get(i) before casting. Only the first two matching tables are read.
        for (int i = 0; i < 2; i++) {
            final HtmlTable table = (HtmlTable) currentPage.getByXPath("//table[@class=' table-main']").get(i);
            // Write each table row as one comma-separated line.
            for (final HtmlTableRow row : table.getRows()) {
                for (final HtmlTableCell cell : row.getCells()) {
                    out.write(cell.asText() + ',');
                }
                out.write('\n');
            }
        }
        out.close();
        client.closeAllWindows();
    }
}
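One caveat about the CSV output: the loop above appends a comma after every cell, including the last one in a row, and writes cell text unquoted, so a cell that itself contains a comma would shift the columns. A minimal sketch of a row formatter that avoids both problems is shown below, using only the JDK. The class CsvRow, its methods, and the sample values are hypothetical and not part of the original answer. The quoting follows the usual CSV convention of wrapping a field in double quotes and doubling any embedded quotes.

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

public class CsvRow {
    // Quote a field only if it contains a comma, a double quote, or a line
    // break; embedded double quotes are doubled.
    static String escape(String field) {
        boolean needsQuoting = field.contains(",") || field.contains("\"")
                || field.contains("\n") || field.contains("\r");
        if (!needsQuoting) {
            return field;
        }
        return "\"" + field.replace("\"", "\"\"") + "\"";
    }

    // Join the escaped cells with commas, with no trailing separator.
    static String toCsvLine(List<String> cells) {
        return cells.stream().map(CsvRow::escape).collect(Collectors.joining(","));
    }

    public static void main(String[] args) {
        // Made-up sample row for illustration.
        System.out.println(toCsvLine(Arrays.asList("Team A, FC", "2:1", "He said \"ok\"")));
    }
}
```

With such a helper, the inner loop could collect cell.asText() for each cell into a list and write toCsvLine(cells) followed by a newline once per row.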
Regarding "java - accessing an HTML table with HtmlUnit", a similar question was found on Stack Overflow: https://stackoverflow.com/questions/10110452/