Problem description
I'm attempting to scrape a URL from an HTML table; however, every time I do so I get the HREF title data instead of the URL. Does anyone know how this can be resolved or avoided?
<table class="datagrid">
<tr>
<th>Number</th>
<th>Name</th>
<th>Sex</th>
<th>Location</th>
</tr>
<tr>
<td><a href="redirector.cfm?ID=93bd5121-7a3b-4a56-a576-f432e542047a&page=1&&lname=&fname=" title="501207593">501207593 </a></td>
<td>AARON, JUSTIN COLBY </td>
<td>M </td>
<td>Facility 1</td>
</tr>
<tr>
<td><a href="redirector.cfm?ID=c5629a92-7113-487c-ba9b-1e62203ab08d&page=1&&lname=&fname=" title="501302750">501302750 </a></td>
<td>AARONSON, CARY HOWARD </td>
<td>M </td>
<td>Facility 2</td>
</tr>
<tr>
<td><a href="redirector.cfm?ID=66d01768-5686-44eb-ac6a-16eb783f52d0&page=1&&lname=&fname=" title="501306284">501306284 </a></td>
<td>ABBOTT, LAUREA </td>
<td>F </td>
<td>Facility 3</td>
</tr>
</table>
Source:
public class MainActivity extends Activity {
    TextView tv;
    String url = "http://google.com";
    String tr;
    Document doc;

    @Override
    public void onCreate(Bundle savedInstanceState) {
        super.onCreate(savedInstanceState);
        setContentView(R.layout.activity_main);
        tv = (TextView) findViewById(R.id.TextView01);
        new MyTask().execute(url);
    }

    private class MyTask extends AsyncTask<String, Void, String> {
        ProgressDialog prog;
        String title = "";

        @Override
        protected void onPreExecute() {
            prog = new ProgressDialog(MainActivity.this);
            prog.setMessage("Loading....");
            prog.show();
        }

        @Override
        protected String doInBackground(String... params) {
            try {
                doc = Jsoup.connect(params[0]).get();
                Element tableElement = doc.select(".datagrid").first();
                Elements tableRows = tableElement.select("tr");
                for (Element row : tableRows) {
                    Elements cells = row.select("td");
                    if (cells.size() > 0) {
                        title = cells.get(0).text() + "; "
                                + cells.get(1).text() + "; "
                                + cells.get(2).text() + "; "
                                + cells.get(3).text();
                    }
                }
            } catch (IOException e) {
                e.printStackTrace();
            }
            return title;
        }

        @Override
        protected void onPostExecute(String title) {
            super.onPostExecute(title);
            prog.dismiss();
            tv.setText(title);
        }
    }
}
Current result:
501306284; ABBOTT, LAUREA; F ; Facility 3
Desired result:
redirector.cfm?ID=66d01768-5686-44eb-ac6a-16eb783f52d0&page=1&&lname=&fname=" title="501306284; ABBOTT, LAUREA; F ; Facility 3
Or better yet, the desired result:
Click HERE for more info (<-URL); ABBOTT, LAUREA; F ; Facility 3
Recommended answer
It looks like you are only calling:
cells.get(0).text()
text() returns only the visible text of the cell, which is why the href never appears in the output. I think this is what you are trying to do:
cells.get(0).child(0).attr("href")
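As a rough illustration of the suggestion above, here is a minimal, self-contained sketch that runs outside Android. The class name HrefScrapeSketch, the helper extractRows, the base URI http://example.com/search/, and the shortened sample row are all assumptions made for the example, not part of the original question. It reads the link's href with attr("href") and, where a base URI is known, resolves it to an absolute URL with absUrl("href").

```java
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;
import org.jsoup.select.Elements;

import java.util.ArrayList;
import java.util.List;

public class HrefScrapeSketch {

    // Shortened copy of the first row from the question (query string trimmed for brevity).
    static final String SAMPLE_HTML =
        "<table class=\"datagrid\">"
        + "<tr><th>Number</th><th>Name</th><th>Sex</th><th>Location</th></tr>"
        + "<tr>"
        + "<td><a href=\"redirector.cfm?ID=93bd5121-7a3b-4a56-a576-f432e542047a\""
        + " title=\"501207593\">501207593 </a></td>"
        + "<td>AARON, JUSTIN COLBY </td><td>M </td><td>Facility 1</td>"
        + "</tr></table>";

    // Hypothetical helper: returns one "url; name; sex; location" line per data row.
    static List<String> extractRows(String html, String baseUri) {
        Document doc = Jsoup.parse(html, baseUri);
        List<String> lines = new ArrayList<>();
        for (Element row : doc.select("table.datagrid tr")) {
            Elements cells = row.select("td");
            if (cells.size() < 4) {
                continue; // the header row contains <th> cells, so it has no <td>
            }
            Element link = cells.get(0).select("a[href]").first();
            if (link == null) {
                continue; // no link in the first cell
            }
            // absUrl resolves the relative href against baseUri; it returns ""
            // when that is not possible, so fall back to the raw attribute value.
            String url = link.absUrl("href");
            if (url.isEmpty()) {
                url = link.attr("href");
            }
            lines.add(url + "; " + cells.get(1).text() + "; "
                    + cells.get(2).text() + "; " + cells.get(3).text());
        }
        return lines;
    }

    public static void main(String[] args) {
        // The base URI is an assumed placeholder; use the address of the scraped page.
        for (String line : extractRows(SAMPLE_HTML, "http://example.com/search/")) {
            System.out.println(line);
        }
    }
}
```

Selecting "a[href]" rather than calling child(0) is a small defensive choice: it still works if the cell gains other child elements, and it allows a null check before reading the attribute. When the page is fetched with Jsoup.connect(...).get(), the document's base URI is set to the fetched address, so absUrl("href") should work there without passing a base URI explicitly.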
See this.