Problem description
I am trying to use JSoup to get the contents of the URL http://www.aw20.co.uk/images/logo.png, which is the image logo.png, and save it to a file. So far I have used JSoup to connect to http://www.aw20.co.uk and get a Document. I then found the absolute URL of the image I am looking for, but I am not sure how to use it to get the actual image. Could someone point me in the right direction? Also, is there any way I could use Jsoup.connect("http://www.aw20.co.uk/images/logo.png").get(); to get the image?
import java.io.IOException;

import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;
import org.jsoup.select.Elements;

public class JGet2 {

    public static void main(String[] args) {
        try {
            Document doc = Jsoup.connect("http://www.aw20.co.uk").get();
            Elements img = doc.getElementsByTag("img");
            for (Element element : img) {
                // absUrl resolves a relative src against the page's base URL.
                String src = element.absUrl("src");
                System.out.println("Image Found!");
                System.out.println("src attribute is: " + src);
                if (src.contains("logo.png")) {
                    System.out.println("Success");
                }
                getImages(src);
            }
        } catch (IOException e) {
            e.printStackTrace();
        }
    }

    // Prints the file name portion of the image URL; nothing is downloaded yet.
    private static void getImages(String src) throws IOException {
        // Strip a trailing slash so the last path segment is the file name.
        if (src.endsWith("/")) {
            src = src.substring(0, src.length() - 1);
        }
        int indexName = src.lastIndexOf("/");
        String name = src.substring(indexName + 1);
        System.out.println(name);
    }
}
Recommended answer
If you don't want to parse the response as HTML, you can use Jsoup to fetch any URL and get the data as bytes. For example:
byte[] bytes = Jsoup.connect(imgUrl).ignoreContentType(true).execute().bodyAsBytes();
ignoreContentType(true) is set because otherwise Jsoup will throw an exception saying that the content is not parseable as HTML. Skipping that check is fine in this case because we use bodyAsBytes() to get the raw response body rather than parsing it.
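Once the bytes are in hand, writing them to a file needs only the JDK. The following is a minimal sketch, not part of the original answer: the class name ImageSaver and the helpers fileNameFromUrl and saveImage are hypothetical names chosen for illustration, and the byte array is assumed to come from the bodyAsBytes() call shown above.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;

class ImageSaver {

    // Hypothetical helper: returns the last path segment of a URL,
    // e.g. "logo.png" for "http://www.aw20.co.uk/images/logo.png".
    public static String fileNameFromUrl(String url) {
        // Drop any query string or fragment before looking at the path.
        String path = url.split("[?#]", 2)[0];
        // Ignore trailing slashes so the last segment is never empty.
        while (path.endsWith("/")) {
            path = path.substring(0, path.length() - 1);
        }
        return path.substring(path.lastIndexOf('/') + 1);
    }

    // Hypothetical helper: writes the bytes into dir under the file name
    // derived from imgUrl and returns the path of the written file.
    // The bytes are assumed to come from something like:
    //   Jsoup.connect(imgUrl).ignoreContentType(true).execute().bodyAsBytes();
    public static Path saveImage(byte[] bytes, String imgUrl, Path dir) throws IOException {
        Files.createDirectories(dir);
        Path target = dir.resolve(fileNameFromUrl(imgUrl));
        // Files.write creates the file, or overwrites it if it already exists.
        return Files.write(target, bytes);
    }
}
```

In the question's loop, this could be called as saveImage(bytes, src, Paths.get("images")) for each src of interest, where the "images" directory is likewise just an example.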
Check the Jsoup Connection API for more details.