How to add proxy support to Jsoup (HTML parser)?

Question

I am new to Java, and my first task is to parse some 10,000 URLs and extract some information from them. I am using Jsoup for this and it is working fine. Now I want to add proxy support to it. The proxies also have a username and password. Can anyone help me with this? Thanks.

Recommended answer

You don't have to fetch the webpage data through Jsoup itself. Here's my solution, though it may not be the best:

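  import java.io.BufferedReader;
  import java.io.InputStreamReader;
  import java.net.HttpURLConnection;
  import java.net.InetSocketAddress;
  import java.net.Proxy;
  import java.net.URL;
  import java.nio.charset.StandardCharsets;
  import org.jsoup.Jsoup;
  import org.jsoup.nodes.Document;

  URL url = new URL("http://www.example.com/");
  Proxy proxy = new Proxy(Proxy.Type.HTTP, new InetSocketAddress("127.0.0.1", 8080)); // or whatever your proxy is
  HttpURLConnection uc = (HttpURLConnection) url.openConnection(proxy);
  uc.connect();

  StringBuilder tmp = new StringBuilder();
  // try-with-resources closes the reader; UTF-8 is an assumption about the page encoding
  try (BufferedReader in = new BufferedReader(
          new InputStreamReader(uc.getInputStream(), StandardCharsets.UTF_8))) {
    String line;
    while ((line = in.readLine()) != null) {
      tmp.append(line).append('\n'); // readLine() strips the line terminator, so put it back
    }
  }

  // Pass the base URL so relative links in the page can be resolved
  Document doc = Jsoup.parse(tmp.toString(), url.toString());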

And there it is. This fetches the source of the HTML page through the proxy and then parses it with Jsoup.
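
The question also mentions that the proxies need a username and password, which the snippet above does not cover. Here is a minimal sketch of one way to supply them, using the standard `java.net.Authenticator`. The class name `ProxyAuthSketch`, the two helper methods, and the `proxyUser` / `proxyPass` credentials are placeholders made up for illustration, not part of the original answer.

```java
import java.net.Authenticator;
import java.net.InetSocketAddress;
import java.net.PasswordAuthentication;
import java.net.Proxy;

public class ProxyAuthSketch {

    /** Builds an HTTP proxy descriptor; host and port are placeholders. */
    public static Proxy httpProxy(String host, int port) {
        return new Proxy(Proxy.Type.HTTP, new InetSocketAddress(host, port));
    }

    /** Registers a JVM-wide Authenticator that answers only proxy challenges. */
    public static void installProxyCredentials(String user, String password) {
        Authenticator.setDefault(new Authenticator() {
            @Override
            protected PasswordAuthentication getPasswordAuthentication() {
                if (getRequestorType() == Authenticator.RequestorType.PROXY) {
                    return new PasswordAuthentication(user, password.toCharArray());
                }
                return null; // do not hand the proxy credentials to origin servers
            }
        });
    }

    public static void main(String[] args) {
        // Hypothetical credentials and proxy address; replace with your own.
        installProxyCredentials("proxyUser", "proxyPass");
        Proxy proxy = httpProxy("127.0.0.1", 8080);
        System.out.println("Using proxy: " + proxy.type() + " " + proxy.address());
        // From here, url.openConnection(proxy) can be used as in the answer above.
    }
}
```

Call `installProxyCredentials` once at startup, then pass the `Proxy` to `url.openConnection(proxy)` for each URL. The authenticator applies to the whole JVM, so this sketch assumes all proxies share one set of credentials. Two caveats worth checking against your setup: recent JDKs disable Basic authentication for HTTPS tunneling through a proxy by default (controlled by the `jdk.http.auth.tunneling.disabledSchemes` system property), and newer jsoup releases (1.9.1 onward, as far as I know) also accept a proxy directly via `Jsoup.connect(url).proxy(host, port)`.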
