Error 403 when fetching content from a URL

Problem description

I am trying to automate a process. For that, I need to fetch XML by hitting a URL multiple times in one run and then parse it. In a single run of the program, the URL may be hit anywhere between 4 and 25 times. This all works fine until a 403 error response is returned.

Interestingly, the 403 always comes up on every 5th or 6th time the URL is hit.

I am using JDOM to parse the XML response.

I have tried the following code:

Document doc = builder.build(new InputSource(url.openStream()));

HttpURLConnection conn = (HttpURLConnection)url.openConnection();
conn.setRequestProperty("User-Agent", "Mozilla/5.0 (Windows; U; Windows NT 6.1; en-GB;     rv:1.9.2.13) Gecko/20101203 Firefox/3.6.13 (.NET CLR 3.5.30729)");
Document doc = builder.build(conn.getInputStream());

With the second one, I get this exception:

org.jdom.input.JDOMParseException: Error on line 1: White spaces are required between publicId and systemId.

Could someone please help me get rid of the 403? Please note that I do not have any control over the source if a change is required to be made, as talked about here.

Also, I am not sure if this link is helpful.

Thanks.



[Update 1]:
This somehow works, without having to sleep:

try {
    doc = builder.build(conn.getInputStream());
} catch (IOException ioEx) {
    doc = builder.build(new InputSource(url.openStream()));
}


Recommended answer

A 403 means that the request was understood but the server refuses to process it. Look at the headers you send. And when it fails, run a TRACE HTTP method to retrieve the exact request you are performing.
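To look at the headers, one option is to print what was sent and what came back whenever the status is 403. The following is only a minimal sketch: the class name HeaderDump and its methods are hypothetical, and getRequestProperties() reports only the headers set explicitly in code, not the defaults that HttpURLConnection adds on its own (such as Host).

```java
import java.io.IOException;
import java.net.HttpURLConnection;
import java.net.URL;
import java.util.List;
import java.util.Map;

// Hypothetical helper, not part of JDOM or the JDK.
final class HeaderDump {

    // Renders a header map as "Name: value" lines.
    // HttpURLConnection stores the status line under a null key.
    static String format(Map<String, List<String>> headers) {
        StringBuilder sb = new StringBuilder();
        for (Map.Entry<String, List<String>> e : headers.entrySet()) {
            String name = (e.getKey() == null) ? "(status)" : e.getKey();
            sb.append(name).append(": ")
              .append(String.join(", ", e.getValue())).append('\n');
        }
        return sb.toString();
    }

    // Sends a GET and prints request and response headers if the server answers 403.
    static int probe(URL url, String userAgent) throws IOException {
        HttpURLConnection conn = (HttpURLConnection) url.openConnection();
        conn.setRequestProperty("User-Agent", userAgent);
        // Request properties can only be read before the connection is opened.
        String sent = format(conn.getRequestProperties());
        int status = conn.getResponseCode();
        if (status == HttpURLConnection.HTTP_FORBIDDEN) {
            System.err.println("Request headers (explicitly set):\n" + sent);
            System.err.println("Response headers:\n" + format(conn.getHeaderFields()));
        }
        conn.disconnect();
        return status;
    }
}
```

Comparing the output of a successful call with that of a failing one may show whether the server reacts to a particular header or simply to the number of requests.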

When you establish an HTTP connection, you send along with the request the method you want to perform.

One of these methods is TRACE.

By performing a TRACE method, you can see in the response body the request you just performed. So you can see if it is still valid.
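A TRACE request can be sent with the same HttpURLConnection class used in the question. This is a sketch under assumptions: TraceProbe and its methods are hypothetical names, and many servers disable TRACE and answer with 405 or 501, in which case the body will not contain the echoed request.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.net.HttpURLConnection;
import java.net.URL;
import java.nio.charset.StandardCharsets;

// Hypothetical helper, not part of JDOM or the JDK.
final class TraceProbe {

    // Reads a stream fully as UTF-8 text; a null stream yields an empty string.
    static String readAll(InputStream in) throws IOException {
        if (in == null) {
            return "";
        }
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] buf = new byte[4096];
        int n;
        while ((n = in.read(buf)) != -1) {
            out.write(buf, 0, n);
        }
        return new String(out.toByteArray(), StandardCharsets.UTF_8);
    }

    // Sends TRACE and returns the status code followed by the response body.
    static String trace(URL url) throws IOException {
        HttpURLConnection conn = (HttpURLConnection) url.openConnection();
        conn.setRequestMethod("TRACE");
        int status = conn.getResponseCode();
        // For error statuses the body is on the error stream, which may be null.
        try (InputStream in = (status >= 400) ? conn.getErrorStream()
                                              : conn.getInputStream()) {
            return status + "\n" + readAll(in);
        } finally {
            conn.disconnect();
        }
    }
}
```

Calling TraceProbe.trace(url) right after a failed fetch shows the request as the server received it, provided the server supports the method.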

Maybe you are exceeding the maximum number of requests, if they have any such limiting mechanism.
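If a request limit is indeed the cause, pausing and retrying after a 403 may get past it. The sketch below assumes exactly that. The class RetryOn403, the 500 ms base delay and the 8 s cap are illustrative choices and do not come from the server in question.

```java
import java.io.IOException;
import java.io.InputStream;
import java.net.HttpURLConnection;
import java.net.URL;

// Hypothetical helper, not part of JDOM or the JDK.
final class RetryOn403 {

    // Delay before retry number `attempt` (0-based): base * 2^attempt, capped.
    static long backoffDelayMillis(int attempt, long baseMillis, long capMillis) {
        long delay = baseMillis;
        for (int i = 0; i < attempt && delay < capMillis; i++) {
            delay *= 2;
        }
        return Math.min(delay, capMillis);
    }

    // Opens the URL, sleeping and retrying while the server answers 403.
    static InputStream openWithRetry(URL url, int maxAttempts)
            throws IOException, InterruptedException {
        for (int attempt = 0; attempt < maxAttempts; attempt++) {
            HttpURLConnection conn = (HttpURLConnection) url.openConnection();
            int status = conn.getResponseCode();
            if (status != HttpURLConnection.HTTP_FORBIDDEN) {
                // Other error statuses still surface as an IOException here.
                return conn.getInputStream();
            }
            conn.disconnect();
            if (attempt < maxAttempts - 1) {
                // Assumed values: 500 ms base delay, 8 s cap.
                Thread.sleep(backoffDelayMillis(attempt, 500, 8000));
            }
        }
        throw new IOException("Still receiving 403 after " + maxAttempts
                + " attempts: " + url);
    }
}
```

With JDOM, the returned stream can be passed straight to the builder, for example doc = builder.build(RetryOn403.openWithRetry(url, 5)).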

