Problem description
I have the following code:
WebClient webClient = new WebClient();
HtmlPage page = webClient.getPage("http://www.myland.co.il/%D7%9E%D7%97%D7%A9%D7%91-%D7%94%D7%A9%D7%A7%D7%99%D7%94");
The code fails with com.gargoylesoftware.htmlunit.FailingHttpStatusCodeException: 404 Not Found for http://www.myland.co.il/Scripts/swfobject_modified.js
The console output does show the HTML page I am interested in. Is there a way to suppress the exception and still get the HtmlPage? The page loads correctly in a real browser.
Recommended answer
Yes, you can use setThrowExceptionOnFailingStatusCode to ignore failing status codes, for example:
WebClient webClient = new WebClient();
webClient.getOptions().setThrowExceptionOnFailingStatusCode(false);
HtmlPage page = webClient.getPage("http://www.myland.co.il/%D7%9E%D7%97%D7%A9%D7%91-%D7%94%D7%A9%D7%A7%D7%99%D7%94");
The default is true, which produces the exception you're describing.
If you are running a version of HtmlUnit earlier than 2.11, setThrowExceptionOnFailingStatusCode is called on the WebClient itself instead of on the options returned by getOptions(). In 2.11 or later, you should use getOptions() as above.
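If the same code has to run against both older and newer HtmlUnit versions, one possible approach is to look the setter up reflectively. The following is only a sketch: FailingStatusCompat and disableThrowOnFailingStatus are hypothetical names, not part of HtmlUnit, and NewStyleClient, NewStyleOptions and OldStyleClient are stand-ins that merely mimic the two API shapes so the example runs without HtmlUnit on the classpath. In real code you would pass your WebClient instance instead.

```java
import java.lang.reflect.Method;

class FailingStatusCompat {

    /**
     * Hypothetical helper (not part of HtmlUnit): turns off the
     * "throw on failing status code" flag on whichever object exposes the setter.
     */
    static void disableThrowOnFailingStatus(Object webClient) {
        try {
            Object target;
            try {
                // 2.11+ shape: the setter lives on the object returned by getOptions().
                target = webClient.getClass().getMethod("getOptions").invoke(webClient);
            } catch (NoSuchMethodException e) {
                // Pre-2.11 shape: the setter lives on the client itself.
                target = webClient;
            }
            Method setter = target.getClass()
                    .getMethod("setThrowExceptionOnFailingStatusCode", boolean.class);
            setter.invoke(target, false);
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException(
                    "Could not disable exceptions on failing status codes", e);
        }
    }

    // Stand-in for the 2.11+ options object (assumption, for illustration only).
    static class NewStyleOptions {
        private boolean throwOnFailingStatus = true;
        public void setThrowExceptionOnFailingStatusCode(boolean enabled) {
            throwOnFailingStatus = enabled;
        }
        public boolean isThrowExceptionOnFailingStatusCode() {
            return throwOnFailingStatus;
        }
    }

    // Stand-in for a 2.11+ client that exposes getOptions().
    static class NewStyleClient {
        private final NewStyleOptions options = new NewStyleOptions();
        public NewStyleOptions getOptions() {
            return options;
        }
    }

    // Stand-in for a pre-2.11 client with the setter directly on the client.
    static class OldStyleClient {
        private boolean throwOnFailingStatus = true;
        public void setThrowExceptionOnFailingStatusCode(boolean enabled) {
            throwOnFailingStatus = enabled;
        }
        public boolean isThrowExceptionOnFailingStatusCode() {
            return throwOnFailingStatus;
        }
    }

    public static void main(String[] args) {
        NewStyleClient newer = new NewStyleClient();
        OldStyleClient older = new OldStyleClient();
        disableThrowOnFailingStatus(newer);
        disableThrowOnFailingStatus(older);
        System.out.println(newer.getOptions().isThrowExceptionOnFailingStatusCode());
        System.out.println(older.isThrowExceptionOnFailingStatusCode());
    }
}
```

If you know which HtmlUnit version you build against, calling the setter directly as described above is simpler and type-checked. Reflection is only worth it when the version is not fixed.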