404 Not Found when using HtmlUnit


Question

I have the following code:

WebClient webClient = new WebClient();
HtmlPage page = webClient.getPage("http://www.myland.co.il/%D7%9E%D7%97%D7%A9%D7%91-%D7%94%D7%A9%D7%A7%D7%99%D7%94");

The code fails with com.gargoylesoftware.htmlunit.FailingHttpStatusCodeException: 404 Not Found for http://www.myland.co.il/Scripts/swfobject_modified.js

The console output does show the HTML page I am interested in. Is there a way to suppress the exception and still get the HtmlPage? The page loads correctly in a real browser.

Recommended answer

Yes. You can use setThrowExceptionOnFailingStatusCode to ignore failing status codes, for example:

WebClient webClient = new WebClient();
webClient.getOptions().setThrowExceptionOnFailingStatusCode(false);
HtmlPage page = webClient.getPage("http://www.myland.co.il/%D7%9E%D7%97%D7%A9%D7%91-%D7%94%D7%A9%D7%A7%D7%99%D7%94");

The option defaults to true, which is why a 404 on any resource the page loads (here, a script) raises the exception you are describing.

If you are running a version of HtmlUnit earlier than 2.11, call setThrowExceptionOnFailingStatusCode on the WebClient itself rather than on the options object returned by getOptions(). In 2.11 or later, use getOptions() as shown above.
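
Once the exception is switched off, a failing response no longer announces itself, so it can be worth checking the status code by hand. The following is a minimal sketch rather than a definitive implementation. It assumes HtmlUnit 2.x under the com.gargoylesoftware.htmlunit package, in a release recent enough that WebClient can be used in a try-with-resources block. The class name LenientFetch and the helper fetch are hypothetical names introduced only for this illustration.

```java
import com.gargoylesoftware.htmlunit.WebClient;
import com.gargoylesoftware.htmlunit.html.HtmlPage;

import java.io.IOException;

// Hypothetical helper class, not part of HtmlUnit.
public class LenientFetch {

    // Loads a page without throwing FailingHttpStatusCodeException
    // when the page, or a resource it pulls in, answers with an error status.
    public static HtmlPage fetch(WebClient webClient, String url) throws IOException {
        webClient.getOptions().setThrowExceptionOnFailingStatusCode(false);
        return webClient.getPage(url);
    }

    public static void main(String[] args) throws IOException {
        String url = "http://www.myland.co.il/%D7%9E%D7%97%D7%A9%D7%91-%D7%94%D7%A9%D7%A7%D7%99%D7%94";

        // Assumption: WebClient implements AutoCloseable in the version in use.
        try (WebClient webClient = new WebClient()) {
            HtmlPage page = fetch(webClient, url);

            // The exception is suppressed, so inspect the main page's status explicitly.
            int status = page.getWebResponse().getStatusCode();
            System.out.println("Main page status: " + status);
            System.out.println("Title: " + page.getTitleText());
        }
    }
}
```

In the question, the 404 came from a script the page references, not from the page itself, so the status printed for the main page may well be a success code even though one of its resources is missing.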


07-31 08:40