Question
I currently run browser tests in Python using PhantomJS and Selenium:
from selenium import webdriver
from selenium.webdriver.common.desired_capabilities import DesiredCapabilities
# Start from the default PhantomJS capabilities and override the user agent.
desired_capabilities = dict(DesiredCapabilities.PHANTOMJS)
desired_capabilities["phantomjs.page.settings.userAgent"] = ("Mozilla/5.0 (Macintosh; Intel Mac OS X 10_10_2) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/40.0.2214.115 Safari/537.36")
driver = webdriver.PhantomJS(executable_path="./phantomjs", desired_capabilities=desired_capabilities)
driver.get('http://google.com')
This works fine unless the page I'm trying to get has a redirect on it.
Example:
https://login.vrealizeair.vmware.com/
In this case, get doesn't work properly. The page source is empty: <html><head></head><body></body></html>.
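A test can make this failure visible instead of silently continuing with an empty document. The following is only a minimal sketch: is_blank_page is a hypothetical helper (not part of Selenium), and it uses a rough heuristic that treats any page source containing no text outside of tags as blank.

```python
import re


def is_blank_page(page_source):
    """Return True when the HTML contains no text outside of its tags.

    Heuristic only: a page made up purely of images or scripts without
    text would also count as blank.
    """
    # Remove every tag, then see whether any non-whitespace text remains.
    text_only = re.sub(r"<[^>]*>", "", page_source)
    return text_only.strip() == ""


# Possible use after loading a page (driver is the Selenium driver):
#     driver.get(url)
#     if is_blank_page(driver.page_source):
#         raise RuntimeError("PhantomJS returned an empty page for " + url)
```

Failing early like this points the investigation at the page load itself rather than at later element lookups.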
This is a known issue, and the posted solutions involve adding a snippet of code to handle redirects appropriately.
How and where do I add this code when running tests with Selenium (as in my first code snippet)? Is it part of desired_capabilities?
Example:
page.onNavigationRequested = function(url, type, willNavigate, main) {
    if (main && url != myurl) {
        myurl = url;
        console.log("redirect caught");
        page.close();
        renderPage(url);
    }
};
page.open(url, function(status) {
    if (status === "success") {
        console.log(myurl);
        console.log("success");
        page.render('yourscreenshot.png');
        phantom.exit(0);
    } else {
        console.log("failed");
        phantom.exit(1);
    }
});
I tried it with PhantomJS 1.9.8 and 2.0.1-development.
Recommended answer
It turns out the page couldn't be crawled due to an error: SSL handshake failed.
The solution is to initialize the driver with the following line:
driver = webdriver.PhantomJS(executable_path="./phantomjs", service_args=['--ignore-ssl-errors=true'])
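The line above drops the custom user agent from the question. The two settings travel through different channels: the user agent is a capability, while the SSL flag is a PhantomJS command-line argument passed via service_args. A rough sketch of combining them might look like the following. The helper names (phantomjs_service_args, phantomjs_capabilities, make_driver) are made up for illustration, and the sketch assumes a Selenium release that still ships webdriver.PhantomJS, which was deprecated in later 3.x releases and removed in Selenium 4.

```python
def phantomjs_service_args(ignore_ssl_errors=True):
    """Build the command-line flags handed to the PhantomJS process."""
    args = []
    if ignore_ssl_errors:
        # The flag from the answer: keep loading despite SSL errors.
        args.append("--ignore-ssl-errors=true")
    return args


def phantomjs_capabilities(user_agent, base=None):
    """Copy the base capabilities and set the page user agent on the copy."""
    caps = dict(base or {})
    caps["phantomjs.page.settings.userAgent"] = user_agent
    return caps


def make_driver(user_agent, executable_path="./phantomjs"):
    """Create a PhantomJS driver with both settings (hypothetical helper)."""
    # Imported lazily so the helpers above work without Selenium installed.
    from selenium import webdriver
    from selenium.webdriver.common.desired_capabilities import DesiredCapabilities

    return webdriver.PhantomJS(
        executable_path=executable_path,
        desired_capabilities=phantomjs_capabilities(
            user_agent, DesiredCapabilities.PHANTOMJS
        ),
        service_args=phantomjs_service_args(),
    )
```

Keep in mind that ignoring SSL errors turns off certificate validation, so it is best limited to test environments.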