问题描述
我补充说,从需要使用与验证的HTTPS连接的源擦伤的XML页面功能。我试图用Ryan Bates的Railscast#190解决方案,但我遇到一个401认证错误。
I am adding functionality that scrapes an XML page from a source that requires the use of an HTTPS connection with authentication. I am trying to use Ryan Bates' Railscast #190 solution but I'm running into a 401 Authentication error.
下面是我测试的Ruby脚本:
Here is my test Ruby script:
require 'rubygems'
require 'nokogiri'
require 'open-uri'
url = "https://biblesearch.americanbible.org/passages.xml?q[]=john+3:1-5&version=KJV"
doc = Nokogiri::XML(open(url, :http_basic_authentication => ['username' ,'password']))
puts doc.xpath("//text_preview")
下面是控制台的输出后,我跑我的脚本:
Here is the output of the console after I run my script:
/usr/local/rvm/rubies/ruby-1.9.3-p194/lib/ruby/1.9.1/net/http.rb:799:in `connect': SSL_connect returned=1 errno=0 state=SSLv3 read server certificate B: certificate verify failed (OpenSSL::SSL::SSLError)
from /usr/local/rvm/rubies/ruby-1.9.3-p194/lib/ruby/1.9.1/net/http.rb:799:in `block in connect'
from /usr/local/rvm/rubies/ruby-1.9.3-p194/lib/ruby/1.9.1/timeout.rb:54:in `timeout'
from /usr/local/rvm/rubies/ruby-1.9.3-p194/lib/ruby/1.9.1/timeout.rb:99:in `timeout'
from /usr/local/rvm/rubies/ruby-1.9.3-p194/lib/ruby/1.9.1/net/http.rb:799:in `connect'
from /usr/local/rvm/rubies/ruby-1.9.3-p194/lib/ruby/1.9.1/net/http.rb:755:in `do_start'
from /usr/local/rvm/rubies/ruby-1.9.3-p194/lib/ruby/1.9.1/net/http.rb:744:in `start'
from /usr/local/rvm/rubies/ruby-1.9.3-p194/lib/ruby/1.9.1/open-uri.rb:306:in `open_http'
from /usr/local/rvm/rubies/ruby-1.9.3-p194/lib/ruby/1.9.1/open-uri.rb:775:in `buffer_open'
from /usr/local/rvm/rubies/ruby-1.9.3-p194/lib/ruby/1.9.1/open-uri.rb:203:in `block in open_loop'
from /usr/local/rvm/rubies/ruby-1.9.3-p194/lib/ruby/1.9.1/open-uri.rb:201:in `catch'
from /usr/local/rvm/rubies/ruby-1.9.3-p194/lib/ruby/1.9.1/open-uri.rb:201:in `open_loop'
from /usr/local/rvm/rubies/ruby-1.9.3-p194/lib/ruby/1.9.1/open-uri.rb:146:in `open_uri'
from /usr/local/rvm/rubies/ruby-1.9.3-p194/lib/ruby/1.9.1/open-uri.rb:677:in `open'
from /usr/local/rvm/rubies/ruby-1.9.3-p194/lib/ruby/1.9.1/open-uri.rb:33:in `open'
from scrape.rb:6:in `<main>'
在我的研究,我看到了一个帖子中,有人提出,在1.9.3可以用于下列选项:
In my research, I saw one post in which it was suggested that in 1.9.3 the following option could be used:
doc = Nokogiri::XML(open(url, :http_basic_authentication => ['username' ,'password'], :ssl_verify_mode => OpenSSL::SSL::VERIFY_NONE))
不过,这也不能工作。我想AP preciate一些见解应对这一挑战。
However, this did not work either. I would appreciate some insight into addressing this challenge.
推荐答案
给定的URL将被重定向到 /v1/KJV/passages.xml?q [] =约翰+ 3%3A1-5
,HTTP状态code 302找到
。 OpenURI理解重定向,但自动删除认证头(也许)出于安全原因。 (*)
The given URL will be redirected to /v1/KJV/passages.xml?q[]=john+3%3A1-5
with HTTP status code 302 Found
. OpenURI understands the redirection, but automatically deletes authentication header (maybe) for security reason. (*)
如果您访问http://biblesearch.americanbible.org/v1/KJV/passages.xml?q[]=john+3%3A1-5
直接,你会得到预期的结果。 : - )
If you access "http://biblesearch.americanbible.org/v1/KJV/passages.xml?q[]=john+3%3A1-5"
directly, you will get the expected result. :-)
(*)可以在开放式uri.rb找到
:
if redirect
### snip ###
if options.include? :http_basic_authentication
# send authentication only for the URI directly specified.
options = options.dup
options.delete :http_basic_authentication
end
这篇关于OpenUri造成401 HTTPS URL未经授权的错误的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持!