Referer missing from the HTTP headers of a Selenium request

Question

I'm writing some tests with Selenium and noticed that the Referer header is missing from the request headers. I wrote the following minimal example to test this with https://httpbin.org/headers:

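import selenium.webdriver
# Explicit import so that selenium.webdriver.support.ui.WebDriverWait is available
import selenium.webdriver.support.ui

options = selenium.webdriver.FirefoxOptions()
options.add_argument('--headless')

# Disable Firefox's built-in JSON viewer so page_source shows the raw response
profile = selenium.webdriver.FirefoxProfile()
profile.set_preference('devtools.jsonview.enabled', False)
options.profile = profile

# 'options=' replaces the deprecated 'firefox_options=' / 'firefox_profile=' keywords
driver = selenium.webdriver.Firefox(options=options)
wait = selenium.webdriver.support.ui.WebDriverWait(driver, 10)

driver.get('http://www.python.org')
assert 'Python' in driver.title

# Navigate from the python.org page via JavaScript, then dump the headers httpbin received
url = 'https://httpbin.org/headers'
driver.execute_script('window.location.href = "{}";'.format(url))
wait.until(lambda driver: driver.current_url == url)
print(driver.page_source)

driver.close()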

which prints:

<html><head><link rel="alternate stylesheet" type="text/css" href="resource://content-accessible/plaintext.css" title="Wrap Long Lines"></head><body><pre>{
  "headers": {
    "Accept": "text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8",
    "Accept-Encoding": "gzip, deflate, br",
    "Accept-Language": "en-US,en;q=0.5",
    "Connection": "close",
    "Host": "httpbin.org",
    "Upgrade-Insecure-Requests": "1",
    "User-Agent": "Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:64.0) Gecko/20100101 Firefox/64.0"
  }
}
</pre></body></html>
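To check for the header programmatically rather than by eye, the printed page source can be parsed. The following is a minimal sketch, assuming the response is rendered inside a single <pre> element as in the output above; the helper name extract_headers and the shortened sample string are illustrative only and not part of the original test.

```python
import html
import json
import re


def extract_headers(page_source):
    """Return the "headers" dict from an httpbin.org/headers page source.

    Assumes the JSON body is wrapped in a single <pre> element, as in the
    Firefox output shown above.
    """
    match = re.search(r"<pre[^>]*>(.*?)</pre>", page_source, re.DOTALL)
    if match is None:
        raise ValueError("no <pre> block found in page source")
    # Undo HTML entity escaping before decoding the JSON payload
    payload = json.loads(html.unescape(match.group(1)))
    return payload["headers"]


# Shortened, hypothetical sample modelled on the output above
sample = (
    '<html><head></head><body><pre>{\n'
    '  "headers": {\n'
    '    "Host": "httpbin.org",\n'
    '    "Upgrade-Insecure-Requests": "1"\n'
    '  }\n'
    '}\n'
    '</pre></body></html>'
)
headers = extract_headers(sample)
print("Referer" in headers)
```

In the Selenium script, extract_headers(driver.page_source) could then back an assertion such as 'Referer' in headers.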

So there is no Referer. However, if I browse to any page and manually execute

window.location.href = "https://httpbin.org/headers"

in the Firefox console, Referer does appear as expected.

As pointed out in the comments below, when using

driver.get("javascript: window.location.href = '{}'".format(url))

instead of

driver.execute_script("window.location.href = '{}';".format(url))

the request does include Referer. Also, when using Chrome instead of Firefox, both methods include Referer.
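The javascript: URL variant can be wrapped in a small helper. This is only a sketch: the function names are hypothetical, json.dumps is used here just to quote the target URL safely as a JavaScript string literal, and whether driver.get accepts a javascript: URL may depend on the browser and driver versions in use.

```python
import json


def javascript_navigation_url(target_url):
    """Build a javascript: URL that assigns target_url to window.location.href."""
    # json.dumps yields a quoted, escaped string usable as a JS string literal
    return "javascript: window.location.href = {};".format(json.dumps(target_url))


def navigate_via_javascript_url(driver, target_url):
    """Navigate with driver.get and a javascript: URL instead of execute_script."""
    driver.get(javascript_navigation_url(target_url))


print(javascript_navigation_url("https://httpbin.org/headers"))
```

With the driver from the example above, navigate_via_javascript_url(driver, url) would take the place of the execute_script call.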

So the main question still stands: why is Referer missing from the request when it is sent with Firefox as described above?

Recommended answer

Referer

Source: https://developer.mozilla.org/en-US/docs/Web/HTTP/Headers/Referer

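However, the Referer header is not sent if:

  • The referring resource is a local "file" or "data" URI.
  • An unsecured HTTP request is used and the referring page was received with a secure protocol (HTTPS).

Source: https://developer.mozilla.org/en-US/docs/Web/HTTP/Headers/Referer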
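The two conditions above can be written as a small check. This is a simplified sketch with a hypothetical function name; it models only these two rules and ignores anything else a browser may take into account, such as a Referrer-Policy or user preferences.

```python
from urllib.parse import urlsplit


def referer_suppressed(referring_url, target_url):
    """Return True if either of the two conditions listed above applies."""
    referrer_scheme = urlsplit(referring_url).scheme.lower()
    target_scheme = urlsplit(target_url).scheme.lower()
    # Condition 1: the referring resource is a local "file" or "data" URI
    if referrer_scheme in ("file", "data"):
        return True
    # Condition 2: insecure HTTP request made from a page received over HTTPS
    if referrer_scheme == "https" and target_scheme == "http":
        return True
    return False


# The navigation from the question: an HTTPS page requesting an HTTPS URL
print(referer_suppressed("https://www.python.org/", "https://httpbin.org/headers"))
```

For the navigation in the question neither condition applies, so these two rules alone do not account for the missing header.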

There are some privacy and security risks associated with the Referer HTTP header:

Source: https://developer.mozilla.org/zh-CN/docs/Web/Security/Referer_header:_privacy_and_security_concerns#The_referrer_problem

From the perspective of the Referer header, the majority of security risks can be mitigated by following these steps:

Sources:

  • https://developer.mozilla.org/en-US/docs/Web/Security/Referer_header:_privacy_and_security_concerns#How_can_we_fix_this
  • https://geekthis.net/post/hide-http-referer-headers/#exit-page-redirect

I have executed your code through both the GeckoDriver/Firefox and the ChromeDriver/Chrome combinations:

driver.get('http://www.python.org')
assert 'Python' in driver.title

url = 'https://httpbin.org/headers'
driver.execute_script('window.location.href = "{}";'.format(url))
WebDriverWait(driver, 10).until(lambda driver: driver.current_url == url)
print(driver.page_source)

Observation:

    • Using GeckoDriver/Firefox, the Referer: "https://www.python.org/" header was missing, as follows:

          {
            "headers": {
              "Accept": "text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8",
              "Accept-Encoding": "gzip, deflate, br",
              "Accept-Language": "en-US,en;q=0.5",
              "Host": "httpbin.org",
              "Upgrade-Insecure-Requests": "1",
              "User-Agent": "Mozilla/5.0 (Windows NT 6.2; Win64; x64; rv:67.0) Gecko/20100101 Firefox/67.0"
            }
          }
      

    • Using ChromeDriver/Chrome, the Referer: "https://www.python.org/" header was present, as follows:

          {
            "headers": {
              "Accept": "text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,image/apng,*/*;q=0.8,application/signed-exchange;v=b3",
              "Accept-Encoding": "gzip, deflate, br",
              "Accept-Language": "en-US,en;q=0.9",
              "Host": "httpbin.org",
              "Referer": "https://www.python.org/",
              "Upgrade-Insecure-Requests": "1",
              "User-Agent": "Mozilla/5.0 (Windows NT 6.2; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/75.0.3770.80 Safari/537.36"
            }
          }
      

    • It seems to be an issue with how GeckoDriver/Firefox handles the Referer header.

      Referrer Policy
