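I am using the following code to change the user-agent string, but I would like to know whether it changes the user-agent string for every browser.get request.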

import random

from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.firefox.options import Options
from selenium.webdriver.support import expected_conditions as EC
from selenium.webdriver.support.ui import WebDriverWait

ua_strings = [
    'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/67.0.3396.99 Safari/537.36',
    'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/67.0.3396.87 Safari/537.36',
    'Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:60.0) Gecko/20100101 Firefox/60.0',
    'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_13_5) AppleWebKit/605.1.15 (KHTML, like Gecko) Version/11.1.1 Safari/605.1.15',
    'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/66.0.3359.181 Safari/537.36',
    ...
]

def parse(self, response):
    # Pick a user agent and write it into the Firefox profile
    profile = webdriver.FirefoxProfile()
    profile.set_preference('general.useragent.override', random.choice(ua_strings))
    options = Options()
    options.add_argument('-headless')
    browser = webdriver.Firefox(profile, firefox_options=options)
    browser.get(self.start_urls[0])
    # Wait up to 60 seconds for the card links to become visible
    hrefs = WebDriverWait(browser, 60).until(
        EC.visibility_of_all_elements_located((By.XPATH, '//div[@class="discoverableCard"]/a'))
    )
    pages = []
    for href in hrefs:
        pages.append(href.get_attribute('href'))
    for page in pages:
        browser.get(page)
        """ scrape page """
    browser.close()

Or do I have to call browser.close() first and then create a new instance of browser in order to use a new user-agent string for each request?

for page in pages:
    browser = webdriver.Firefox(profile, firefox_options=options)
    browser.get(page)
    """ scrape page """
    browser.close()

Best answer
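Because random.choice() is called only once, at the start, the user-agent string is the same for every browser.get() request. In this code the override is written into the profile before Firefox starts, so a different user agent means a new profile and a new browser instance. To make sure the user agent is chosen at random each time, you can create a set_preferences() function and call it on every loop iteration.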

def set_preferences():
    # Pick a new user agent on every call
    user_agent_string = random.choice(ua_strings)
    # Print the user agent chosen for this loop iteration
    print(user_agent_string)
    profile = webdriver.FirefoxProfile()
    profile.set_preference('general.useragent.override', user_agent_string)
    options = Options()
    options.add_argument('-headless')
    browser = webdriver.Firefox(profile, firefox_options=options)
    return browser

Then your loop could look something like this:

for page in pages:
    browser = set_preferences()
    browser.get(page)
    """ scrape page """
    browser.close()
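
The reason the question's code reuses one user agent can be checked without launching a browser. The sketch below is a Selenium-free illustration, not part of the original answer: `choose_once` and `choose_per_page` are hypothetical helper names, and the short strings in `UA_STRINGS` are placeholders rather than real user agents. It only shows where `random.choice()` is evaluated.

```python
import random

# Placeholder values standing in for real user-agent strings (assumption).
UA_STRINGS = ["ua-a", "ua-b", "ua-c", "ua-d"]


def choose_once(pages, rng):
    """Mimic the question's code: one choice is made before the loop."""
    user_agent = rng.choice(UA_STRINGS)  # evaluated a single time
    return [(page, user_agent) for page in pages]


def choose_per_page(pages, rng):
    """Mimic the answer's code: a new choice is made on every iteration."""
    return [(page, rng.choice(UA_STRINGS)) for page in pages]


if __name__ == "__main__":
    rng = random.Random()
    pages = ["page-%d" % i for i in range(5)]
    print(choose_once(pages, rng))      # the same user agent for all five pages
    print(choose_per_page(pages, rng))  # user agents may differ between pages
```

Note that a random choice can still pick the same value for consecutive pages; if back-to-back requests must differ, that would need extra logic beyond what the answer shows.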

Hope this helps!
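
One practical detail when starting a new browser for every page: each instance should be shut down even if scraping raises an exception, otherwise Firefox processes can pile up. The following is a minimal sketch, not part of the original answer. `fresh_browser` is a hypothetical helper that accepts any zero-argument factory (for example, the answer's browser-creating function) and calls `quit()`, which in Selenium ends the whole driver session, whereas `close()` only closes the current window. `FakeBrowser` is a stand-in so that the sketch runs without Selenium installed.

```python
from contextlib import contextmanager


@contextmanager
def fresh_browser(make_browser):
    """Yield a new browser from make_browser() and always quit it afterwards."""
    browser = make_browser()
    try:
        yield browser
    finally:
        browser.quit()  # runs even if the body of the with-block raises


class FakeBrowser:
    """Stand-in for a Selenium driver (assumption: only get() and quit() are used)."""

    def __init__(self):
        self.visited = []
        self.closed = False

    def get(self, url):
        self.visited.append(url)

    def quit(self):
        self.closed = True


if __name__ == "__main__":
    for page in ["https://example.com/a", "https://example.com/b"]:
        # With Selenium installed, pass the real factory instead of FakeBrowser.
        with fresh_browser(FakeBrowser) as browser:
            browser.get(page)
            # scrape page here
        print(page, "closed:", browser.closed)
```

Passing the factory as an argument keeps the cleanup logic separate from how the browser is configured, so the same helper works with a real driver or a test double.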

Regarding "python - change the user-agent string for every get", a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/51276719/