Question
I am parsing a website with Python. The site uses a lot of redirects, and it performs them by calling JavaScript functions.
So when I just use urllib to fetch the site, it doesn't help me, because the destination URL does not appear in the returned HTML.
Is there a way to access the DOM and call the correct JavaScript function from my Python code?
All I need is the URL that the redirect takes me to.
Recommended answer
I looked into Selenium. If you are not running headless (that is, you have a display and can start a "normal" browser), the solution is actually quite simple:
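import time

from selenium import webdriver

driver = webdriver.Firefox()
link = "http://yourlink.com"
driver.get(link)
# Poll once per second until the JavaScript redirect changes the URL
while link == driver.current_url:
    time.sleep(1)
redirected_url = driver.current_url
# Close the browser once the final URL has been captured
driver.quit()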
For my use case this is more than enough. Selenium can also interact with forms and send keystrokes to the website.
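The polling loop above waits forever if the redirect never happens. A minimal sketch of the same idea with a timeout is shown below. The helper name `wait_for_redirect` and its parameters are hypothetical and not part of Selenium; it only assumes the driver object exposes a `current_url` attribute, as Selenium's WebDriver does.

```python
import time


def wait_for_redirect(driver, start_url, timeout=30.0, poll_interval=0.5):
    """Return driver.current_url once it differs from start_url.

    Hypothetical helper, not a Selenium API. `driver` only needs a
    `current_url` attribute. Raises TimeoutError if the URL has not
    changed within `timeout` seconds.
    """
    deadline = time.monotonic() + timeout
    while True:
        current = driver.current_url
        if current != start_url:
            # The browser has navigated away from the starting page
            return current
        if time.monotonic() >= deadline:
            raise TimeoutError(
                f"URL still {start_url!r} after {timeout} seconds"
            )
        time.sleep(poll_interval)
```

With a real browser this would be called as `redirected_url = wait_for_redirect(driver, link)` right after `driver.get(link)`. Selenium also ships its own explicit-wait mechanism (`WebDriverWait` together with `expected_conditions.url_changes`), which may be preferable if you do not need a custom loop.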