Question
This question is for Python 3.6.3, bs4 and Selenium 3.8 on Win10.
I am trying to scrape pages with dynamic content. What I want to scrape is numbers and text (from http://www.oddsportal.com, for example). From my understanding, requests + beautifulsoup will not do the job, because the dynamic content will be hidden. So I have to use other tools, such as selenium webdriver.
Then, given that I will use selenium webdriver anyway, do you recommend ignoring beautifulsoup and sticking with selenium webdriver functions, e.g.
elem = driver.find_element_by_name("q")
or is it considered better practice to use selenium + beautifulsoup?
Do you have any opinion as to which of the two routes will give me more convenient functions to work with?
Thanks.
Recommended answer
Beautifulsoup is a powerful tool for web scraping. It parses HTML that has already been downloaded, typically with the urllib.request Python library, which works well for extracting data from static pages.
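For a static page, the fetch-then-parse flow can be sketched as follows. This is a minimal illustration rather than a definitive implementation: the helper names `fetch_html` and `extract_cells` and the sample markup are assumptions made up for this example, and `fetch_html` only sees the HTML the server sends, so anything rendered later by JavaScript will be missing.

```python
from urllib.request import urlopen

from bs4 import BeautifulSoup


def fetch_html(url):
    """Download a page's raw HTML with urllib.request (no JavaScript is run)."""
    with urlopen(url) as response:
        return response.read().decode("utf-8", errors="replace")


def extract_cells(html):
    """Return the stripped text of every <td> element in the given HTML."""
    soup = BeautifulSoup(html, "html.parser")
    return [td.get_text(strip=True) for td in soup.find_all("td")]


# Hypothetical static markup standing in for a downloaded page.
sample = "<table><tr><td>Team A</td><td> 1.85 </td></tr></table>"
print(extract_cells(sample))
```

On a real static page, `extract_cells(fetch_html(url))` would chain the two steps.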
Selenium is currently one of the most widely used tools for web automation. It drives a real browser, so it supports interacting with dynamic pages, contents and elements.
To create a robust and efficient framework for scraping pages with dynamic content, integrate both Selenium and Beautifulsoup in your framework: browse and interact with the dynamic elements through Selenium, then scrape the contents efficiently through Beautifulsoup.
Here is an example using Selenium and Beautifulsoup for scraping.
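A minimal sketch of that combination might look like the following. It is not the answer's original example: the function names `parse_rows` and `scrape_dynamic_page`, the default `table` CSS selector, and the choice of Chrome are all assumptions for illustration. Selenium renders the page and waits for the dynamic content, then hands `driver.page_source` to Beautifulsoup for parsing.

```python
from bs4 import BeautifulSoup


def parse_rows(html):
    """Return each non-empty table row as a list of cell texts."""
    soup = BeautifulSoup(html, "html.parser")
    rows = []
    for tr in soup.find_all("tr"):
        cells = [cell.get_text(strip=True) for cell in tr.find_all(["td", "th"])]
        if cells:
            rows.append(cells)
    return rows


def scrape_dynamic_page(url, wait_css="table", timeout=10):
    """Render `url` in a browser, wait for `wait_css`, then parse the HTML."""
    # Imported here so parse_rows() stays usable without Selenium installed.
    from selenium import webdriver
    from selenium.webdriver.common.by import By
    from selenium.webdriver.support import expected_conditions as EC
    from selenium.webdriver.support.ui import WebDriverWait

    # Assumption: Chrome and a matching chromedriver are available.
    driver = webdriver.Chrome()
    try:
        driver.get(url)
        # Block until the dynamically loaded element is present in the DOM.
        WebDriverWait(driver, timeout).until(
            EC.presence_of_element_located((By.CSS_SELECTOR, wait_css))
        )
        return parse_rows(driver.page_source)
    finally:
        # Always close the browser, even if the wait times out.
        driver.quit()
```

Keeping the parsing in `parse_rows` means it can be tried on saved HTML without launching a browser, while `scrape_dynamic_page` is only needed when the content is rendered by JavaScript.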