Problem description
I am trying to pull out the names of all courses offered by Lynda.com together with their subject, so that each entry appears in my list as '2D Drawing -- Project Soane: Recover a Lost Monument with BIM with Paul F. Aubin'. To do that, I am writing a script that goes to each subject on http://www.lynda.com/sitemap/categories and pulls out the list of courses. I have already managed to get Selenium to go from one subject to another and pull the courses. My only problem is that there is a 'See X more courses' button that reveals the rest of the courses. Sometimes you have to click it a couple of times, which is why I used a while loop. But Selenium doesn't seem to execute this click. Does anyone know why?
Here is my code:
from selenium import webdriver

url = 'http://www.lynda.com/sitemap/categories'
mydriver = webdriver.Chrome()
mydriver.get(url)
course_list = []
for a in [1,2,3]:
    for b in range(1,73):
        mydriver.find_element_by_xpath('//*[@id="main-content"]/div[2]/div[3]/div[%d]/ul/li[%d]/a' % (a,b)).click()
        while True:
            # click the button 'See more results' as long as it's available
            try:
                mydriver.find_element_by_xpath('//*[@id="main-content"]/div[1]/div[3]/button').click()
            except:
                break
        subject = mydriver.find_element_by_tag_name('h1') # pull out the subject
        courses = mydriver.find_elements_by_tag_name('h3') # pull out the courses
        for course in courses:
            course_list.append(str(subject.text)+" -- " + str(course.text))
        # go back to the initial site
        mydriver.get(url)
Recommended answer
Scroll to the element before clicking it:
see_more_results = browser.find_element_by_css_selector('button[class*=see-more-results]')
browser.execute_script('return arguments[0].scrollIntoView()', see_more_results)
see_more_results.click()
One way to repeat these actions could be:
def get_number_of_courses():
    return len(browser.find_elements_by_css_selector('.course-list > li'))

number_of_courses = get_number_of_courses()
while True:
    try:
        # CSS_SELECTOR is a placeholder for the button's selector
        button = browser.find_element_by_css_selector(CSS_SELECTOR)
        browser.execute_script('return arguments[0].scrollIntoView()', button)
        button.click()
        # poll until the click has added more courses to the list
        while True:
            new_number_of_courses = get_number_of_courses()
            if (new_number_of_courses > number_of_courses):
                number_of_courses = new_number_of_courses
                break
    except:
        # the button is no longer found, so stop
        break
Caveat: it's always better to use the built-in explicit waits than while True:
http://www.seleniumhq.org/docs/04_webdriver_advanced.jsp#explicit-waits