I'm trying to extract the div with class='no-selected-number extreme-number' from the site's pagination, but I'm not getting the expected result. Can anyone help?
Here is my code:
import requests
from bs4 import BeautifulSoup

URL = "https://www.falabella.com.pe/falabella-pe/category/cat40703/Perfumes-de-Mujer/"
headers = {'user-agent': 'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_11_6) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/70.0.3538 Safari/537.36'}
r = requests.get(URL, headers=headers, timeout=5)
html = r.content
soup = BeautifulSoup(html, 'lxml')
box_3 = soup.find_all('div', 'fb-filters-sort')
for div in box_3:
    last_page = div.find_all("div", {"class": "no-selected-number extreme-number"})
    print(last_page)
Best Answer
You probably need an approach that gives the page time to load, for example using Selenium. I don't think requests will give you the data you need, because the pagination is rendered by JavaScript after the initial response.
from selenium import webdriver
from selenium.webdriver.chrome.options import Options
from selenium.webdriver.common.by import By

chrome_options = Options()
chrome_options.add_argument("--headless")
url = "https://www.falabella.com.pe/falabella-pe/category/cat40703/Perfumes-de-Mujer/"
d = webdriver.Chrome(options=chrome_options)
d.get(url)
# :last-child picks the final pagination entry, i.e. the highest page number
print(d.find_element(By.CSS_SELECTOR, '.content-items-number-list .no-selected-number.extreme-number:last-child').text)
d.quit()
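Once the rendered HTML is in hand (for example from Selenium's d.page_source), the pagination divs can also be pulled out without a browser selector. A minimal stdlib sketch, run against a hypothetical snippet of what the pagination markup might look like (the class names come from the question; the surrounding structure and numbers are made up for illustration):

```python
from html.parser import HTMLParser

# Hypothetical snippet mimicking the pagination markup the selector targets;
# on the real site this HTML only exists after JavaScript has run.
SAMPLE = (
    '<div class="content-items-number-list">'
    '<div class="selected-number">1</div>'
    '<div class="no-selected-number">2</div>'
    '<div class="no-selected-number extreme-number">17</div>'
    '</div>'
)

class PaginationParser(HTMLParser):
    """Collect the text of divs carrying both target classes."""
    def __init__(self):
        super().__init__()
        self.in_target = False
        self.numbers = []

    def handle_starttag(self, tag, attrs):
        classes = (dict(attrs).get("class") or "").split()
        if tag == "div" and {"no-selected-number", "extreme-number"} <= set(classes):
            self.in_target = True

    def handle_endtag(self, tag):
        if tag == "div":
            self.in_target = False

    def handle_data(self, data):
        if self.in_target and data.strip():
            self.numbers.append(data.strip())

parser = PaginationParser()
parser.feed(SAMPLE)
# The last matching div holds the highest page number
print(parser.numbers[-1] if parser.numbers else None)
```

The same logic is what the question's find_all call was aiming for; the key point is that it must be fed HTML in which the pagination has actually been rendered.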
For python - Python BeautifulSoup - can't read site pagination, we found a similar question on Stack Overflow: https://stackoverflow.com/questions/53340307/