AttributeError: 'str' object has no attribute 'descendants'
Question
I'm trying to scrape a particular piece of a website. I'm hoping to get to:
<div class="inhoudsindicatie"><p><span class="hl0 highlightColor0">HR</span>: art. 81RO.</p></div>
and in particular the "art. 81RO" part of it.
from selenium import webdriver
from bs4 import BeautifulSoup as soup
driver.get('http://uitspraken.rechtspraak.nl/inziendocument?id=ECLI:NL:HR:2014:3004&showbutton=true&keyword=HR%3a')
page=soup(driver.page_source, "html.parser")
details=soup.findAll("span",{"class":"hl0 highlightColor0"})
It returns:
AttributeError: 'str' object has no attribute 'descendants'
What does this imply about my code? I read the general information on descendants, and I am quite sure I don't understand it.
(My main interest is in understanding the problem; solving it is secondary, though of course highly appreciated.)
Recommended answer
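On the "why" part of the question: in the original code `soup` is the name given to the BeautifulSoup class by the import, while `page` is the parsed document. Calling `soup.findAll("span", ...)` on the class means the string "span" is passed where the method expects the document itself, and the traceback shows that the method then tries to read `.descendants` from that string. The sketch below reproduces the mechanism with a hypothetical stand-in class named `Tree`; it is an illustration only, not bs4's actual code.

```python
# Hypothetical stand-in for BeautifulSoup; NOT the library's real code.
class Tree:
    def __init__(self, tags):
        # Pretend every tag in the document is a descendant of the root.
        self.descendants = list(tags)

    def find_all(self, name):
        # Start the search from self.descendants, as the traceback suggests bs4 does.
        return [tag for tag in self.descendants if tag == name]


soup = Tree                   # mirrors "import BeautifulSoup as soup": a class
page = soup(["p", "span"])    # mirrors page = soup(html, "html.parser"): an instance

# Correct: call the method on the instance.
found = page.find_all("span")
print(found)

# The mistake: called on the class, "span" takes the place of self.
message = None
try:
    soup.find_all("span", "ignored")
except AttributeError as exc:
    message = str(exc)
    print(message)
```

So the error says nothing about the page's HTML; it means the search was started from the class rather than from the parsed `page` object.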
This works for me:
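import time
from selenium import webdriver
from selenium.webdriver.chrome.service import Service
from bs4 import BeautifulSoup as soup

# Selenium 4 takes the chromedriver path through a Service object
driver = webdriver.Chrome(service=Service("/path/to/chromedriver"))
driver.get('http://uitspraken.rechtspraak.nl/inziendocument?id=ECLI:NL:HR:2014:3004&showbutton=true&keyword=HR%3a')
time.sleep(5)  # crude wait for the page to finish loading
page = soup(driver.page_source, "html.parser")
# search the parsed document (page), not the soup class
details = page.select_one("span.hl0.highlightColor0").find_parent().get_text()
print(details)
driver.quit()
# output: HR: art. 81RO.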
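The selector logic can also be tried without a browser. The following is a minimal sketch that parses the HTML fragment quoted in the question as a hard-coded string; the guard against a missing match is an addition for illustration, and it assumes the `bs4` package is installed.

```python
from bs4 import BeautifulSoup

# The fragment from the question, hard-coded so no browser is needed.
html = (
    '<div class="inhoudsindicatie"><p>'
    '<span class="hl0 highlightColor0">HR</span>: art. 81RO.</p></div>'
)

page = BeautifulSoup(html, "html.parser")

# select_one returns None when nothing matches, so guard before chaining.
span = page.select_one("span.hl0.highlightColor0")
details = span.find_parent().get_text() if span is not None else None
print(details)

# The search from the question also works once it is called on page.
spans = page.find_all("span", {"class": "hl0 highlightColor0"})
print(len(spans))
```

The parent of the highlighted span is the `<p>` element, so `get_text()` returns the label together with the text that follows it.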
But since you're using Selenium anyway, why not just stick with it?
from selenium import webdriver
from selenium.webdriver.chrome.service import Service
from selenium.webdriver.common.by import By
from selenium.webdriver.support import expected_conditions as EC
from selenium.webdriver.support.ui import WebDriverWait

# Selenium 4 takes the chromedriver path through a Service object
driver = webdriver.Chrome(service=Service("/path/to/chromedriver"))
driver.get('http://uitspraken.rechtspraak.nl/inziendocument?id=ECLI:NL:HR:2014:3004&showbutton=true&keyword=HR%3a')
# wait up to 10 seconds for the element instead of sleeping a fixed time
wait = WebDriverWait(driver, 10)
# match the highlighted span inside a <p>, then step up to that <p>
xpath = "//p/span[contains(@class, 'highlightColor0') and contains(@class, 'hl0')]/.."
details = wait.until(EC.visibility_of_element_located((By.XPATH, xpath)))
print(details.text)
driver.quit()
# output: HR: art. 81RO.
If you don't want the 'HR:' part you can remove it:
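# In the BeautifulSoup version `details` is already a str;
# in the Selenium version it is a WebElement, so use details.text instead
details.split('HR: ')[1]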
# output: art. 81RO.
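Note that `split('HR: ')[1]` raises an IndexError when the text does not contain the label. A slightly more defensive variant might look like the sketch below; `strip_label` is a hypothetical helper name introduced here for illustration, not part of the original answer.

```python
def strip_label(text, label="HR: "):
    """Return text without a leading label; leave it unchanged if the label is absent."""
    if text.startswith(label):
        return text[len(label):]
    return text


print(strip_label("HR: art. 81RO."))
print(strip_label("art. 81RO."))
```

Unlike the split, this only removes the label at the start of the string and never fails on text that lacks it.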