Problem description
I'm extracting content from this URL:
import requests
from bs4 import BeautifulSoup
url = 'https://www.collinsdictionary.com/dictionary/french-english/aimer'
headers = {'User-Agent': 'Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:78.0) Gecko/20100101 Firefox/78.0'}
soup = BeautifulSoup(requests.get(url, headers = headers).content, 'html.parser')
for script in soup.select('script, .hcdcrt, #ad_contentslot_1, #ad_contentslot_2'):
    script.extract()
entry_name = soup.h2.text
content1 = ''.join(map(str, soup.select_one('.cB cB-def dictionary biling').contents))
Then I get this error:
---------------------------------------------------------------------------
AttributeError Traceback (most recent call last)
<ipython-input-84-e9cb11cd6b5d> in <module>
10
11 entry_name = soup.h2.text
---> 12 content1 = ''.join(map(str, soup.select_one('.cB cB-def dictionary biling').contents))
AttributeError: 'NoneType' object has no attribute 'contents'
On the other hand, if I replace cB cB-def dictionary biling with hom, i.e. content1 = ''.join(map(str, soup.select_one('.hom').contents)), the code runs fine. Judging from the HTML structure of the page, I thought cB cB-def dictionary biling and hom were very similar.
Could you please explain how this problem arises and how to solve it?
Recommended answer
Try this:
import requests
from bs4 import BeautifulSoup
url = 'https://www.collinsdictionary.com/dictionary/french-english/aimer'
headers = {'User-Agent': 'Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:78.0) Gecko/20100101 Firefox/78.0'}
soup = BeautifulSoup(requests.get(url, headers = headers).content, 'html.parser')
for script in soup.select('script, .hcdcrt, #ad_contentslot_1, #ad_contentslot_2'):
    script.extract()
entry_name = soup.h2.text
content1 = ''.join(map(str, soup.select_one('.cB.cB-def.dictionary.biling').contents))
When the class attribute you want to select contains blank spaces, replace each space with a . in the selector.
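cB, cB-def, dictionary and biling are four different classes on the same tag. If you leave the spaces in, the selector no longer describes one tag: a space means "descendant of", so the script looks for a tag nested inside a tag with class cB, and so on down the chain. Strictly speaking, because only cB carries a leading dot, the remaining tokens are read as tag names rather than class names. Nothing on the page matches that, so select_one returns None, and reading .contents from None raises the AttributeError. The selector .hom works because it is a single class with no spaces.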
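The difference between the two selectors can be checked on a small inline document. The sketch below is only an illustration: the HTML string is a made-up stand-in for the real page (whose markup is not reproduced here), and the variable names descendant, compound and content are chosen for this example.

```python
from bs4 import BeautifulSoup

# Hypothetical markup for illustration only; not the actual Collins page.
html = """
<div class="cB cB-def dictionary biling">
  <div class="hom">aimer</div>
</div>
"""
soup = BeautifulSoup(html, "html.parser")

# Spaces are descendant combinators: this asks for a <biling> tag inside a
# <dictionary> tag inside a <cB-def> tag inside an element with class "cB".
descendant = soup.select_one(".cB cB-def dictionary biling")

# Dots chained without spaces require all four classes on the same element.
compound = soup.select_one(".cB.cB-def.dictionary.biling")

print(descendant)         # nothing matches, so select_one returns None
print(compound["class"])  # the list of classes on the matched element

# Checking for None before using the result avoids the AttributeError.
content = "".join(map(str, compound.contents)) if compound is not None else ""
```

Checking the result of select_one for None before reading .contents is a defensive habit rather than part of the original answer; it turns a missing match into an empty string instead of a traceback.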