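Problem description
I'm using the following code (taken from "retrieve links from web page using python and BeautifulSoup") to pull the links from a web page:
import httplib2
from BeautifulSoup import BeautifulSoup, SoupStrainer
http = httplib2.Http()
status, response = http.request('http://www.nytimes.com')
for link in BeautifulSoup(response, parseOnlyThese=SoupStrainer('a')):
    if link.has_attr('href'):
        print link['href']
However, I don't understand why I'm getting the following error message: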
Traceback (most recent call last):
  File "C:\Users\EANUAMA\workspace\PatternExtractor\src\SourceCodeExtractor.py", line 13, in <module>
    if link.has_attr('href'):
TypeError: 'NoneType' object is not callable
BeautifulSoup 3.2.0
Python 2.7
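EDIT:
I tried the solution offered for a similar question (Type error if link.has_attr('href'): TypeError: 'NoneType' object is not callable), but it gives me the following error:
Traceback (most recent call last):
  File "C:\Users\EANUAMA\workspace\PatternExtractor\src\SourceCodeExtractor.py", line 12, in <module>
    for link in BeautifulSoup(response).find_all('a', href=True):
TypeError: 'NoneType' object is not callable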
First:
You are using BeautifulSoup version 3, which is no longer maintained. Switch to BeautifulSoup version 4. Install it via:
pip install beautifulsoup4
and change your import to:
from bs4 import BeautifulSoup
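Also:
Here link is a Tag instance, and in BeautifulSoup 3 a Tag does not have a has_attr method. Recall what dot notation means in BeautifulSoup: link.has_attr is treated as a search for a child element named has_attr inside the link element, and that search finds nothing. In other words, link.has_attr is None, and calling None('href') raises the TypeError. The same lookup explains the error from your edit: BeautifulSoup 3 has no find_all method either (it is called findAll there), so find_all also resolves to None.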
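To see the mechanism in isolation, here is a minimal sketch that needs neither BeautifulSoup nor a network connection. FakeTag is a hypothetical stand-in written for this illustration, not the real BeautifulSoup 3 Tag class; it only assumes that an unknown attribute name falls back to a child-tag search, as described above.

```python
class FakeTag(object):
    """Hypothetical stand-in for a BeautifulSoup 3 Tag (not the real class)."""

    def __init__(self, name, children=()):
        self.name = name
        self.children = list(children)

    def find(self, name):
        # Return the first direct child with the given tag name, or None.
        for child in self.children:
            if child.name == name:
                return child
        return None

    def __getattr__(self, name):
        # Python calls this only when normal attribute lookup fails,
        # so any unknown name is treated as a child-tag search.
        return self.find(name)


link = FakeTag('a', [FakeTag('b')])

print(link.b.name)    # a child named 'b' exists, so the lookup finds it
print(link.has_attr)  # no child named 'has_attr', so this is None

try:
    link.has_attr('href')  # equivalent to None('href')
except TypeError as exc:
    error_message = str(exc)
    print(error_message)
```

The final call fails with the same "'NoneType' object is not callable" message as in the question, because the attribute lookup succeeded (returning None) and only the call itself blew up.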
Instead, do:
soup = BeautifulSoup(response, parse_only=SoupStrainer('a', href=True))
for link in soup.find_all("a", href=True):
    print(link['href'])
FYI, here is the complete working code that I used to debug your problem (using requests):
import requests
from bs4 import BeautifulSoup, SoupStrainer
response = requests.get('http://www.nytimes.com').content
for link in BeautifulSoup(response, parse_only=SoupStrainer('a', href=True)).find_all("a", href=True):
    print(link['href'])