Problem description
I'm using isinstance to select some HTML tags and passing them to a BeautifulSoup function. The problem is that I keep getting NameErrors from what should be perfectly executable code.
def horse_search(tag):
    return (tag.has_attr('href') and isinstance(tag.previous_element, span))
...
for tag in soup.find_all(horse_search):
    print (tag)
NameError: global name 'span' is not defined
I'm also getting errors from the example code in the BeautifulSoup documentation, which uses isinstance in conjunction with tag.previous_element:
def surrounded_by_strings(tag):
    return (isinstance(tag.next_element, NavigableString)
            and isinstance(tag.previous_element, NavigableString))
for tag in soup.find_all(surrounded_by_strings):
    print tag.name
NameError: global name 'NavigableString' is not defined
What could be wrong? Thanks!
Recommended answer
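Both NameErrors in the question most likely come from names that were never defined in the script: NavigableString has to be imported from bs4 before isinstance can use it, and span is an HTML tag name, not a Python class, so it has to be compared against tag.name as a string. Below is a minimal sketch of both predicates with those two changes. The sample markup and the variable names horse_links and string_wrapped are made up for illustration, and the built-in "html.parser" backend is an assumption chosen to avoid extra dependencies.

```python
from bs4 import BeautifulSoup, NavigableString, Tag

# Hypothetical sample markup, used only for illustration.
SAMPLE_HTML = (
    '<div>'
    '<span><a href="/horse/1">Red Rum</a></span>'
    '<span><a name="no-href">Anchor without href</a></span>'
    '<p>before <b>bold</b> after</p>'
    '</div>'
)

soup = BeautifulSoup(SAMPLE_HTML, "html.parser")


def horse_search(tag):
    """Match tags with an href whose previous element is a <span> tag."""
    prev = tag.previous_element
    # 'span' is not a Python name, so test the class (Tag) and the tag name.
    return tag.has_attr("href") and isinstance(prev, Tag) and prev.name == "span"


def surrounded_by_strings(tag):
    """Match tags whose previous and next elements are both strings."""
    # Works once NavigableString has been imported from bs4 (see top of block).
    return (isinstance(tag.next_element, NavigableString)
            and isinstance(tag.previous_element, NavigableString))


horse_links = [tag["href"] for tag in soup.find_all(horse_search)]
string_wrapped = [tag.name for tag in soup.find_all(surrounded_by_strings)]

print(horse_links)
print(string_wrapped)
```

Note that previous_element is the element parsed immediately before the tag, which is not always its parent; if the intent is "anchor directly inside a span", checking tag.parent.name may be closer to what is wanted.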
To find all anchors that have a span parent and an href attribute:
for span in soup.find_all('span'):
    for a in span.find_all('a'):
        if a.has_attr('href'):
            print a['href']
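For reference, here is a self-contained sketch of the same nested loop that can be run as-is. The sample markup and the hrefs list are hypothetical, and "html.parser" is assumed as the parser backend.

```python
from bs4 import BeautifulSoup

# Hypothetical sample markup, used only for illustration.
SAMPLE_HTML = (
    '<div>'
    '<span><a href="/horse/1">Red Rum</a></span>'
    '<span><a name="no-href">Anchor without href</a></span>'
    '<p><a href="/other">Not inside a span</a></p>'
    '</div>'
)

soup = BeautifulSoup(SAMPLE_HTML, "html.parser")

hrefs = []
for span in soup.find_all("span"):
    # find_all searches all descendants of the span, not just direct children.
    for a in span.find_all("a"):
        if a.has_attr("href"):
            hrefs.append(a["href"])

print(hrefs)
```

Because find_all searches descendants by default, an anchor nested deeper inside a span also matches; passing recursive=False to the inner find_all restricts the search to direct children.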
However, while this works, a tool that supports XPath is often a better fit. For example, with lxml and XPath your code can be as compact as:
from lxml import etree
etree.parse(url, etree.HTMLParser()).xpath('//span/a/@href')
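To try the XPath expression without fetching a URL, the same query can be run against an in-memory string. This is only a sketch: the sample markup and the hrefs variable are hypothetical, and etree.fromstring stands in for the etree.parse(url, ...) call above.

```python
from lxml import etree

# Hypothetical sample markup, used only for illustration.
SAMPLE_HTML = (
    '<div>'
    '<span><a href="/horse/1">Red Rum</a></span>'
    '<span><a name="no-href">Anchor without href</a></span>'
    '<p><a href="/other">Not inside a span</a></p>'
    '</div>'
)

# Parse the string with lxml's lenient HTML parser; this returns the root element.
root = etree.fromstring(SAMPLE_HTML, etree.HTMLParser())

# Select the href attribute of every <a> that is a direct child of a <span>.
hrefs = root.xpath('//span/a/@href')

print(hrefs)
```

Unlike the nested find_all loop, //span/a matches only anchors that are direct children of a span; //span//a would include deeper descendants as well.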