How to extract a value with BeautifulSoup when the tag has no class


Problem description

HTML code:

<td class="_480u">
    <div class="clearfix">
        <div>
            Female
        </div>
    </div>
</td>

I want to get the value "Female" as the output.

I tried bs.findAll('div', {'class': 'clearfix'}) and bs.findAll('tag', {'class': '_480u'}), but these classes appear all over my HTML and the output is a large list. I want to combine both conditions in the search (td --> class="..." and div --> class="...") so that the output is just Female. How can I do this?

Thanks

Recommended answer

Use the stripped_strings attribute, which yields the text strings inside a tag with surrounding whitespace removed:

>>> from bs4 import BeautifulSoup
>>>
>>> html = '''<td class="_480u">
...     <div class="clearfix">
...         <div>
...             Female
...         </div>
...     </div>
... </td>'''
>>> soup = BeautifulSoup(html, 'html.parser')
>>> print(' '.join(soup.find('div', {'class': 'clearfix'}).stripped_strings))
Female
>>> print(' '.join(soup.find('td', {'class': '_480u'}).stripped_strings))
Female
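The question also asked how to combine the td and div conditions in a single search. One way to sketch this, assuming a Beautiful Soup version with CSS selector support (select_one), is a selector with child combinators. The variable names node and value are illustrative and not part of the original answer.

```python
from bs4 import BeautifulSoup

html = '''<td class="_480u">
    <div class="clearfix">
        <div>
            Female
        </div>
    </div>
</td>'''

# html.parser keeps the stand-alone <td>; other parsers may restructure it.
soup = BeautifulSoup(html, 'html.parser')

# Match a div that is a direct child of div.clearfix,
# which is itself a direct child of td._480u.
node = soup.select_one('td._480u > div.clearfix > div')

# get_text(strip=True) removes the surrounding whitespace.
value = node.get_text(strip=True) if node is not None else None
print(value)
```

Because the selector requires both classes along the path, unrelated clearfix or _480u elements elsewhere in the page are not matched.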

Or specify the class as an empty string (or None) and use the string property:

>>> soup.find('div', {'class': ''}).string
'\n            Female\n        '
>>> soup.find('div', {'class': ''}).string.strip()
'Female'
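Matching on an empty class value may depend on how the installed Beautiful Soup version treats missing attributes. A more explicit sketch, offered as an alternative rather than as the original answer's method, passes a function to find and checks Tag.has_attr. The helper name div_without_class is hypothetical.

```python
from bs4 import BeautifulSoup

html = '''<td class="_480u">
    <div class="clearfix">
        <div>
            Female
        </div>
    </div>
</td>'''

soup = BeautifulSoup(html, 'html.parser')


def div_without_class(tag):
    """Return True for <div> tags that have no class attribute at all."""
    return tag.name == 'div' and not tag.has_attr('class')


# Narrow the search to the td first, then look for the class-less div inside it.
container = soup.find('td', {'class': '_480u'})
node = container.find(div_without_class)

# .string is the div's single text child; strip() trims the whitespace.
text = node.string.strip()
print(text)
```

Scoping the search to the td first keeps class-less div elements elsewhere in the page out of the result.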


08-15 13:54