Unable to get the "rel" attribute of a tag with BeautifulSoup web scraping in Python

Question

I am trying to test a beautifulsoup4 web-scraping script on a website. Most of it works, but one piece of attribute information is tricky for me to extract because of where it sits in the markup.

The HTML is as follows:

<span class="callseller-description-icon">
<a id="phone-lead" class="callseller-description-link" rel="0501365082" href="#">Show Phone Number</a>
</span>

I am trying this, but I am not sure whether it is correct:

try:
    phone = soup.find('a', {'id': 'phone-lead'})
    for a in phone:
        phone_result = str(a.get_text('rel').strip().encode("utf-8"))
    print "Phone information:", phone_result
except StandardError as e:
    phone_result = "Error was {0}".format(e)
    print phone_result

What could my mistake be? I am finding it hard to get the rel value, which holds the phone number.

The error I get is:

NavigableString object has no attribute get_text

Recommended answer

find returns a single element, not a list; if you want all a tags, use the find_all method. In your code, the for loop therefore iterates over the children of that one tag, so a is the link text (a NavigableString) rather than a tag. Also, to get the rel attribute you need to use the .get() method or a dictionary-style lookup, not get_text. You can also pass rel=True to match only those "a" tags that have a "rel" attribute.
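To see why the original loop fails, the following minimal sketch parses the snippet from the question and inspects what iterating over the result of find actually yields. It assumes Python 3, the bs4 package, and the built-in "html.parser" backend, none of which are specified in the question; the variable names are illustrative only.

```python
from bs4 import BeautifulSoup

# Markup copied from the question (closing </span> added for completeness).
html = """<span class="callseller-description-icon">
<a id="phone-lead" class="callseller-description-link" rel="0501365082" href="#">Show Phone Number</a>
</span>"""

soup = BeautifulSoup(html, "html.parser")  # parser choice is an assumption
phone = soup.find("a", {"id": "phone-lead"})

# Iterating over a Tag walks its children, not a list of matching tags.
child_types = [type(child).__name__ for child in phone]
print(child_types)

# The attribute lives on the tag itself, so read it from the tag.
rel_value = phone.get("rel")
print(rel_value)
```

The only child of the link is its text, which is why the loop variable in the question was a NavigableString. Reading the attribute from the tag itself avoids the loop entirely.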

Demo:

  • Using find():

    >>> soup.find('a', {'id': 'phone-lead', 'rel': True}).get('rel')
    ['0501365082']

  • Using find_all():

    >>> for a in soup.find_all('a', {'id':'phone-lead', 'rel': True}):
    ...     print(a['rel'])
    ... 
    ['0501365082']
    

  • To get a list of all "rel" values, you can use a list comprehension (the outer loop over the tags must come first):

    >>> [rel for a in soup.find_all('a', {'id':'phone-lead', 'rel': True}) for rel in a['rel']]
    ['0501365082']
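Putting the pieces together, here is a minimal Python 3 sketch of the corrected lookup as a standalone script. The helper name extract_phone, the sample HTML constant, and the choice of "html.parser" are assumptions made for illustration and are not part of the original question or answer.

```python
from bs4 import BeautifulSoup

# Sample markup copied from the question.
HTML = """
<span class="callseller-description-icon">
<a id="phone-lead" class="callseller-description-link" rel="0501365082" href="#">Show Phone Number</a>
</span>
"""


def extract_phone(html):
    """Hypothetical helper: return the number stored in rel, or None."""
    soup = BeautifulSoup(html, "html.parser")
    # rel=True matches only tags that actually carry a rel attribute.
    link = soup.find("a", {"id": "phone-lead", "rel": True})
    if link is None:
        return None
    # Beautiful Soup treats rel on <a> as multi-valued and returns a list,
    # so join the pieces back into a single string.
    return " ".join(link["rel"])


phone_result = extract_phone(HTML)
missing_result = extract_phone("<a href='#'>no phone here</a>")
print("Phone information:", phone_result)
```

Checking for None before indexing replaces the broad try/except from the question, so a missing link is reported as an absent value instead of an exception message.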
    
