I'm trying to get the Artist results and the Song results, but I'm not sure how to get the two values I want. This is what I have:
albums = soup.find('div', attrs={'class':'panel-heading'}, text=re.compile('Artist results:'))
for i in albums.find_all('a'):
    print(i)
What I want from the artist results is: Asap Rocky, Asap Mob, Asap Ferg. Likewise, what I want from the song results is: ASAP by T.I., ASAP by Eric Bellinger, and so on.
<div class="panel">
<div class="panel-heading"><b>Artist results:</b><br><small>[1-3 of 3 total <span class="text-lowercase">Artists</span> found]</small></div>
<table class="table table-condensed">
<tr><td class="text-left visitedlyr">
1. <a href="https://www.azlyrics.com/a/asaprocky.html" target="_blank"><b>Asap Rocky</b></a></td>
</tr>
<tr><td class="text-left visitedlyr">
2. <a href="https://www.azlyrics.com/a/asapmob.html" target="_blank"><b>Asap Mob</b></a></td>
</tr>
<tr><td class="text-left visitedlyr">
3. <a href="https://www.azlyrics.com/a/asapferg.html" target="_blank"><b>Asap Ferg</b></a></td>
</tr>
</table>
</div>
<div class="panel">
<div class="panel-heading"><b>Song results:</b><br><small>[1-5 of 454 total <span class="text-lowercase">Songs</span> found]</small></div>
<table class="table table-condensed">
<tr><td class="text-left visitedlyr">
1. <a href="https://www.azlyrics.com/lyrics/ti/asap.html" target="_blank"><b>ASAP</b></a> by <b>T.I.</b><br>
<small>[Intro] <strong>Asap</strong>, <strong>asap</strong>, <strong>asap</strong> <strong>Asap</strong>, <strong>asap</strong>, <strong>asap</strong> Ay, ay, ay, ay, ay, you niggaz better exit <strong>Asap</strong>, <strong>asap</strong>, <strong>asap</strong>, <strong>asap</strong> Ay-s, ay-p, ay-s, ay-p <strong>Asap</strong>, <strong>asap</strong>, <strong>asap</strong>, <strong>asap</strong> Ay-s, ay-p, ay-s, ay-p <strong>Asap</strong>, <strong>asap</strong>, <strong>asap</strong>, <strong>asap</strong> A-s-a-p, A-S-A-P [Verse 1] I'm on my grind, grand h...</small></td>
</tr>
<tr><td class="text-left visitedlyr">
2. <a href="https://www.azlyrics.com/lyrics/ericbellinger/asap.html" target="_blank"><b>ASAP</b></a> by <b>Eric Bellinger</b><br>
<small>ou say? Cause girl I need that <strong>asap</strong> [Hook] Girl I need that <strong>asap</strong>, <strong>asap</strong>, <strong>asap</strong>, I need Girl I need that <strong>asap</strong>, <strong>asap</strong>, <strong>asap</strong>, I need Girl I need that <strong>asap</strong>, <strong>asap</strong>, <strong>asap</strong>, baby I need Girl I'm tryina taste that, taste that, I need Girl I need that <strong>asap</strong>, <strong>asap</strong>, ...</small></td>
</tr>
Best answer
Getting the artist results is fairly simple. First, find the <b>Artist results:</b> tag with soup.find('b', text='Artist results:'). Then use find_next('table') to find the table that holds the results.
artists_table = soup.find('b', text='Artist results:').find_next('table')
artists = [x.text for x in artists_table.find_all('a')]
print(artists)
# ['Asap Rocky', 'Asap Mob', 'Asap Ferg']
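The same lookup can be tried in isolation against a trimmed copy of the artist panel from the question. This is only a sketch: the helper name `results_table` is made up for illustration, and it uses `string=`, which is the name Beautiful Soup 4.4.0 and later give to the older `text=` argument. It also returns None rather than raising if the heading is missing.

```python
from bs4 import BeautifulSoup

# Trimmed copy of the artist panel from the question.
HTML = """
<div class="panel">
<div class="panel-heading"><b>Artist results:</b><br><small>[1-3 of 3 total]</small></div>
<table class="table table-condensed">
<tr><td>1. <a href="https://www.azlyrics.com/a/asaprocky.html"><b>Asap Rocky</b></a></td></tr>
<tr><td>2. <a href="https://www.azlyrics.com/a/asapmob.html"><b>Asap Mob</b></a></td></tr>
<tr><td>3. <a href="https://www.azlyrics.com/a/asapferg.html"><b>Asap Ferg</b></a></td></tr>
</table>
</div>
"""


def results_table(soup, label):
    """Hypothetical helper: first <table> after the <b> whose text equals label, or None."""
    heading = soup.find("b", string=label)
    return heading.find_next("table") if heading is not None else None


soup = BeautifulSoup(HTML, "html.parser")
table = results_table(soup, "Artist results:")
artists = [a.get_text(strip=True) for a in table.find_all("a")] if table else []
print(artists)
# ['Asap Rocky', 'Asap Mob', 'Asap Ferg']
```

Matching on the <b> tag works where the question's search on the panel-heading div does not, because the text filter compares against a tag's single string, and that div has several children.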
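To get the song results, use the same approach to find the table. To get the text you want, however, the extraction has to change a little: each cell holds two <b> tags (the title and the artist), so their text is joined with ' by '.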
songs_table = soup.find('b', text='Song results:').find_next('table')
songs = [' by '.join(b.text for b in td.find_all('b')) for td in songs_table.find_all('td')]
print(songs)
# ['ASAP by T.I.', 'ASAP by Eric Bellinger']
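If the title, artist and link are needed separately instead of as one joined string, a variation along these lines may help. It is a sketch under the assumption that every result cell has the shape shown in the question (an <a> wrapping the title, followed by a sibling <b> with the artist); the trimmed HTML and the tuple layout are illustrative choices, not part of the original answer.

```python
from bs4 import BeautifulSoup

# Trimmed copy of the song panel from the question (lyric snippets shortened).
HTML = """
<div class="panel">
<div class="panel-heading"><b>Song results:</b><br><small>[1-5 of 454 total]</small></div>
<table class="table table-condensed">
<tr><td>1. <a href="https://www.azlyrics.com/lyrics/ti/asap.html"><b>ASAP</b></a> by <b>T.I.</b><br>
<small>[Intro] <strong>Asap</strong>, <strong>asap</strong> ...</small></td></tr>
<tr><td>2. <a href="https://www.azlyrics.com/lyrics/ericbellinger/asap.html"><b>ASAP</b></a> by <b>Eric Bellinger</b><br>
<small>Girl I need that <strong>asap</strong> ...</small></td></tr>
</table>
</div>
"""

soup = BeautifulSoup(HTML, "html.parser")
songs_table = soup.find("b", string="Song results:").find_next("table")

songs = []
for td in songs_table.find_all("td"):
    link = td.find("a")  # the <a> wraps the song title
    if link is None:
        continue  # skip cells without a result link
    artist = link.find_next_sibling("b")  # the <b> right after " by "
    if artist is None:
        continue
    songs.append((link.get_text(strip=True), artist.get_text(strip=True), link["href"]))

for title, artist, url in songs:
    print(f"{title} by {artist} -> {url}")
# ASAP by T.I. -> https://www.azlyrics.com/lyrics/ti/asap.html
# ASAP by Eric Bellinger -> https://www.azlyrics.com/lyrics/ericbellinger/asap.html
```

The highlighted lyric words sit in <strong> tags, not <b>, so neither this loop nor the join in the answer picks them up.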
For more on python - BeautifulSoup text in a tag nested inside another tag, see the similar question on Stack Overflow: https://stackoverflow.com/questions/49828707/