Removing the style attribute from specific tags with BeautifulSoup

Question

Let's say I have a soup and I'd like to remove the style attribute from every paragraph. That is, I'd like to turn <p style='blah' id='bla' class=...> into <p id='bla' class=...> throughout the soup, but without touching other tags such as <img style='...'>. How would I do this?

Recommended answer

The idea is to iterate over all p tags with find_all('p') and delete the style attribute from each one that has it:

from bs4 import BeautifulSoup


data = """
<body>
    <p style='blah' id='bla1'>paragraph1</p>
    <p style='blah' id='bla2'>paragraph2</p>
    <p style='blah' id='bla3'>paragraph3</p>
    <img style="awesome_image"/>
</body>"""


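# Parse the markup, then drop the style attribute from <p> tags only
soup = BeautifulSoup(data, 'html.parser')
for p in soup.find_all('p'):
    if 'style' in p.attrs:
        del p.attrs['style']

# print() is a function in Python 3
print(soup.prettify())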

This prints:

<body>
 <p id="bla1">
  paragraph1
 </p>
 <p id="bla2">
  paragraph2
 </p>
 <p id="bla3">
  paragraph3
 </p>
 <img style="awesome_image"/>
</body>
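
The same idea can be wrapped in a small reusable helper. The following is only a sketch: the function name strip_attribute and its parameters are made up for illustration and are not part of BeautifulSoup. It relies on find_all accepting an attrs filter whose value is True, which matches only tags that actually carry the attribute, so the explicit 'style' in p.attrs check is no longer needed.

```python
from bs4 import BeautifulSoup


def strip_attribute(soup, tag_name, attr_name):
    """Hypothetical helper: remove attr_name from every tag_name element.

    Returns the number of tags that were modified.
    """
    # attrs={attr_name: True} matches only tags that have the attribute set
    matches = soup.find_all(tag_name, attrs={attr_name: True})
    for tag in matches:
        del tag[attr_name]
    return len(matches)


html = """
<body>
    <p style='blah' id='bla1'>paragraph1</p>
    <p id='bla2'>paragraph2</p>
    <img style="awesome_image"/>
</body>"""

soup = BeautifulSoup(html, "html.parser")
changed = strip_attribute(soup, "p", "style")

print(changed)                            # 1: only the first <p> had a style
print(soup.find("p", id="bla1").attrs)    # {'id': 'bla1'}
print(soup.img.attrs)                     # {'style': 'awesome_image'}
```

Passing the tag name as an argument keeps the operation scoped to one kind of element, so the img tag keeps its style attribute, as in the loop above.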
