Removing the style attribute from specific tags with BeautifulSoup/Python
Question
Let's say I have a soup and I'd like to remove the style attribute from all the paragraphs. So I'd like to turn <p style='blah' id='bla' class=...> into <p id='bla' class=...> throughout the entire soup. But I don't want to touch, say, <img style='...'> tags. How would I do this?
Recommended answer
The idea is to iterate over all p tags using find_all('p') and remove the style attribute from each one:
from bs4 import BeautifulSoup

data = """
<body>
    <p style='blah' id='bla1'>paragraph1</p>
    <p style='blah' id='bla2'>paragraph2</p>
    <p style='blah' id='bla3'>paragraph3</p>
    <img style="awesome_image"/>
</body>"""

soup = BeautifulSoup(data, 'html.parser')

# Visit only <p> tags, so the style attribute on <img> is left alone
for p in soup.find_all('p'):
    if 'style' in p.attrs:
        del p.attrs['style']

print(soup.prettify())
Prints:
<body>
 <p id="bla1">
  paragraph1
 </p>
 <p id="bla2">
  paragraph2
 </p>
 <p id="bla3">
  paragraph3
 </p>
 <img style="awesome_image"/>
</body>
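
The same idea can be wrapped in a small reusable function. The sketch below is only an illustration: the name strip_attribute is hypothetical and not part of BeautifulSoup, and the sample HTML is made up. It relies on find_all accepting True as an attribute value, which matches only tags that actually have that attribute, so the membership check inside the loop is no longer needed.

```python
from bs4 import BeautifulSoup


def strip_attribute(soup, tag_name, attribute):
    """Remove `attribute` from every `tag_name` tag in `soup`, in place.

    Hypothetical helper for illustration; returns the number of tags changed.
    """
    # attrs={attribute: True} selects only tags that carry the attribute
    matches = soup.find_all(tag_name, attrs={attribute: True})
    for tag in matches:
        del tag[attribute]
    return len(matches)


# Made-up sample input: one styled <p>, one plain <p>, one styled <img>
html = "<p style='a' id='x'>one</p><p id='y'>two</p><img style='b'/>"
soup = BeautifulSoup(html, "html.parser")
changed = strip_attribute(soup, "p", "style")
print(changed)
print(soup)
```

Passing a different tag name or attribute, for example strip_attribute(soup, "img", "style"), would apply the same cleanup elsewhere without touching the paragraphs.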