Problem description
I am trying to use the BeautifulSoup (bs4) package in Python 2.7 to find the following tag in an HTML document:
<div style="position:absolute; border: textbox 1px solid; writing-mode:lr-tb; left:408px; top:540px; width:14px; height:9px;"><span style="font-family: OEULZL+ArialMT; font-size:9px">0.00<br></span></div>
The HTML document contains multiple other tags that are almost exactly identical. The only consistent difference is the "left:408px" and "height:9px" style properties.
How can I find this tag using BeautifulSoup?
I tried the following:
from bs4 import BeautifulSoup as bs
soup = bs('<div style="position:absolute; border: textbox 1px solid; writing-mode:lr-tb; left:408px; top:540px; width:14px; height:9px;"><span style="font-family: OEULZL+ArialMT; font-size:9px">0.00<br></span></div>', 'html.parser')
soup.find_all('div', style=('left:408px' and 'height:9px'))
soup.find_all('div', style=('left:408px') and style=('height:9px')) #doesn't like style being used twice
soup.find_all('div', {'left':'408px' and 'height':'9px'})
soup.find_all('div', {'left:408px'} and {'height:9px'})
soup.find_all('div', style={'left':'408px' and 'height':'9px'})
soup.find_all('div', style={'left:408px'} and {'height:9px'})
Any ideas?
Recommended answer
You can check that the style attribute contains both left:408px and height:9px:
soup.find('div', style=lambda value: value and 'left:408px' in value and 'height:9px' in value)
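The attempts in the question fail because Python evaluates an expression such as ('left:408px' and 'height:9px') before BeautifulSoup ever sees it: `and` between two non-empty strings returns the second one, so only 'height:9px' is passed as the filter, and that string is then compared against the whole style value. A minimal sketch in plain Python (no bs4 needed; has_both_styles is a hypothetical name for the lambda above):

```python
# `and` returns its last operand when all operands are truthy,
# so the first condition is silently dropped.
style_filter = ('left:408px' and 'height:9px')
print(style_filter)  # height:9px

def has_both_styles(value):
    """Same logic as the lambda: True only if both fragments occur in the style."""
    # value is None for tags without a style attribute, hence the guard.
    return bool(value) and 'left:408px' in value and 'height:9px' in value

target = ("position:absolute; border: textbox 1px solid; writing-mode:lr-tb; "
          "left:408px; top:540px; width:14px; height:9px;")
other = target.replace('left:408px', 'left:300px')

print(has_both_styles(target))  # True
print(has_both_styles(other))   # False
print(has_both_styles(None))    # False
```

The named function can be passed the same way as the lambda, for example soup.find_all('div', style=has_both_styles).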
Or:
import re
soup.find('div', style=re.compile(r'left:408px.*?height:9px'))
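One caveat with this pattern: it only matches when left:408px appears before height:9px in the attribute value. If the order of the properties can vary, lookaheads make the check order-independent. A sketch using only the re module, tested on plain style strings; the names BOTH_STYLES and ordered_only are made up for this example:

```python
import re

# Each lookahead scans forward from the current position, so the two
# fragments may appear in either order.
BOTH_STYLES = re.compile(r'(?=.*left:408px)(?=.*height:9px)')

# The original pattern, kept for comparison.
ordered_only = re.compile(r'left:408px.*?height:9px')

in_order = "left:408px; top:540px; width:14px; height:9px;"
reversed_order = "height:9px; width:14px; top:540px; left:408px;"
only_one = "left:408px; top:540px; width:14px; height:12px;"

print(bool(ordered_only.search(reversed_order)))  # False
print(bool(BOTH_STYLES.search(in_order)))         # True
print(bool(BOTH_STYLES.search(reversed_order)))   # True
print(bool(BOTH_STYLES.search(only_one)))         # False
```

Since BeautifulSoup applies a compiled pattern with its search() method, soup.find('div', style=BOTH_STYLES) should behave the same way on real tags.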
Or:
soup.select_one('div[style*="408px"]')
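All three variants do substring matching, so "408px" would also match width:408px, and "left:408px" would also match margin-left:408px while missing "left: 408px" written with a space. Where that matters, a stricter option is to parse the style string into a dict and compare values exactly. The sketch below is only an illustration: parse_style, matches_style and is_target are hypothetical helper names, and the parser is deliberately naive (it splits on ";" and ":" and does not handle those characters inside values such as url(...)).

```python
def parse_style(value):
    """Turn 'left:408px; height:9px;' into {'left': '408px', 'height': '9px'}."""
    props = {}
    for declaration in (value or '').split(';'):
        name, sep, prop_value = declaration.partition(':')
        if sep:  # skip empty fragments such as the one after the final ';'
            props[name.strip().lower()] = prop_value.strip()
    return props

def matches_style(value, **wanted):
    """True if every wanted property has exactly the given value."""
    props = parse_style(value)
    return all(props.get(name) == expected for name, expected in wanted.items())

def is_target(value):
    # Filter for the tag from the question: both properties must match exactly.
    return matches_style(value, left='408px', height='9px')

print(is_target("left: 408px; top:540px; height:9px;"))        # True
print(is_target("margin-left:408px; top:540px; height:9px;"))  # False
print(is_target(None))                                         # False
```

Like the lambda, is_target can be passed as the attribute filter: soup.find_all('div', style=is_target).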
Note that, in general, style properties are not a reliable way to locate elements. See if there is anything else to go on: check the parent and sibling elements, or maybe there is a corresponding label near the element.
Note that a more appropriate CSS selector would be div[style*="left:408px"][style*="height:9px"], but because of the limited CSS selector support and this bug, it will not work as is.