This article shows how to use a regular expression to search for an image tag and get its src.
Problem description
Assume I have an HTML string containing the following snippet.
... <img class="employee thumb" src="http://localhost/services/employee1.jpg" /> ...
I want to check whether this tag is present and, if so, get the src URL. <img class="employee thumb"
can be used to uniquely identify the tag.
How can I do this in Python?
Solution
Using a regular expression:
>>> import re
>>> html = '<img class="employee thumb" src="http://localhost/services/employee1.jpg" />'
>>> if re.search('img class="employee thumb"', html):
...     print(re.findall('src="(.*?)"', html, re.DOTALL))
...
['http://localhost/services/employee1.jpg']
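Note that the regular expression approach above collects every src in the string, not only the one inside the matching tag. A minimal sketch of a stricter variant follows, assuming attribute values are always double-quoted; the function name find_img_src and its class_value parameter are hypothetical, introduced only for illustration.

```python
import re

def find_img_src(html, class_value="employee thumb"):
    """Return the src of the first <img> tag whose class equals class_value, or None."""
    # Isolate each complete <img ...> tag first, so the src we read
    # belongs to the same tag as the class attribute we matched.
    for tag in re.findall(r'<img\b[^>]*>', html, re.IGNORECASE):
        if re.search(r'\sclass="%s"' % re.escape(class_value), tag):
            match = re.search(r'\ssrc="([^"]*)"', tag)
            if match:
                return match.group(1)
    return None

sample = ('<img class="logo" src="logo.png" /> '
          '<img class="employee thumb" src="http://localhost/services/employee1.jpg" />')
print(find_img_src(sample))  # http://localhost/services/employee1.jpg
```

Because the class and src checks run on one tag at a time, the order of the two attributes inside the tag does not matter.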
Using lxml:
>>> from lxml import etree
>>> root = etree.fromstring("""
... <html>
... <img class="employee thumb" src="http://localhost/services/employee1.jpg" />
... </html>
... """)
>>> print(root.xpath("//img[@class='employee thumb']/@src")[0])
http://localhost/services/employee1.jpg
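etree.fromstring expects well-formed XML, which real-world HTML often is not (for example an unclosed <img> or <br>). As a hedged alternative, here is a small sketch using the standard library's html.parser instead of lxml; the class name ImgSrcFinder and its sources attribute are hypothetical names chosen for this example, not part of the original answer.

```python
from html.parser import HTMLParser

class ImgSrcFinder(HTMLParser):
    """Collect the src of every <img> tag whose class equals class_value."""

    def __init__(self, class_value):
        super().__init__()
        self.class_value = class_value
        self.sources = []

    def handle_starttag(self, tag, attrs):
        # Also called for self-closing tags such as <img ... />.
        if tag != "img":
            return
        attributes = dict(attrs)
        src = attributes.get("src")
        if attributes.get("class") == self.class_value and src:
            self.sources.append(src)

finder = ImgSrcFinder("employee thumb")
# The <img> and <br> below are deliberately left unclosed.
finder.feed('<html><img class="employee thumb" '
            'src="http://localhost/services/employee1.jpg"><br></html>')
print(finder.sources)  # ['http://localhost/services/employee1.jpg']
```

An empty sources list means the tag was not found, which covers the "is this tag present" part of the question.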