This article shows how to use a regular expression to search for an image tag and get its src. It may be a useful reference for anyone solving the same problem.

Problem description

Suppose I have an HTML string containing the following snippet.

... <img class="employee thumb" src="http://localhost/services/employee1.jpg" /> ... 

I want to check whether this tag is present and, if so, get the src URL. The prefix <img class="employee thumb" can be used to uniquely identify the tag.

How can I do this in Python?

Solution

Using a regular expression:

>>> import re
>>> # Use the name s throughout; naming the variable str would shadow the built-in type.
>>> s = '<img class="employee thumb" src="http://localhost/services/employee1.jpg" />'
>>> if re.search('img class="employee thumb"', s):
...     print(re.findall('src="(.*?)"', s, re.DOTALL))
...
['http://localhost/services/employee1.jpg']
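Note that the two-step version above first checks for the tag and then collects every src in the whole string, so on a page with several images it would also return URLs from other tags. A minimal sketch of a single-pattern variant follows. It assumes the class attribute comes directly before src, separated only by whitespace, as in the question; the names IMG_SRC and find_employee_src are hypothetical and chosen for illustration.

```python
import re

# Assumption: class="employee thumb" is immediately followed by the src attribute.
# The capture group stops at the closing double quote of src.
IMG_SRC = re.compile(r'<img\s+class="employee thumb"\s+src="([^"]*)"')


def find_employee_src(html):
    """Return the src URL of the employee thumbnail, or None if the tag is absent."""
    match = IMG_SRC.search(html)
    return match.group(1) if match else None


html = '... <img class="employee thumb" src="http://localhost/services/employee1.jpg" /> ...'
print(find_employee_src(html))
print(find_employee_src('<p>no image here</p>'))
```

Returning None when nothing matches lets the caller test for the tag and read the URL in one call.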

Using lxml:

>>> from lxml import etree
>>> root = etree.fromstring("""
... <html>
...     <img class="employee thumb" src="http://localhost/services/employee1.jpg" />
... </html>
... """)
>>> # Select the src attribute by name instead of relying on attribute order.
>>> print(root.xpath("//img[@class='employee thumb']/@src")[0])
http://localhost/services/employee1.jpg
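etree.fromstring is an XML parser, so the lxml example above needs well-formed markup. As a hedged alternative that is not part of the original answer, the sketch below swaps in the standard library's html.parser, which tolerates unclosed tags and does not care about attribute order. The names EmployeeThumbParser and find_employee_sources are hypothetical.

```python
from html.parser import HTMLParser


class EmployeeThumbParser(HTMLParser):
    """Collects the src of every <img> whose class is exactly 'employee thumb'."""

    def __init__(self):
        super().__init__()
        self.sources = []

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) pairs; a dict makes lookup order-independent.
        attributes = dict(attrs)
        if tag == "img" and attributes.get("class") == "employee thumb":
            src = attributes.get("src")
            if src is not None:
                self.sources.append(src)


def find_employee_sources(html):
    """Return a list of matching src URLs (empty if the tag is absent)."""
    parser = EmployeeThumbParser()
    parser.feed(html)
    parser.close()
    return parser.sources


# Attribute order is reversed and the <p> tag is never closed.
html_doc = '<html><img src="http://localhost/services/employee1.jpg" class="employee thumb"><p>unclosed</html>'
print(find_employee_sources(html_doc))
```

An empty list means the tag is not present, so the same call answers both parts of the question.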

