This is the regex I wrote for Python:
^(?<!(<!--.))(http(s?):)?([\/|\.|\w|\s|-])*\.(?:jpg|gif|png)$
The current expression matches this:
/images/lol/hallo.png
But I need it to match this image URL:
/images/lol/hallo.png
and also this image URL, without the surrounding markup:
<img src="/images/lol/hallo.png" />
but not the ones that are commented out:
<!-- /images/lol/hallo.png -->
<!-- <img src="/images/lol/hallo.png" /> -->
Best answer
This should work:
<!--[\s\S]*?-->|(?P<url>(http(s?):)?\/?\/?[^,;" \n\t>]+?\.(jpg|gif|png))
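To see why this works, here is the same pattern spelled out with re.VERBOSE. This is only an annotated sketch: the name IMAGE_URL and the two sample strings are made up for illustration. The first alternative swallows a whole HTML comment without capturing anything, so the url group is only filled when the second alternative matches.

```python
import re

# Annotated spelling of the pattern above (same alternatives, same groups).
# IMAGE_URL is a hypothetical name used only for this sketch.
IMAGE_URL = re.compile(r'''
    <!--[\s\S]*?-->        # alternative 1: consume an entire HTML comment, capture nothing
    |                      # otherwise
    (?P<url>               # alternative 2: capture the image URL in the named group url
        (http(s?):)?       # optional scheme, http: or https:
        \/?\/?             # up to two leading slashes
        [^,;" \n\t>]+?     # URL body: stops at separators, quotes, whitespace or a closing bracket
        \.(jpg|gif|png)    # image extension
    )
''', re.VERBOSE)

# Both inputs match, but only the second one fills the url group.
for text in ('<!-- /a/hidden.png -->', '/a/shown.png'):
    m = IMAGE_URL.match(text)
    print(repr(text), '->', m.group('url'))
```

Because the comment alternative comes first and consumes everything up to the closing `-->`, the regex engine never gets a chance to start a URL match inside a comment.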
Test strings:
<img src="/images/lol/hallo.png" />
/images/lol/hallo.png
/images/lol/hallo.png
//example.com/images/lol/hallo.png
http://example.com/images/lol/hallo.png
https://example.com/images/lol/hallo.png
<!-- /images/lol/commented.png -->
<!-- <img src="/images/lol/commented2.png" /> -->
images/ui/paper-icon-1.png
/images/lol/hallo.png and more here /images/lol/hallo.png
Python code:
import re
x = '''
<img src="/images/lol/hallo.png" />
/images/lol/hallo.png
/images/lol/hallo.png
//example.com/images/lol/hallo.png
http://example.com/images/lol/hallo.png
https://example.com/images/lol/hallo.png
<!-- /images/lol/commented.png -->
<!-- <img src="/images/lol/commented2.png" /> -->
images/ui/paper-icon-1.png
/images/lol/hallo.png and more here /images/lol/hallo.png
'''
regexp = r'<!--[\s\S]*?-->|(?P<url>(http(s?):)?\/?\/?[^,;" \n\t>]+?\.(jpg|gif|png))'
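# findall returns one tuple of groups per match; item[0] is the "url" group,
# which is an empty string when the match was an HTML comment
result = [item[0] for item in re.findall(regexp, x) if item[0]]
for item in result:
    print(item)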
Demo: https://regex101.com/r/YmXo2Q/4
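If you would rather not depend on the group order (item[0]), one possible variant uses re.finditer and reads the named group directly. This is a minimal sketch under the same pattern; the function name extract_image_urls and the sample text are assumptions for illustration, not part of the original code.

```python
import re

# Same pattern as in the answer, compiled once.
PATTERN = re.compile(
    r'<!--[\s\S]*?-->|(?P<url>(http(s?):)?\/?\/?[^,;" \n\t>]+?\.(jpg|gif|png))'
)

def extract_image_urls(html):
    """Return image URLs found outside HTML comments, in document order."""
    # Matches that consumed a comment leave the url group as None and are skipped.
    return [m.group('url') for m in PATTERN.finditer(html) if m.group('url')]

sample = '''
<img src="/images/lol/hallo.png" />
<!-- <img src="/images/lol/commented2.png" /> -->
https://example.com/images/lol/hallo.png
'''
print(extract_image_urls(sample))
```

Reading m.group('url') keeps working even if more capture groups are later added in front of it, which the positional item[0] lookup would not.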