I have a list of URLs like this:

http://www.toto.com/bags/handbags/test1/
http://www.toto.com/bags/handbags/smt1/
http://www.toto.com/bags/handbags/test1/test2/
http://www.toto.com/bags/handbags/blabla1/blabla2/
http://www.toto.com/bags/handbags/smt1/smt2/
http://www.toto.com/bags/handbags/smt1/smt2/testing/
http://www.toto.com/bags/handbags/smt1/smt2/testing.html


What I want is to keep only URLs shaped like

http://www.toto.com/something/else/again/more


That depth is the limit: if a URL has more path segments than that, it should be rejected.

Can you help me? :)

Best answer

A suitable regular expression is:

^http://www\.toto\.com/(\w+/){4}$


Filtering example:

>>> import re
>>> for line in lines:
...     if re.match(r'^http://www\.toto\.com/(\w+/){4}$', line):
...         print(line)
...
http://www.toto.com/bags/handbags/test1/test2/
http://www.toto.com/bags/handbags/blabla1/blabla2/
http://www.toto.com/bags/handbags/smt1/smt2/
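
The same filter can be sketched as a small reusable function. This is only an illustration under a few assumptions: the names `STRICT`, `LENIENT` and `filter_urls` are made up for this example, and `LENIENT` is an extra variant (not part of the answer above) that also accepts a URL without the trailing slash, since the target URL in the question is written without one.

```python
import re

# Hypothetical names, for illustration only.
# STRICT: exactly four path segments, each followed by a slash.
# The dots in the host name are escaped so they match a literal ".".
STRICT = re.compile(r"http://www\.toto\.com/(?:\w+/){4}")

# LENIENT (assumption): same depth, but the final slash is optional.
LENIENT = re.compile(r"http://www\.toto\.com/\w+(?:/\w+){3}/?")


def filter_urls(urls, pattern=STRICT):
    """Return the URLs that match the pattern over their whole length."""
    # fullmatch() anchors both ends, so no ^ or $ is needed in the pattern.
    return [u for u in urls if pattern.fullmatch(u.strip())]


urls = [
    "http://www.toto.com/bags/handbags/test1/",
    "http://www.toto.com/bags/handbags/smt1/",
    "http://www.toto.com/bags/handbags/test1/test2/",
    "http://www.toto.com/bags/handbags/blabla1/blabla2/",
    "http://www.toto.com/bags/handbags/smt1/smt2/",
    "http://www.toto.com/bags/handbags/smt1/smt2/testing/",
    "http://www.toto.com/bags/handbags/smt1/smt2/testing.html",
]

selected = filter_urls(urls)
for url in selected:
    print(url)
```

Passing `pattern=LENIENT` keeps the same three URLs from this list and would also accept http://www.toto.com/something/else/again/more. Note that `\w` does not match hyphens or dots, so a segment such as `hand-bags` would be rejected by both patterns.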

Regarding "python - regex on specific URLs", a similar question was found on Stack Overflow: https://stackoverflow.com/questions/36973729/
