I have a list of URLs like this:
http://www.toto.com/bags/handbags/test1/
http://www.toto.com/bags/handbags/smt1/
http://www.toto.com/bags/handbags/test1/test2/
http://www.toto.com/bags/handbags/blabla1/blabla2/
http://www.toto.com/bags/handbags/smt1/smt2/
http://www.toto.com/bags/handbags/smt1/smt2/testing/
http://www.toto.com/bags/handbags/smt1/smt2/testing.html
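I want to keep only the URLs of the form
http://www.toto.com/something/else/again/more
and nothing deeper: if a URL has more path segments than that, it should be rejected.
Can you help me? :)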
Best answer
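A suitable regular expression, which accepts exactly four path segments each followed by a slash, is:
^http://www\.toto\.com/(\w+/){4}$
Filtering example:
>>> import re
>>> for line in lines:
...     if re.match(r'^http://www\.toto\.com/(\w+/){4}$', line):
...         print(line)
...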
http://www.toto.com/bags/handbags/test1/test2/
http://www.toto.com/bags/handbags/blabla1/blabla2/
http://www.toto.com/bags/handbags/smt1/smt2/
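For reference, here is a minimal, self-contained sketch of the same filter for Python 3. It rests on two assumptions that are not in the original answer: the trailing slash is treated as optional (the target URL in the question has none), and the names DEPTH_FOUR and filter_urls are purely illustrative. Note that \w matches only letters, digits, and underscores, so a segment containing a hyphen or a dot is rejected.

```python
import re

# Assumption: exactly four path segments, trailing slash optional.
# DEPTH_FOUR and filter_urls are hypothetical names used for illustration.
DEPTH_FOUR = re.compile(r'http://www\.toto\.com/(?:\w+/){3}\w+/?')


def filter_urls(urls):
    """Return only the URLs whose path has exactly four word-character segments."""
    # fullmatch anchors the pattern at both ends, so no ^ or $ is needed.
    return [u for u in urls if DEPTH_FOUR.fullmatch(u.strip())]


urls = [
    "http://www.toto.com/bags/handbags/test1/",
    "http://www.toto.com/bags/handbags/smt1/",
    "http://www.toto.com/bags/handbags/test1/test2/",
    "http://www.toto.com/bags/handbags/blabla1/blabla2/",
    "http://www.toto.com/bags/handbags/smt1/smt2/",
    "http://www.toto.com/bags/handbags/smt1/smt2/testing/",
    "http://www.toto.com/bags/handbags/smt1/smt2/testing.html",
]

kept = filter_urls(urls)
for url in kept:
    print(url)
```

If the trailing slash should be mandatory, as in the pattern above, replacing (?:\w+/){3}\w+/? with (?:\w+/){4} gives the same behavior as the answer's expression.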
Regarding "python - regular expression for a specific URL", a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/36973729/