Problem description
What can be the cause of the large difference between re.search and
grep?
This script takes about 5 minutes to run on my computer:
#!/usr/bin/env python
import re
row = ""
for a in range(156000):
    row += "a"
print(re.search('[^ "=]*/', row))
While doing a simple grep:
grep '[^ "=]*/' input (input contains 156,000 a's in
one row)
doesn't even take a second.
Is this a bug in Python?
Thanks...
Henning Thornblad
Recommended answer
Please re-read your Python code carefully. Don't you think there's a
subtle difference between reading a file and building 156000 string objects?
Mmm... That aside, after testing it (building the string in a
somewhat more efficient way), the call to re.search does indeed take
ages to return. Please forget my previous post.
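For anyone who wants to reproduce this without waiting minutes, here is a rough sketch, assuming Python 3. It builds the row with string repetition instead of a loop and times the search at a few smaller sizes. The helper name time_search and the sizes are my own choices for illustration, not anything from the original script.

```python
import re
import time

# Same pattern as in the question, compiled once.
PATTERN = re.compile(r'[^ "=]*/')

def time_search(n):
    """Build a row of n 'a' characters and time a single search on it."""
    row = "a" * n  # one allocation instead of n concatenations
    start = time.perf_counter()
    result = PATTERN.search(row)
    elapsed = time.perf_counter() - start
    return result, elapsed

# Illustrative sizes only; the original input had 156000 characters.
for n in (1000, 2000, 4000):
    result, elapsed = time_search(n)
    print(n, result, round(elapsed, 4))
```

If the time grows much faster than the input size as n doubles, the cost is in the search itself and not in how the string was built.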
grep uses a smarter algorithm ;)
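To make "smarter" a bit more concrete: as far as I understand it, Python's re is a backtracking matcher. On a row with no "/", it tries every start position, lets [^ "=]* greedily consume the rest of the row, then backs off one character at a time looking for the "/". The sketch below is a deliberately simplified model of that behaviour for this one pattern, not the real engine; count_steps and its parameters are hypothetical names.

```python
def count_steps(text, excluded=' "=', terminator='/'):
    """Count character tests made by a naive backtracking match of
    [^excluded]*terminator. A toy model, not Python's actual re engine."""
    steps = 0
    n = len(text)
    for start in range(n + 1):
        # Greedy phase: consume as many non-excluded characters as possible.
        end = start
        while end < n and text[end] not in excluded:
            end += 1
            steps += 1
        # Backtracking phase: give characters back one by one and test
        # whether the terminator follows.
        pos = end
        while pos >= start:
            steps += 1
            if pos < n and text[pos] == terminator:
                return steps, (start, pos + 1)
            pos -= 1
    return steps, None

for n in (10, 20, 40):
    print(n, count_steps("a" * n))
```

In this model the step count for a row of n a's is (n + 1) squared, so doubling the row roughly quadruples the work. A matcher that scans the line once, which is how I believe grep handles a pattern like this, avoids the repeated rescans.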
You could call this a performance bug, but it's not common enough in real
code to get the necessary brain cycles from the core developers.
So you can either write a patch yourself or use a workaround.
re.search('[^ "=]*/', row) if "/" in row else None
might be good enough.
Peter
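Peter's one-liner can be wrapped in a small helper if it is needed in more than one place. This is only a sketch, assuming Python 3; guarded_search and the sample strings are made up for illustration.

```python
import re

# Same pattern as in the question, compiled once.
PATTERN = re.compile(r'[^ "=]*/')

def guarded_search(row, required="/"):
    """Skip the regex entirely when the literal it must end with is absent."""
    if required not in row:  # cheap single pass over the string
        return None
    return PATTERN.search(row)

no_slash = "a" * 156000    # the slow case from the question
with_slash = 'key="a/b"'   # hypothetical input that does contain a slash

print(guarded_search(no_slash))
print(guarded_search(with_slash).group())
```

Note that the guard only covers rows with no "/" at all. A row that contains a "/" somewhere after a long run of other characters and a space, quote, or "=" could presumably still be slow.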