re.search much slower than grep on some regular expressions

Question

What could cause the large difference between re.search and grep?

This script takes about 5 minutes to run on my computer:

    #!/usr/bin/env python
    import re

    # Build a single line of 156000 "a" characters.
    row = ""
    for a in range(156000):
        row += "a"
    print(re.search('[^ "=]*/', row))

A simple grep, by contrast:

    grep '[^ "=]*/' input

(where input contains 156000 "a" characters on one line) doesn't even take a second.

Is this a bug in Python?

Thanks...
Henning Thornblad

Recommended answer

Please re-read your Python code carefully. Don't you think there's a
subtle difference between reading a file and building 156000 string objects?

Mmm... That aside, after testing it (building the string in a
somewhat more efficient way), the call to re.search does indeed take
ages to return. Please forget my previous post.
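
For anyone who wants to reproduce this without waiting minutes, here is a minimal sketch, assuming Python 3 and much shorter lines than the original 156000 characters. It builds the line in one step and times only the re.search call, so string construction is ruled out as the cause. The name time_search and the sizes are made up for illustration. Doubling the length should make the search take very roughly four times as long, which would be consistent with the matcher retrying from every start position and scanning to the end of the line each time.

```python
import re
import time

# Same pattern as in the question, compiled once.
PATTERN = re.compile(r'[^ "=]*/')


def time_search(n):
    """Return (match, seconds) for searching a line of n 'a' characters."""
    row = "a" * n  # built in one step, not by n appends
    start = time.perf_counter()
    match = PATTERN.search(row)  # no "/" in row, so this ends up as None
    return match, time.perf_counter() - start


if __name__ == "__main__":
    # Hypothetical sizes, kept small so the script finishes quickly.
    for n in (2000, 4000, 8000):
        match, seconds = time_search(n)
        print(n, match, round(seconds, 4))
```

Exact timings depend on the machine and the Python version, so treat the numbers as a trend only.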

grep uses a smarter algorithm ;)
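
To illustrate that a single pass over the line is enough for this particular pattern, here is a hand-written sketch. It is not grep's implementation and not a general regex engine. The function linear_search is a hypothetical helper that only handles the pattern from the question. It tracks the start of the current run of allowed characters and the last "/" seen in that run, which is where a greedy match would end.

```python
import re

EXCLUDED = ' "='  # characters rejected by the class [^ "=]


def linear_search(row):
    """Return the (start, end) span that re.search(r'[^ "=]*/', row)
    would report, or None if there is no match. One pass over row."""
    seg_start = 0    # start of the current run of allowed characters
    last_slash = -1  # index of the last "/" seen in the current run
    for i, ch in enumerate(row):
        if ch in EXCLUDED:
            if last_slash >= 0:
                # The run that just ended contains a "/": leftmost match found.
                return seg_start, last_slash + 1
            seg_start = i + 1  # a new run starts after the excluded character
        elif ch == "/":
            last_slash = i
    if last_slash >= 0:
        return seg_start, last_slash + 1
    return None


if __name__ == "__main__":
    # The 156000-character line from the question has no "/" at all.
    print(linear_search("a" * 156000))
    # Compare against the regex on a short input.
    sample = "x=y/z"
    print(linear_search(sample), re.search(r'[^ "=]*/', sample).span())
```

The point is only that the answer can be found without restarting the scan from every position, which is the kind of work a smarter matcher avoids.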

You could call this a performance bug, but it's not common enough in real
code to get the necessary brain cycles from the core developers.
So you can either write a patch yourself or use a workaround.

re.search('[^ "=]*/', row) if "/" in row else None

might be good enough.

Peter
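
The workaround above can be wrapped in a small function. This is a minimal sketch, assuming Python 3. The name guarded_search is made up for illustration. It relies on the fact that the pattern can only match if the line contains a literal "/", so a cheap substring test can skip the regex entirely.

```python
import re

PATTERN = re.compile(r'[^ "=]*/')


def guarded_search(row):
    """Run the regex only if the line contains the "/" the pattern requires."""
    if "/" not in row:  # plain substring test, a single pass over the line
        return None
    return PATTERN.search(row)


if __name__ == "__main__":
    print(guarded_search("a" * 156000))     # returns without running the regex
    print(guarded_search("ab/cd").group())  # the regex runs as usual
```

Note that the guard only covers lines with no "/" at all. A line that has a long run without "/" followed by a separator and then a "/" would still go through the slow path.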
