import re
line = "..12345678910111213141516171820212223"
regex = re.compile(r'((?:[a-zA-Z0-9])\1+)')
print ("not coming here")
matches = re.findall(regex,line)
print (matches)
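In the code above, I am trying to capture groups of repeated characters.
For example, I need answers like:
111
222
and so on.
But when I run the code above, I get this error: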
Traceback (most recent call last):
  File "First.py", line 3, in <module>
    regex = re.compile(r'((?:[a-zA-Z0-9])\1+)')
  File "C:\Users\bhatsubh\AppData\Local\Programs\Python\Python35\lib\re.py", line 224, in compile
    return _compile(pattern, flags)
  File "C:\Users\bhatsubh\AppData\Local\Programs\Python\Python35\lib\re.py", line 293, in _compile
    p = sre_compile.compile(pattern, flags)
  File "C:\Users\bhatsubh\AppData\Local\Programs\Python\Python35\lib\sre_compile.py", line 536, in compile
    p = sre_parse.parse(p, flags)
  File "C:\Users\bhatsubh\AppData\Local\Programs\Python\Python35\lib\sre_parse.py", line 829, in parse
    p = _parse_sub(source, pattern, 0)
  File "C:\Users\bhatsubh\AppData\Local\Programs\Python\Python35\lib\sre_parse.py", line 437, in _parse_sub
    itemsappend(_parse(source, state))
  File "C:\Users\bhatsubh\AppData\Local\Programs\Python\Python35\lib\sre_parse.py", line 778, in _parse
    p = _parse_sub(source, state)
  File "C:\Users\bhatsubh\AppData\Local\Programs\Python\Python35\lib\sre_parse.py", line 437, in _parse_sub
    itemsappend(_parse(source, state))
  File "C:\Users\bhatsubh\AppData\Local\Programs\Python\Python35\lib\sre_parse.py", line 524, in _parse
    code = _escape(source, this, state)
  File "C:\Users\bhatsubh\AppData\Local\Programs\Python\Python35\lib\sre_parse.py", line 415, in _escape
    len(escape))
sre_constants.error: cannot refer to an open group at position 16
Could someone please point out where I am going wrong?
Best answer
You (probably) want
([a-zA-Z0-9])\1+
See a demo on regex101.com.
In Python:
import re
line = "..12345678910111213141516171820212223"
regex = re.compile(r'([a-zA-Z0-9])\1+')
matches = [match.group(0) for match in regex.finditer(line)]
print (matches)
# ['111', '222']
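For anyone wondering why the original pattern fails and why the answer uses finditer rather than findall, here is a rough sketch. In the question's pattern, \1 sits inside group 1 itself, so the group is still open when the backreference is parsed. The names compile_error, only_groups, outer and via_findall are just illustrative, and the nested-group variant at the end is an alternative I am assuming fits the original intent, not something from the answer.

```python
import re

line = "..12345678910111213141516171820212223"

# The question's pattern: \1 appears before group 1 has been closed,
# so the regex is rejected at compile time.
try:
    re.compile(r'((?:[a-zA-Z0-9])\1+)')
    compile_error = None
except re.error as exc:
    compile_error = str(exc)

# The answer's pattern: capture one character, then require it to repeat.
pattern = re.compile(r'([a-zA-Z0-9])\1+')

# With exactly one capturing group, findall returns only that group,
# i.e. the single repeated character rather than the whole run.
only_groups = pattern.findall(line)

# finditer yields match objects, so group(0) gives the whole run.
full_runs = [m.group(0) for m in pattern.finditer(line)]

# Alternative sketch: keep an outer group for the whole run and point the
# backreference at the inner group (group 2), which is already closed.
outer = re.compile(r'(([a-zA-Z0-9])\2+)')
via_findall = [whole for whole, _char in outer.findall(line)]

print(compile_error)
print(only_groups)
print(full_runs)
print(via_findall)
```

If the nested-group version feels harder to read, sticking with finditer and group(0) as in the answer is probably the simpler choice.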
Regarding "python - How to capture repeated character sets using regex?", a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/46516128/