Question
I'm trying to find every 10-digit series of numbers within a larger series of numbers using re in Python 2.6.
I can easily grab the non-overlapping matches, but I want every match in the number series, including the overlapping ones. For example, in "123456789123456789" I should get the following list:
[1234567891,2345678912,3456789123,4567891234,5678912345,6789123456,7891234567,8912345678,9123456789]
I've found references to a "lookahead", but the examples I've seen only show pairs of digits rather than larger groupings, and I haven't been able to extend them beyond two digits.
Recommended answer
Use a capturing group inside a lookahead. The lookahead captures the text you're interested in, but the actual match is the zero-width substring just before the lookahead. Because each match consumes no characters, the matches themselves never overlap, even though the captured text does:
import re
s = "123456789123456789"
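# The lookahead (?=...) consumes no characters, so the scan moves forward one
# position at a time; group 1 captures the 10 digits starting at each position.
matches = re.finditer(r'(?=(\d{10}))', s)
results = [int(match.group(1)) for match in matches]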
# results:
# [1234567891,
# 2345678912,
# 3456789123,
# 4567891234,
# 5678912345,
# 6789123456,
# 7891234567,
# 8912345678,
# 9123456789]
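The same idea extends to any window width. The following is a minimal sketch, not part of the original answer: the helper name overlapping_windows and its width parameter are made up for illustration. It uses re.findall, which returns the contents of group 1 for each zero-width match, and it keeps the results as strings, on the assumption that leading zeros should be preserved (int() would drop them).

```python
import re

def overlapping_windows(digits, width):
    """Return every run of `width` consecutive digits, overlapping ones included.

    Hypothetical helper for illustration; results stay as strings so that
    windows starting with 0 are not shortened by int().
    """
    # Build the same lookahead pattern as above, with a variable width.
    pattern = re.compile(r'(?=(\d{%d}))' % width)
    # With exactly one capturing group, findall returns that group's text
    # for each (zero-width) match.
    return pattern.findall(digits)

windows = overlapping_windows("123456789123456789", 10)
print(len(windows))  # one window per starting position: 18 - 10 + 1 = 9
```

If the input is shorter than the requested width, the lookahead never succeeds and the function returns an empty list.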