Problem description
I am trying to figure out how to find numbers that are not years (I'm defining a year as simply a number that is four digits wide).
For example, I want to pick up
1
12
123
But NOT
1234
in order to avoid dates (4 digits).
If the regex also picked up 12345, that is fine, but it is not necessary for solving this problem.
(Note: these requirements may seem odd. They are part of a larger solution that I am stuck with)
If lookbehind and lookahead are available, the following should work:
(?<!\d)(\d{1,3}|\d{5,})(?!\d)
Explanation:
(?<!\d) # Previous character is not a digit
(\d{1,3}|\d{5,}) # Between 1 and 3, or 5 or more digits, place in group 1
(?!\d) # Next character is not a digit
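The same three-part pattern can be written with Python's re.VERBOSE flag so that the comments live inside the regex itself. This is only a sketch: the names NON_YEAR and find_non_years are made up for illustration and are not part of the original answer.

```python
import re

# Same pattern as above, split across lines; re.VERBOSE ignores the
# whitespace and treats everything after '#' as a comment.
NON_YEAR = re.compile(r"""
    (?<!\d)             # previous character is not a digit
    (\d{1,3}|\d{5,})    # 1 to 3 digits, or 5 or more digits (group 1)
    (?!\d)              # next character is not a digit
""", re.VERBOSE)

def find_non_years(text):
    """Return every digit run in text that is not exactly four digits long."""
    return NON_YEAR.findall(text)

print(find_non_years("1 12 123 1234 12345"))
# ['1', '12', '123', '12345']
```

The two lookarounds are what stop the engine from matching only the first three digits of 1234: after consuming 123, the next character is still a digit, so the match is rejected.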
If you cannot use lookarounds, the following should work:
\b(\d{1,3}|\d{5,})\b
Explanation:
\b # Word boundary
(\d{1,3}|\d{5,}) # Between 1 and 3, or 5 or more digits, place in group 1
\b # Word boundary
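The two variants are not fully interchangeable, because \b treats letters and underscores as word characters, while the lookaround version only cares about neighbouring digits. A small comparison, assuming Python's re module and a sample string invented for this illustration:

```python
import re

# Lookaround variant: only neighbouring digits block a match.
lookaround = re.compile(r'(?<!\d)(\d{1,3}|\d{5,})(?!\d)')
# Word-boundary variant: neighbouring letters, digits or '_' block a match.
boundary = re.compile(r'\b(\d{1,3}|\d{5,})\b')

sample = 'abc123 x_45 1234 67890 v2'

lookaround_hits = lookaround.findall(sample)
boundary_hits = boundary.findall(sample)

print(lookaround_hits)  # ['123', '45', '67890', '2']
print(boundary_hits)    # ['67890']
```

If digits glued to letters (such as abc123 or v2) should count as numbers, the lookaround form is probably the better fit. If only standalone numbers are wanted, the \b form may be enough.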
Python example:
>>> import re
>>> regex = re.compile(r'(?<!\d)(\d{1,3}|\d{5,})(?!\d)')
>>> regex.findall('1 22 333 4444 55555 1234 56789')
['1', '22', '333', '55555', '56789']