Is there a way to dynamically update the names of regular expression groups in Python?
For example, if the text is:
person 1: name1
person 2: name2
person 3: name3
...
person N: nameN
How can I name the groups "person1", "person2", "person3", ... and "personN" without knowing in advance how many people there are?
Best answer
No, but you can do something like this:
>>> import re
>>> p = re.compile(r'(?m)^(.*?)\s*:\s*(.*)$')
>>> text = '''person 1: name1
person 2: name2
person 3: name3
...
person N: nameN'''
>>> p.findall(text)
Output:
[('person 1', 'name1'), ('person 2', 'name2'), ('person 3', 'name3'), ('person N', 'nameN')]
A quick explanation:
(?m) # enable multi-line mode
^ # match the start of a new line
(.*?) # lazily (non-greedily) match zero or more chars and store them in match group 1
\s*:\s* # match a colon, optionally surrounded by whitespace chars
(.*) # match the rest of the line and store it in match group 2
$ # match the end of the line
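If the goal is to look up each name by a label such as "person1", one option is to turn the matched pairs into a dictionary. The following is only a minimal sketch: the variable names `pattern` and `people`, and the convention of stripping whitespace from the captured label so that "person 1" becomes "person1", are assumptions made for illustration.

```python
import re

# Same pattern as above, written as a raw string.
pattern = re.compile(r'(?m)^(.*?)\s*:\s*(.*)$')

text = '''person 1: name1
person 2: name2
person 3: name3
...
person N: nameN'''

# Hypothetical convention: remove whitespace from the captured label,
# so "person 1" becomes the key "person1". The "..." line has no colon
# and is therefore skipped by the pattern.
people = {
    re.sub(r'\s+', '', label): name
    for label, name in pattern.findall(text)
}

print(people['person1'])
print(people['personN'])
```

This keeps the regex itself fixed at two groups and moves the "dynamic naming" into ordinary Python data, so the number of people does not need to be known beforehand.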
References
Multi-line mode: http://www.regular-expressions.info/modifiers.html
Greedy/non-greedy matching: http://www.regular-expressions.info/repeat.html
Match groups: http://www.regular-expressions.info/brackets.html