Problem description
I want to verify and then parse this string (in quotes):
string = "start: c12354, c3456, 34526; other stuff that I don't care about"
# Note that some codes begin with 'c'
I would like to verify that the string starts with 'start:' and ends with ';'. Afterward, I would like to have a regex parse out the codes. I tried the following Python re code:
import re

regx = r"start: (c?[0-9]+,?)+;"
reg = re.compile(regx)
matched = reg.search(string)
print(' matched.groups()', matched.groups())
I have tried different variations, but I can only get either the first or the last code, not a list of all three.
Or should I abandon using a regex?
Updated to reflect part of the problem space I neglected and to fix a string difference. Thanks for all the suggestions, and in such a short time.
Recommended answer
In Python's re module, this isn't possible with a single regular expression: each capture of a repeated group overwrites the previous capture of that same group, so only the last one survives. (In .NET, this would actually be possible, since the engine distinguishes between captures and groups.)
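To see the overwriting behaviour concretely, here is a minimal sketch. The `sample` string is a hypothetical variant of the one in the question with the spaces after the commas removed, so that the repeated group can actually match every code; the pattern is the one from the question.

```python
import re

# Hypothetical sample: same codes as in the question, but without spaces
# after the commas, so the repeated group matches all three codes.
sample = "start: c12354,c3456,34526; other stuff"

match = re.search(r"start: (c?[0-9]+,?)+;", sample)

# The group matched three times, but only the final repetition is kept.
last_only = match.groups()
print(last_only)  # ('34526',)
```

The whole list is matched, yet `groups()` reports only the last code, which is the behaviour described in the question.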
Your easiest solution is to first extract the part between start: and ; and then use a regular expression that returns all matches, not just a single match, using re.findall('c?[0-9]+', text).
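The two steps can be sketched roughly as follows. The helper name `parse_codes` and the outer pattern `start:([^;]*);` are illustrative assumptions, not part of the original answer; the outer pattern assumes the code list itself never contains a semicolon.

```python
import re

string = "start: c12354, c3456, 34526; other stuff that I don't care about"

def parse_codes(text):
    """Hypothetical helper: return the codes between 'start:' and the first ';'.

    Returns None when the text does not begin with 'start:' followed by a ';'.
    """
    # Step 1: validate the overall shape and capture everything up to the first ';'.
    outer = re.match(r"start:([^;]*);", text)
    if outer is None:
        return None
    # Step 2: pull every code out of the captured part.
    return re.findall(r"c?[0-9]+", outer.group(1))

codes = parse_codes(string)
print(codes)  # ['c12354', 'c3456', '34526']
```

Note that step 1 only checks the prefix and the terminating semicolon. If stray text between the codes should be rejected as well, the outer pattern would need to be stricter.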