I can parse my_str with the following regular expression:
([\w\s]*)\s(\w+)
But I would like to use pyparsing instead.
How can I do that?
my_str = "aa234"
expected_result = ["aa234", ""]
my_str = "aa234 bbb2b ccc ddd eee"
expected_result = ["aa234 bbb2b ccc ddd", "eee"]
my_str = "aa234 bbb2b ccc ddd eee fff ggg hhh"
expected_result = ["aa234 bbb2b ccc ddd eee fff ggg", "hhh"]
Best answer
Here is a parser for your examples:
from pyparsing import (LineEnd, Optional, Word, ZeroOrMore,
                       alphanums, alphas, originalTextFor)
stringWord = Word(alphas, alphanums)
# only want words not at the end of the string for the leading part
leadingWord = stringWord + ~LineEnd()
leadingPart = originalTextFor(stringWord + ZeroOrMore(leadingWord))
# define parser, with named results, similar to named groups in a regex
parser = leadingPart("first") + Optional(stringWord, default='')("second")
Here it is in action:
tests = ["aa234",
         "aa234 bbb2b ccc ddd eee ",]
for test in tests:
    results = parser.parseString(test)
    # dump() shows the token list followed by the named results
    print(results.dump())
    print(results.first)
    print(results.second)
Prints:
['aa234', '']
- first: aa234
- second:
aa234
['aa234 bbb2b ccc ddd', 'eee']
- first: aa234 bbb2b ccc ddd
- second: eee
aa234 bbb2b ccc ddd
eee
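If you want the plain two-element list from the question rather than a ParseResults object, the parser above can be wrapped in a small function. This is only a sketch: split_last_word is a made-up helper name, not part of pyparsing, and it assumes the input always starts with at least one word.

```python
from pyparsing import (LineEnd, Optional, Word, ZeroOrMore,
                       alphanums, alphas, originalTextFor)

# Same grammar as in the answer above.
string_word = Word(alphas, alphanums)
# A leading word is any word that is not followed by the end of the line.
leading_word = string_word + ~LineEnd()
leading_part = originalTextFor(string_word + ZeroOrMore(leading_word))
parser = leading_part("first") + Optional(string_word, default="")("second")


def split_last_word(text):
    """Hypothetical helper: return [leading words, last word] as strings."""
    results = parser.parseString(text)
    # Attribute access on the named results, as in the answer above.
    return [results.first, results.second]


for sample in ["aa234",
               "aa234 bbb2b ccc ddd eee",
               "aa234 bbb2b ccc ddd eee fff ggg hhh"]:
    print(split_last_word(sample))
```

Note that when the input is a single word, that word ends up in the first slot and the second slot gets the default '', which matches the first expected_result in the question.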