I am trying to use pyparsing to match a multi-line string that can be continued the way Python string literals can:
Test = "This is a long " \
"string"
I cannot find a way to get pyparsing to recognize this. Here is what I have tried so far:
import pyparsing as pp
src1 = '''
Test("This is a long string")
'''
# The backslash is doubled so the text really contains one; a single
# backslash before the newline would be swallowed by the '''...''' literal.
src2 = '''
Test("This is a long " \\
"string")
'''
_lp = pp.Suppress('(')
_rp = pp.Suppress(')')
_str = pp.QuotedString('"', multiline=True, unquoteResults=False)
func = pp.Word(pp.alphas)
function = func + _lp + _str + _rp
print(src1)
print(function.parseString(src1))
print('-------------------------')
print(src2)
print(function.parseString(src2))  # raises ParseException
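For reference, the Python behaviour being imitated can be checked with the standard library alone. This is a minimal sketch and not part of the original question; the helper name `eval_literal` is made up for illustration.

```python
import ast


def eval_literal(source):
    """Evaluate a Python literal expression given as source text."""
    return ast.literal_eval(source)


# Adjacent string literals joined by a backslash continuation form one string.
continued = eval_literal('"This is a long " \\\n"string"')

# Inside parentheses the backslash is optional.
parenthesized = eval_literal('("This is a long "\n"string")')

print(continued)
print(parenthesized)
```

Both calls yield the same single string, which is the result the grammar is expected to reproduce.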
Best answer
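The problem is that a multi-line quoted string is not what you think it is. A multi-line quoted string is literally that: a single quoted string that contains newlines: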
import pyparsing as pp
src0 = '''
"Hello
 World
 Goodbye and go"
'''
pat = pp.QuotedString('"', multiline=True)
print(pat.parseString(src0))
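Parsing this string produces:
['Hello\n World\n Goodbye and go']
As far as I know, if you want a string that behaves like Python's continued string literals, you have to define it yourself: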
import pyparsing as pp
src1 = '''
Test("This is a long string")
'''
src2 = '''
Test("This is a long"
"string")
'''
src3 = '''
Test("This is a long" \\
"string")
'''
_lp = pp.Suppress('(')
_rp = pp.Suppress(')')
_str = pp.QuotedString('"')
# An optional backslash after each piece, dropped from the results
_slash = pp.Suppress(pp.Optional("\\"))
# One or more quoted pieces, joined into a single token
_multiline_str = pp.Combine(pp.OneOrMore(_str + _slash), adjacent=False)
func = pp.Word(pp.alphas)
function = func + _lp + _multiline_str + _rp
print(src1)
print(function.parseString(src1))
print('-------------------------')
print(src2)
print(function.parseString(src2))
print('-------------------------')
print(src3)
print(function.parseString(src3))
This produces the following output:
Test("This is a long string")
['Test', 'This is a long string']
-------------------------
Test("This is a long"
"string")
['Test', 'This is a longstring']
-------------------------
Test("This is a long" \
"string")
['Test', 'This is a longstring']
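The contribution of Combine in the grammar above can be seen by parsing the same pieces with and without it. This is a small sketch rather than part of the original answer; the names `quoted`, `continuation`, `pieces` and `sample` are chosen here for illustration, and it assumes pyparsing is installed.

```python
import pyparsing as pp

quoted = pp.QuotedString('"')
# Drop an optional trailing backslash so it never reaches the results.
continuation = pp.Suppress(pp.Optional("\\"))
pieces = pp.OneOrMore(quoted + continuation)

# The doubled backslash puts one literal backslash into the sample text.
sample = '"This is a long " \\\n"string"'

# Without Combine, each quoted piece is its own token.
separate = pieces.parseString(sample).asList()

# With Combine, the pieces are joined; adjacent=False allows whitespace
# (including the newline) between them.
combined = pp.Combine(pieces, adjacent=False).parseString(sample).asList()

print(separate)  # ['This is a long ', 'string']
print(combined)  # ['This is a long string']
```

Without `adjacent=False`, Combine would require the pieces to touch with no whitespace in between, so a continued string split across lines would not match.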
Note: The Combine class merges the individual quoted strings into a single unit, so they appear as one string in the output list. The backslash is suppressed so that it is not merged into the output string.
Source: Stack Overflow, https://stackoverflow.com/questions/20962093/