python - Parsing nested groups (quoted strings) in LaTeX with pyparsing

I want to parse possibly nested groups in a LaTeX file:

import pyparsing as pp
# QuotedString treats '{' ... '}' as a quoted span; it does not track nesting
qs = pp.QuotedString(quoteChar='{', endQuoteChar='}')
s = r'''{ This is a \textbf{\texttt{example}} of \textit{some $\mb{y}$ text} to parse.}'''
print(qs.parseString(s))

But this is wrong: it stops at the first closing brace. The output is:

([' This is a \\textbf{\\texttt{example'], {})

If all I want is the groups, how can I get a result that I can iterate over? I am thinking of a return value like this:

{ This is a \textbf{\texttt{example}} of \textit{some $\mb{y}$ text} to parse.}
{\texttt{example}}
{example}
{some $\mb{y}$ text}
{y}

The use case is testing LaTeX source files for common markup errors.

Best answer

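The key here is that nested braces must be matched with their own closing braces. The grammar you wrote does indeed stop at the first closing brace rather than at the matching one. The solution is to define a grammar in which a new opening brace is matched as the start of another section.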

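import pyparsing as pp

allSections = []

def rememberSection(m):
    # Parse action: join the tokens of one {...} group and remember the result
    allSections.append(''.join(m))

# Any run of printable or whitespace characters that contains no braces
other = pp.Word(pp.printables.replace('{', '').replace('}', '') + ' \t\r\n')
# Forward declaration, so that a section can contain further sections
section = pp.Forward()
section <<= ('{' + pp.OneOrMore(other | section) + '}').setParseAction(rememberSection)

s = r'''{ This is a \textbf{\texttt{example}} of \textit{some $\mb{y}$ text} to parse.}'''
print(section.parseString(s))
print(allSections)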

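This defines the content of a section as either anything other than a brace, or another section. Each opening brace is then matched with its corresponding closing brace. If the braces do not match, a pyparsing.ParseException is raised.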
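Since the use case is checking LaTeX sources for markup errors, the exception can be turned into a simple yes/no check. The following is only a minimal sketch built on the same grammar; the helper name braces_balanced is made up for illustration, and parseAll=True is added so that text left over after the outermost group also counts as an error.

```python
import pyparsing as pp

# Same grammar as above, without the parse action
other = pp.Word(pp.printables.replace('{', '').replace('}', '') + ' \t\r\n')
section = pp.Forward()
section <<= '{' + pp.OneOrMore(other | section) + '}'

def braces_balanced(text):
    """Hypothetical helper: True if text is exactly one balanced {...} group."""
    try:
        # parseAll=True makes trailing, unparsed input an error as well
        section.parseString(text, parseAll=True)
        return True
    except pp.ParseException:
        return False

print(braces_balanced(r'{\textbf{ok}}'))      # balanced
print(braces_balanced(r'{\textbf{missing}'))  # outer brace never closed
```

Note that because the grammar uses OneOrMore, an empty group such as {} is rejected too; switching to ZeroOrMore would accept it.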

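Normally, all tokens would be returned as one flat list, each token matching "{", "}", or a run of other non-brace characters. Since we want every braced expression to be remembered, the parse action here appends each one to an external list. I am not sure there is a cleaner way to handle it, but this builds the allSections list containing the groups you want.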
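One possible alternative to the external list, offered only as a sketch and not as part of the original answer, is to wrap each section in pp.Group so that the parse result itself is nested, and then walk that nested list. The helper names group_text, iter_groups and find_groups are invented for this example. Groups come out outermost first, in the order their opening braces appear. Be aware that pyparsing skips leading whitespace before each token by default, so whitespace directly after a brace may be dropped from the rebuilt strings.

```python
import pyparsing as pp

other = pp.Word(pp.printables.replace('{', '').replace('}', '') + ' \t\r\n')
section = pp.Forward()
# Group wraps the tokens of each {...} match in their own nested list
section <<= pp.Group('{' + pp.OneOrMore(other | section) + '}')

def group_text(group):
    """Rebuild the text of one group from its (possibly nested) tokens."""
    return ''.join(group_text(t) if isinstance(t, list) else t for t in group)

def iter_groups(group):
    """Yield the text of a group, then of every group nested inside it."""
    yield group_text(group)
    for tok in group:
        if isinstance(tok, list):
            yield from iter_groups(tok)

def find_groups(text):
    """Hypothetical helper: list all brace groups in text, outermost first."""
    tree = section.parseString(text, parseAll=True).asList()
    return list(iter_groups(tree[0]))

for g in find_groups(r'{\textbf{\texttt{example}}}'):
    print(g)
```

This keeps the grammar free of side effects, at the cost of a second pass over the parse result.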

This question, "python - Parsing nested groups (quoted strings) in LaTeX with pyparsing", comes from Stack Overflow: https://stackoverflow.com/questions/18923701/