问题描述
运行此程序时出现段错误.看起来像调试打印,但是当我调试它时,我只会得到无尽的回溯循环.如果有人可以帮助我指出正确的方向,我将不胜感激.我也很感激,如果可能的话,任何提示/技巧来清理这个语法.
I'm getting a segfault when I run this. It looks like the debugprints, but when I debug it I just get an endless loop of backtrace.If anyone can help point me in the right direction, I'd appreciate it.I'd also appreciate, if possible any tips/tricks for cleaning up thisgrammar.
谢谢!
//code here:
/***
*I-EBNF parser
*
*This defines a grammar for BNF.
*/
//Speeds up compilation times.
//This is a relatively small grammar, this is useful.
#define BOOST_SPIRIT_NO_PREDEFINED_TERMINALS
#define BOOST_SPIRIT_QI_DEBUG
#include <boost/config/warning_disable.hpp>
#include <boost/spirit/include/qi.hpp>
#include <boost/spirit/include/phoenix_operator.hpp>
#include <boost/fusion/adapted.hpp>
#include <boost/fusion/support.hpp>
#include <vector>
#include <string>
#include <iostream>
namespace Parser
{
namespace qi = boost::spirit::qi;
namespace ascii = boost::spirit::ascii;
enum class RHSType
{
Terminal, Identifier
};
struct RHS
{
RHSType type;
std::string value;
};
struct Rule
{
std::string identifier; //lhs
std::vector<RHS> rhs;
};
}
//expose our structs to fusion:
BOOST_FUSION_ADAPT_STRUCT(
Parser::RHS,
(Parser::RHSType, type)
(std::string, value)
)
BOOST_FUSION_ADAPT_STRUCT(
Parser::Rule,
(std::string, identifier)
(std::vector<Parser::RHS>, rhs)
)
namespace Parser
{
typedef std::vector<Rule> RuleList;
//our grammar definition
template <typename Iterator>
struct Grammar: qi::grammar<Iterator, std::list<Rule>, ascii::space_type>
{
Grammar(): Grammar::base_type(rules)
{
qi::char_type char_;
letter = char_("a-zA-Z");
digit = char_('0', '9');
symbol = char_('[') | ']' | '[' | ']' | '(' | ')' | '<' | '>'
| '\'' | '\"' | '=' | '|' | '.' | ',' | ';';
character = letter | digit | symbol | '_';
identifier = letter >> *(letter | digit | '_');
terminal = (char_('\'') >> character >> *character >>
char_('\'')) | (char_('\"') >> character >> *character >> char_('\"'));
lhs = identifier;
rhs = terminal | identifier | char_('[') >> rhs >> char_(']')
| char_('{') >> rhs >> char_('}') | char_('(') >> rhs >> char_(')') |
rhs >> char_('|') >> rhs | rhs >> char_(',') >> rhs;
rule = identifier >> char_('=') >> rhs;
rules = rule >> *rule;
}
private:
qi::rule<Iterator, char(), ascii::space_type> letter, digit,
symbol, character;
qi::rule<Iterator, std::string(), ascii::space_type> identifier,
lhs, terminal;
qi::rule<Iterator, RHS, ascii::space_type> rhs;
qi::rule<Iterator, Rule, ascii::space_type> rule;
qi::rule<Iterator, std::list<Rule>, ascii::space_type> rules;
};
}
int main()
{
Parser::Grammar<std::string::const_iterator> parser;
boost::spirit::ascii::space_type space;
std::string input;
std::vector<std::string> output;
bool result;
while (std::getline(std::cin, input))
{
if (input.empty())
{
break;
}
std::string::const_iterator it, itEnd;
it = input.begin();
itEnd = input.end();
result = phrase_parse(it, itEnd, parser, space, output);
if (result && it == itEnd)
{
std::cout << "success" << std::endl;
}
}
return 0;
}
¹从[spirit-general]邮件列表中张贴的十字形: http://boost.2283326.n4.nabble.com/parser-segfault-tips-tricks-td4680336.html
推荐答案
2015年9月26日上午01:45,泰勒·利特菲尔德写道:
On 09/26/2015 01:45 AM, Littlefield, Tyler wrote:
首先,它不会编译.
它不应该编译,因为语法不会公开属性(您是说list<Rule>()
而不是list<Rule>
?).
It shouldn't compile, since the grammar doesn't expose an attribute (did you mean list<Rule>()
instead of list<Rule>
?).
但是您永远不能将其分配给output
变量(是std::vector<std::string>
?!?)
But you could never assign that to the output
variable (which is std::vector<std::string>
?!?)
同样,您在这里忘记了括号
Likewise you forget parentheses here
qi::rule<Iterator, RHS(), ascii::space_type> rhs;
qi::rule<Iterator, Rule(), ascii::space_type> rule;
qi::rule<Iterator, std::list<Rule>(), ascii::space_type> rules;
rhs规则具有无限的左递归:
The rhs rule has infinite left-recursion:
rhs = terminal
| identifier
| ('[' >> rhs >> ']')
| ('{' >> rhs >> '}')
| ('(' >> rhs >> ')')
| (rhs >> '|' >> rhs) // OOPS
| (rhs >> ',' >> rhs) // OOPS
;
这很可能可以解释崩溃的原因,因为它会导致堆栈溢出.
This likely explains the crashes because it leads to stackoverflow.
实时流录制(第1部分和第2部分)确切地显示了我首先清理语法并随后使事情真正可编译的步骤.
The livestream recording (part #1 and part #2) shows exactly the steps I took to cleanup the grammar first, and subsequently make things actually compile-worthy.
那里有很多工作:
- 清理:对插入使用隐式
qi::lit
([]
,{}
,()
和=
,|
,,
) - 使用kleene +代替
a >> *a
(不止一次) - 更喜欢
%
解析器来解析...列表 -
我不得不稍微绕开""RHS"周围的规则;在最后两个分支中有无限递归(请参见
// OOPS
).我通过引入纯"表达式规则(仅解析单个RHS
结构.它已将其重命名为Expression
.
- cleanup: use implicit
qi::lit
for the interpunction ([]
,{}
,()
and=
,|
,,
) - use kleene+ instead of
a >> *a
(in more than one occasion) - prefer the
%
parser to parse... lists I had to "wiggle" the rules around "RHS" a bit; There was the infinite recursion in the last two branches (see
// OOPS
). I fixed it by introducing a "pure" expression rule (which parses just a singleRHS
structure. I've renamed this type toExpression
.
列表"解析(识别由,
或|
分隔的表达式列表)被移到原始的rhs
规则中,我将其重命名为expr_list
以进行更详细的描述:
The "list" parsing (that recognizes lists of expressions separated by ,
or |
) is moved into the original rhs
rule, which I renamed to expr_list
to be more descriptive:
expression = qi::attr(Parser::ExprType::Terminal) >> terminal
| qi::attr(Parser::ExprType::Identifier) >> identifier
| qi::attr(Parser::ExprType::Compound) >> qi::raw [ '[' >> expr_list >> ']' ]
| qi::attr(Parser::ExprType::Compound) >> qi::raw [ '{' >> expr_list >> '}' ]
| qi::attr(Parser::ExprType::Compound) >> qi::raw [ '(' >> expr_list >> ')' ]
;
expr_list = expression % (char_("|,")) // TODO FIXME?
;
为了使合成的属性实际转换为RHS
(现在为:Expression
)类型,我们需要为第一个适应项实际上公开一个RHSType
(现在为:ExprType
)值.成员.您可以在上面的几行中看到我们为此目的使用qi::attr()
.
In order for the synthesized attributes to actually transform into the RHS
(now: Expression
) type, we need to actually expose a RHSType
(now: ExprType
) value for the first adapted member. You can see in the above lines that we used qi::attr()
for this purpose.
现代编译器和boost版本可以大大简化BOOST_FUSION_ADAPT_STRUCT
调用:
Modern compilers and boost versions can simplify the BOOST_FUSION_ADAPT_STRUCT
invocations considerably:
BOOST_FUSION_ADAPT_STRUCT(Parser::Expression, type, value)
BOOST_FUSION_ADAPT_STRUCT(Parser::Rule, identifier, expr_list)
我升级"了一些规则以成为 lexemes ,这意味着它们不服从船长.
I "upgraded" some rules to be lexemes, meaning they don't obey a skipper.
我还将船长更改为blank_type
,因为这不会跳过换行符.您可以使用它轻松地直接使用相同的语法来解析多行输入.做,例如:
I also changed the skipper to be blank_type
as that will not skip newline characters. You could use this to easily parse multi-line input directly with the same grammar. Do, e.g.:
rules = rule % qi::eol;
事不宜迟,这是一个有效的示例:
Without further ado, here is a working example:
#define BOOST_SPIRIT_DEBUG
#include <boost/fusion/adapted.hpp>
#include <boost/spirit/include/phoenix.hpp>
#include <boost/spirit/include/qi.hpp>
#include <iostream>
#include <string>
#include <vector>
namespace Parser
{
namespace qi = boost::spirit::qi;
namespace ascii = boost::spirit::ascii;
enum class ExprType { Terminal, Identifier, Compound };
static inline std::ostream& operator<<(std::ostream& os, ExprType type) {
switch (type) {
case ExprType::Terminal: return os << "Terminal";
case ExprType::Identifier: return os << "Identifier";
case ExprType::Compound: return os << "Compound";
}
return os << "(unknown)";
}
struct Expression { // TODO make recursive (see `boost::make_recursive_variant`)
ExprType type;
std::string value;
};
using ExprList = std::vector<Expression>;
struct Rule {
std::string identifier; // lhs
ExprList expr_list;
};
}
//expose our structs to fusion:
BOOST_FUSION_ADAPT_STRUCT(Parser::Expression, type, value)
BOOST_FUSION_ADAPT_STRUCT(Parser::Rule, identifier, expr_list)
namespace Parser
{
typedef std::list<Rule> RuleList;
//our grammar definition
template <typename Iterator>
struct Grammar: qi::grammar<Iterator, RuleList(), ascii::blank_type>
{
Grammar(): Grammar::base_type(rules)
{
qi::char_type char_;
symbol = char_("[][]()<>\'\"=|.,;_ ");
character = qi::alpha | qi::digit | symbol;
identifier = qi::alpha >> *(qi::alnum | char_('_'));
// TODO capture strings including interpunction(?)
terminal = ('\'' >> +(character - '\'') >> '\'')
| ('\"' >> +(character - '\"') >> '\"');
expression = qi::attr(Parser::ExprType::Terminal) >> terminal
| qi::attr(Parser::ExprType::Identifier) >> identifier
| qi::attr(Parser::ExprType::Compound) >> qi::raw [ '[' >> expr_list >> ']' ]
| qi::attr(Parser::ExprType::Compound) >> qi::raw [ '{' >> expr_list >> '}' ]
| qi::attr(Parser::ExprType::Compound) >> qi::raw [ '(' >> expr_list >> ')' ]
;
expr_list = expression % (char_("|,")) // TODO FIXME?
;
// above accepts mixed separators:
// a, b, c | d, e
//
// original accepted:
//
// a, b, [ c | d ], e
// a| b| [ c , d ]| e
// a| b| [ c | d ]| e
// a, b, [ c , d ], e
rule = identifier >> '=' >> expr_list;
//rules = rule % qi::eol; // alternatively, parse multi-line input in one go
rules = +rule;
BOOST_SPIRIT_DEBUG_NODES((rules)(rule)(expr_list)(expression)(identifier)(terminal))
}
private:
qi::rule<Iterator, Expression(), ascii::blank_type> expression;
qi::rule<Iterator, ExprList(), ascii::blank_type> expr_list;
qi::rule<Iterator, Rule(), ascii::blank_type> rule;
qi::rule<Iterator, RuleList(), ascii::blank_type> rules;
// lexemes:
qi::rule<Iterator, std::string()> terminal, identifier;
qi::rule<Iterator, char()> symbol, character;
};
}
int main() {
using It = std::string::const_iterator;
Parser::Grammar<It> parser;
boost::spirit::ascii::blank_type blank;
std::string input;
while (std::getline(std::cin, input))
{
if (input.empty()) {
break;
}
It it = input.begin(), itEnd = input.end();
Parser::RuleList output;
bool result = phrase_parse(it, itEnd, parser, blank, output);
if (result) {
std::cout << "success\n";
for (auto& rule : output) {
std::cout << "\ntarget: " << rule.identifier << "\n";
for (auto& rhs : rule.expr_list) {
std::cout << "rhs: " << boost::fusion::as_vector(rhs) << "\n";
}
}
} else {
std::cout << "parse failed\n";
}
if (it != itEnd)
std::cout << "remaining unparsed: '" << std::string(it, itEnd) << "\n";
}
}
打印输出:
success
target: assigned1
rhs: (Identifier some_var)
rhs: (Terminal a 'string' value)
rhs: (Compound [ 'okay', "this", is_another, identifier, { ( "nested" ) } ])
rhs: (Terminal done)
success
target: assigned2
rhs: (Compound { a })
并启用调试(#define BOOST_SPIRIT_DEBUG
):
<rules>
<try>assigned1 = some_var</try>
<rule>
<try>assigned1 = some_var</try>
<identifier>
<try>assigned1 = some_var</try>
<success> = some_var | "a 'st</success>
<attributes>[[a, s, s, i, g, n, e, d, 1]]</attributes>
</identifier>
<expr_list>
<try> some_var | "a 'stri</try>
<expression>
<try> some_var | "a 'stri</try>
<terminal>
<try>some_var | "a 'strin</try>
<fail/>
</terminal>
<identifier>
<try>some_var | "a 'strin</try>
<success> | "a 'string' value</success>
<attributes>[[s, o, m, e, _, v, a, r]]</attributes>
</identifier>
<success> | "a 'string' value</success>
<attributes>[[Identifier, [s, o, m, e, _, v, a, r]]]</attributes>
</expression>
<expression>
<try> "a 'string' value" </try>
<terminal>
<try>"a 'string' value" |</try>
<success> | [ 'okay', "this",</success>
<attributes>[[a, , ', s, t, r, i, n, g, ', , v, a, l, u, e]]</attributes>
</terminal>
<success> | [ 'okay', "this",</success>
<attributes>[[Terminal, [a, , ', s, t, r, i, n, g, ', , v, a, l, u, e]]]</attributes>
</expression>
<expression>
<try> [ 'okay', "this", i</try>
<terminal>
<try>[ 'okay', "this", is</try>
<fail/>
</terminal>
<identifier>
<try>[ 'okay', "this", is</try>
<fail/>
</identifier>
<expr_list>
<try> 'okay', "this", is_</try>
<expression>
<try> 'okay', "this", is_</try>
<terminal>
<try>'okay', "this", is_a</try>
<success>, "this", is_another</success>
<attributes>[[o, k, a, y]]</attributes>
</terminal>
<success>, "this", is_another</success>
<attributes>[[Terminal, [o, k, a, y]]]</attributes>
</expression>
<expression>
<try> "this", is_another,</try>
<terminal>
<try>"this", is_another, </try>
<success>, is_another, identi</success>
<attributes>[[t, h, i, s]]</attributes>
</terminal>
<success>, is_another, identi</success>
<attributes>[[Terminal, [t, h, i, s]]]</attributes>
</expression>
<expression>
<try> is_another, identif</try>
<terminal>
<try>is_another, identifi</try>
<fail/>
</terminal>
<identifier>
<try>is_another, identifi</try>
<success>, identifier, { ( "n</success>
<attributes>[[i, s, _, a, n, o, t, h, e, r]]</attributes>
</identifier>
<success>, identifier, { ( "n</success>
<attributes>[[Identifier, [i, s, _, a, n, o, t, h, e, r]]]</attributes>
</expression>
<expression>
<try> identifier, { ( "ne</try>
<terminal>
<try>identifier, { ( "nes</try>
<fail/>
</terminal>
<identifier>
<try>identifier, { ( "nes</try>
<success>, { ( "nested" ) } ]</success>
<attributes>[[i, d, e, n, t, i, f, i, e, r]]</attributes>
</identifier>
<success>, { ( "nested" ) } ]</success>
<attributes>[[Identifier, [i, d, e, n, t, i, f, i, e, r]]]</attributes>
</expression>
<expression>
<try> { ( "nested" ) } ] </try>
<terminal>
<try>{ ( "nested" ) } ] |</try>
<fail/>
</terminal>
<identifier>
<try>{ ( "nested" ) } ] |</try>
<fail/>
</identifier>
<expr_list>
<try> ( "nested" ) } ] | </try>
<expression>
<try> ( "nested" ) } ] | </try>
<terminal>
<try>( "nested" ) } ] | "</try>
<fail/>
</terminal>
<identifier>
<try>( "nested" ) } ] | "</try>
<fail/>
</identifier>
<expr_list>
<try> "nested" ) } ] | "d</try>
<expression>
<try> "nested" ) } ] | "d</try>
<terminal>
<try>"nested" ) } ] | "do</try>
<success> ) } ] | "done"</success>
<attributes>[[n, e, s, t, e, d]]</attributes>
</terminal>
<success> ) } ] | "done"</success>
<attributes>[[Terminal, [n, e, s, t, e, d]]]</attributes>
</expression>
<success> ) } ] | "done"</success>
<attributes>[[[Terminal, [n, e, s, t, e, d]]]]</attributes>
</expr_list>
<success> } ] | "done"</success>
<attributes>[[Compound, [(, , ", n, e, s, t, e, d, ", , )]]]</attributes>
</expression>
<success> } ] | "done"</success>
<attributes>[[[Compound, [(, , ", n, e, s, t, e, d, ", , )]]]]</attributes>
</expr_list>
<success> ] | "done"</success>
<attributes>[[Compound, [{, , (, , ", n, e, s, t, e, d, ", , ), , }]]]</attributes>
</expression>
<success> ] | "done"</success>
<attributes>[[[Terminal, [o, k, a, y]], [Terminal, [t, h, i, s]], [Identifier, [i, s, _, a, n, o, t, h, e, r]], [Identifier, [i, d, e, n, t, i, f, i, e, r]], [Compound, [{, , (, , ", n, e, s, t, e, d, ", , ), , }]]]]</attributes>
</expr_list>
<success> | "done"</success>
<attributes>[[Compound, [[, , ', o, k, a, y, ', ,, , ", t, h, i, s, ", ,, , i, s, _, a, n, o, t, h, e, r, ,, , i, d, e, n, t, i, f, i, e, r, ,, , {, , (, , ", n, e, s, t, e, d, ", , ), , }, , ]]]]</attributes>
</expression>
<expression>
<try> "done"</try>
<terminal>
<try>"done"</try>
<success></success>
<attributes>[[d, o, n, e]]</attributes>
</terminal>
<success></success>
<attributes>[[Terminal, [d, o, n, e]]]</attributes>
</expression>
<success></success>
<attributes>[[[Identifier, [s, o, m, e, _, v, a, r]], [Terminal, [a, , ', s, t, r, i, n, g, ', , v, a, l, u, e]], [Compound, [[, , ', o, k, a, y, ', ,, , ", t, h, i, s, ", ,, , i, s, _, a, n, o, t, h, e, r, ,, , i, d, e, n, t, i, f, i, e, r, ,, , {, , (, , ", n, e, s, t, e, d, ", , ), , }, , ]]], [Terminal, [d, o, n, e]]]]</attributes>
</expr_list>
<success></success>
<attributes>[[[a, s, s, i, g, n, e, d, 1], [[Identifier, [s, o, m, e, _, v, a, r]], [Terminal, [a, , ', s, t, r, i, n, g, ', , v, a, l, u, e]], [Compound, [[, , ', o, k, a, y, ', ,, , ", t, h, i, s, ", ,, , i, s, _, a, n, o, t, h, e, r, ,, , i, d, e, n, t, i, f, i, e, r, ,, , {, , (, , ", n, e, s, t, e, d, ", , ), , }, , ]]], [Terminal, [d, o, n, e]]]]]</attributes>
</rule>
<rule>
<try></try>
<identifier>
<try></try>
<fail/>
</identifier>
<fail/>
</rule>
<success></success>
<attributes>[[[[a, s, s, i, g, n, e, d, 1], [[Identifier, [s, o, m, e, _, v, a, r]], [Terminal, [a, , ', s, t, r, i, n, g, ', , v, a, l, u, e]], [Compound, [[, , ', o, k, a, y, ', ,, , ", t, h, i, s, ", ,, , i, s, _, a, n, o, t, h, e, r, ,, , i, d, e, n, t, i, f, i, e, r, ,, , {, , (, , ", n, e, s, t, e, d, ", , ), , }, , ]]], [Terminal, [d, o, n, e]]]]]]</attributes>
</rules>
success
target: assigned1
rhs: (Identifier some_var)
rhs: (Terminal a 'string' value)
rhs: (Compound [ 'okay', "this", is_another, identifier, { ( "nested" ) } ])
rhs: (Terminal done)
<rules>
<try>assigned2 = { a }</try>
<rule>
<try>assigned2 = { a }</try>
<identifier>
<try>assigned2 = { a }</try>
<success> = { a }</success>
<attributes>[[a, s, s, i, g, n, e, d, 2]]</attributes>
</identifier>
<expr_list>
<try> { a }</try>
<expression>
<try> { a }</try>
<terminal>
<try>{ a }</try>
<fail/>
</terminal>
<identifier>
<try>{ a }</try>
<fail/>
</identifier>
<expr_list>
<try> a }</try>
<expression>
<try> a }</try>
<terminal>
<try>a }</try>
<fail/>
</terminal>
<identifier>
<try>a }</try>
<success> }</success>
<attributes>[[a]]</attributes>
</identifier>
<success> }</success>
<attributes>[[Identifier, [a]]]</attributes>
</expression>
<success> }</success>
<attributes>[[[Identifier, [a]]]]</attributes>
</expr_list>
<success></success>
<attributes>[[Compound, [{, , a, , }]]]</attributes>
</expression>
<success></success>
<attributes>[[[Compound, [{, , a, , }]]]]</attributes>
</expr_list>
<success></success>
<attributes>[[[a, s, s, i, g, n, e, d, 2], [[Compound, [{, , a, , }]]]]]</attributes>
</rule>
<rule>
<try></try>
<identifier>
<try></try>
<fail/>
</identifier>
<fail/>
</rule>
<success></success>
<attributes>[[[[a, s, s, i, g, n, e, d, 2], [[Compound, [{, , a, , }]]]]]]</attributes>
</rules>
success
target: assigned2
rhs: (Compound { a })
输完了
代码中还有一些待办事项.实施它们需要付出更多的努力,但是我不确定这实际上是正确的方向,所以我将等待反馈:)
Lose Ends
There are some TODO's left in the code. Implementing them would require some more effort, but I'm not sure it's actually the right direction to go in, so I'll await feedback :)
最基本的TODO之一就是代表AST固有表达式的递归性质.现在,我通过插入嵌套的Compound表达式的源字符串来解决"问题.
One of the elementary TODO's left would be to represent the recursive nature of expressions in the AST proper. Right now, I "copped out" by just inserting the source string for a nested Compound expression.
这篇关于精神分析器段错误的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持!