Problem description
I'm trying to figure out how to parse a string in this format into a tree-like data structure of arbitrary depth.
"{{Hello big|Hi|Hey} {world|earth}|{Goodbye|farewell} {planet|rock|globe{.|!}}}"
[[["Hello big" "Hi" "Hey"]
  ["world" "earth"]]
 [["Goodbye" "farewell"]
  ["planet" "rock" "globe" ["." "!"]]]]
I've tried playing with some regular expressions for this (such as #"{([^{}]*)}"), but everything I've tried seems to "flatten" the tree into one big list of lists. I could be approaching this from the wrong angle, or maybe a regex just isn't the right tool for the job.
Thanks for your help!
Recommended answer
Don't use regular expressions for this task. Braces nested to arbitrary depth are not a regular language, so a classical regular expression cannot keep track of the nesting. An easier method is to describe your string with a grammar (BNF or EBNF) and then write a parser that parses the string according to that grammar. A parser built from the grammar produces a parse tree, so you naturally end up with a tree structure.
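To make the flattening concrete, here is a small sketch in Python (an assumption for illustration; the thread does not name a language). It uses the standard `re` module with the braces escaped, which is assumed to be what the pattern in the question intends. Because `[^{}]*` cannot cross a brace, the pattern can only ever match the innermost groups, and the result is one flat list with no record of which group contained which.

```python
import re

text = "{{Hello big|Hi|Hey} {world|earth}|{Goodbye|farewell} {planet|rock|globe{.|!}}}"

# Match a "{", then any run of non-brace characters, then a "}".
# Only brace pairs with no braces inside them can satisfy this.
innermost = re.findall(r"\{([^{}]*)\}", text)

# Split each matched group on "|" to get its alternatives.
flattened = [group.split("|") for group in innermost]

print(flattened)
```

The outer groups, and the fact that `{.|!}` sits inside the `planet|rock|globe` group, are lost, which is the "big list of lists" the question describes.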
You can start with something like this:
element ::= element-type, { ["|"], element-type }
element-type ::= primitive | "{", element, "}"
primitive ::= symbol | word
symbol ::= "." | "!"
word ::= character, { character }
character ::= "a" | "b" | ... | "z"
Note: I wrote this up quickly, so it may not be completely correct. But it should give you an idea.
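As a rough illustration of the grammar-then-parser approach, here is a minimal recursive-descent sketch in Python (again an assumption; the thread does not name a language). It does not implement the grammar above literally. It assumes a slightly different one, in which `|` separates the alternatives of a group and adjacent items inside one alternative form a sequence:

    group       ::= "{", alternative, { "|", alternative }, "}"
    alternative ::= { text | group }
    text        ::= one or more characters other than "{", "|", "}"

The names `parse`, `parse_group`, and `parse_alternative` are hypothetical, and dropping whitespace-only text between groups is a guess based on the desired output in the question.

```python
def parse(text):
    """Parse a '{a|b {c|d}}' style string into nested lists."""
    pos = 0  # index of the next unread character

    def parse_group():
        # group ::= "{" alternative { "|" alternative } "}"
        nonlocal pos
        pos += 1  # skip the opening "{"
        alternatives = [parse_alternative()]
        while pos < len(text) and text[pos] == "|":
            pos += 1  # skip the "|"
            alternatives.append(parse_alternative())
        if pos >= len(text) or text[pos] != "}":
            raise ValueError(f"expected '}}' at position {pos}")
        pos += 1  # skip the closing "}"
        return alternatives

    def parse_alternative():
        # alternative ::= { text | group }
        nonlocal pos
        parts = []
        while pos < len(text) and text[pos] not in "|}":
            if text[pos] == "{":
                parts.append(parse_group())
            else:
                start = pos
                while pos < len(text) and text[pos] not in "{|}":
                    pos += 1
                chunk = text[start:pos].strip()
                if chunk:  # drop whitespace-only separators
                    parts.append(chunk)
        # A lone part stands for itself; several parts form a sequence.
        return parts[0] if len(parts) == 1 else parts

    tree = parse_alternative()
    if pos != len(text):
        raise ValueError(f"unexpected {text[pos]!r} at position {pos}")
    return tree


example = "{{Hello big|Hi|Hey} {world|earth}|{Goodbye|farewell} {planet|rock|globe{.|!}}}"
tree = parse(example)
print(tree)
```

Under these assumptions the result keeps the nesting, with one difference from the output requested in the question: `globe{.|!}` comes out as its own sub-list, `["globe", [".", "!"]]`, instead of being spliced into the `planet`/`rock` list. Plain lists also do not distinguish a group of alternatives from a sequence, so a fuller implementation would probably tag each node with its kind.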