Could someone post a small example of IndentParser usage? I want to parse YAML-like input such as the following:
fruits:
    apples: yummy
    watermelons: not so yummy
vegetables:
    carrots: are orange
    celery raw: good for the jaw
I know there is a YAML package. I would like to learn how to use IndentParser.
Best answer
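I've sketched a parser below. For your problem you probably only need the block parser from IndentParser. Note that I haven't tried to run it, so it may contain elementary errors.
The biggest problem for your parser is not really the indentation, but that you only have strings and the colon as tokens. You may find the code below takes quite a bit of debugging, because it has to be very careful not to consume too much input, though I have tried to be careful about left-factoring. Because you only have two tokens, there is not much benefit to be had from Parsec's Token module.
Note that there is a strange truth about parsing: simple-looking formats are often not simple to parse. For learning purposes, writing a parser for simple expressions will teach you much more than a more-or-less arbitrary text format (which might only cause you frustration).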
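As a side illustration of the left-factoring point above, here is a minimal sketch in plain Parsec (not IndentParser). Both kinds of line share the prefix `key:`, so the key is parsed once and the decision about what follows is made afterwards. The names `key` and `entry` are hypothetical and exist only for this example; it assumes keys consist of letters and single spaces.

```haskell
import Text.Parsec
import Text.Parsec.String (Parser)

-- Shared prefix: letters and spaces, terminated by ':'.
-- char ' ' is used instead of space so that a newline is never
-- swallowed as part of a key.
key :: Parser String
key = manyTill (letter <|> char ' ') (char ':')

-- Left-factored entry: parse the key once, skip trailing spaces and
-- tabs, then check whether a value follows on the same line.
entry :: Parser (String, Maybe String)
entry = do
  k <- key
  _ <- many (oneOf " \t")
  v <- optionMaybe (many1 (noneOf "\n"))
  return (k, v)

main :: IO ()
main = do
  print (parse entry "" "apples: yummy")  -- Right ("apples",Just "yummy")
  print (parse entry "" "fruits:")        -- Right ("fruits",Nothing)
```

Because `many1 (noneOf "\n")` fails without consuming input when the key is followed directly by a newline, the alternative branch can be tried without wrapping anything in `try`.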
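-- Module names as published by the parsec and IndentParser packages;
-- check them against the versions you have installed.
import Text.ParserCombinators.Parsec
import Text.ParserCombinators.Parsec.IndentParser

data DefinitionTree = Nested String [DefinitionTree]
                    | Def String String
  deriving (Show)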
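
-- Note - this might need some testing.
--
-- Parsec has no manyTill1, so it is defined here: like manyTill,
-- but it requires at least one item before the terminator.
--
manyTill1 p end = do
    { x  <- p
    ; xs <- manyTill p end
    ; return (x:xs)
    }

-- This is a tricky one, the parser has to parse trailing
-- spaces and tabs but not a new line.
--
category :: IndentCharParser st String
category = do
    { a <- body
    ; rest
    ; return a
    }
  where
    -- char ' ' rather than space: Parsec's space also accepts newlines.
    body = manyTill1 (letter <|> char ' ') (char ':')
    rest = many (oneOf [' ', '\t'])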

-- Because the DefinitionTree data type has two quite
-- different constructors, both sharing the same prefix
-- 'category' this combinator is a bit more complicated
-- than usual, and has to use an Either type to discriminate
-- between the options.
--
definition :: IndentCharParser st DefinitionTree
definition = do
    { a <- category
    ; b <- (textL <|> definitionsR)
    ; case b of
        Left ss  -> return (Def a ss)
        Right ds -> return (Nested a ds)
    }
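
-- Note this should parse a string *provided* it is on
-- the same line as the category.
--
-- However you might find this assumption needs verifying...
--
-- noneOf "\n" makes this fail without consuming input when the
-- category is followed directly by a newline, so that (<|>) in
-- definition can still try definitionsR.
--
textL :: IndentCharParser st (Either String a)
textL = do
    { ss <- manyTill1 (noneOf "\n") newline
    ; return (Left ss)
    }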

-- Finally this one uses an indent parser.
--
definitionsR :: IndentCharParser st (Either a [DefinitionTree])
definitionsR = block body
  where
    body = do { a <- many1 definition; return (Right a) }
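
For comparison, here is a self-contained sketch that parses the sample input with plain Parsec only. It does not use IndentParser's block; instead it swaps in a simpler technique, counting leading spaces explicitly and parsing children at whatever deeper indentation the next line has. All helper names (`hspace`, `blankLine`, `lineEnd`, `indent`, `entriesAt`, `entry`, `document`) are hypothetical, and it assumes indentation is made of spaces only and keys contain just letters and spaces.

```haskell
import Text.Parsec
import Text.Parsec.String (Parser)

data DefinitionTree
  = Nested String [DefinitionTree]
  | Def String String
  deriving (Show, Eq)

-- Horizontal whitespace only; newlines are significant here.
hspace :: Parser ()
hspace = skipMany (oneOf " \t")

-- A line holding nothing but spaces or tabs.
blankLine :: Parser ()
blankLine = try (hspace *> newline *> pure ())

-- End of the current line, or end of input.
lineEnd :: Parser ()
lineEnd = (newline *> pure ()) <|> eof

-- Skip blank lines, then require exactly n spaces of indentation.
indent :: Int -> Parser ()
indent n = skipMany blankLine *> count n (char ' ') *> notFollowedBy (char ' ')

-- All consecutive entries indented by exactly n spaces.
entriesAt :: Int -> Parser [DefinitionTree]
entriesAt n = many (try (indent n) *> entry n)

-- One entry: the shared "key:" prefix, then either a value on the
-- same line or a nested group on the following, deeper lines.
entry :: Int -> Parser DefinitionTree
entry n = do
  k <- many1 (letter <|> char ' ') <* char ':'
  hspace
  value k <|> nested k
  where
    value k = Def k <$> many1 (noneOf "\n") <* lineEnd
    nested k = do
      lineEnd
      -- Peek at the indentation of the next non-blank line.
      m <- lookAhead (skipMany blankLine *> (length <$> many (char ' ')))
      if m > n then Nested k <$> entriesAt m else pure (Nested k [])

document :: Parser [DefinitionTree]
document = entriesAt 0 <* skipMany blankLine <* eof

sample :: String
sample = unlines
  [ "fruits:"
  , "    apples: yummy"
  , "    watermelons: not so yummy"
  , ""
  , "vegetables:"
  , "    carrots: are orange"
  , "    celery raw: good for the jaw"
  ]

main :: IO ()
main = print (parse document "sample" sample)
```

On the sample this should produce two Nested nodes, "fruits" and "vegetables", each holding two Def children. The explicit `try (indent n)` is what keeps a dedented line from being consumed by the inner group, which is roughly the job the block combinator is meant to take over in the IndentParser version.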