本文介绍了在 Haskell 中实现语言解释器的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

我想在 Haskell 中实现一个命令式语言解释器(用于教育目的).但是我很难为我的解释器创建正确的架构:我应该如何存储变量?如何实现嵌套函数调用?我应该如何实现变量作用域?如何在我的语言中添加调试可能性?我应该使用 monads/monad 转换器/其他技术吗?等

I want to implement an imperative language interpreter in Haskell (for educational purposes). But it's difficult for me to create right architecture for my interpreter: How should I store variables? How can I implement nested function calls? How should I implement variable scoping? How can I add debugging possibilities in my language? Should I use monads/monad transformers/other techniques? etc.

有人知道关于这个主题的好文章/论文/教程/资源吗?

Does anybody know good articles/papers/tutorials/sources on this subject?

推荐答案

如果您是编写此类处理器的新手,我建议您暂时不要使用 monad,并首先专注于获得没有任何花哨的准系统实现或口哨声.

If you are new to writing this kind of processors, I would recommend to put off using monads for a while and first focus on getting a barebones implementation without any bells or whistles.

以下内容可以作为小教程.

The following may serve as a minitutorial.

我假设您已经解决了解析要为其编写解释器的程序的源文本的问题,并且您有一些类型可以捕获您语言的抽象语法.我在这里使用的语言非常简单,仅由整数表达式和一些基本语句组成.

I assume that you have already tackled the issue of parsing the source text of the programs you want to write an interpreter for and that you have some types for capturing the abstract syntax of your language. The language that I use here is very simple and only consists of integer expressions and some basic statements.

让我们先导入一些我们稍后会用到的模块.

Let us first import some modules that we will use in just a bit.

import Data.Function
import Data.List

命令式语言的本质在于它具有某种形式的可变变量.这里,变量简单地用字符串表示:

The essence of an imperative language is that it has some form of mutable variables. Here, variables simply represented by strings:

type Var = String

表达式

接下来,我们定义表达式.表达式由整数常量、变量引用和算术运算构成.

Expressions

Next, we define expressions. Expressions are constructed from integer constants, variable references, and arithmetic operations.

infixl 6 :+:, :-:
infixl 7 :*:, :/:

data Exp
  = C Int        -- constant
  | V Var        -- variable
  | Exp :+: Exp  -- addition
  | Exp :-: Exp  -- subtraction
  | Exp :*: Exp  -- multiplication
  | Exp :/: Exp  -- division

例如,将常量 2​​ 添加到变量 x 的表达式用 V "x" :+: C 2 表示.

For example, the expression that adds the constant 2 to the variable x is represented by V "x" :+: C 2.

语句语言相当少.我们有三种形式的语句:变量赋值、while 循环和序列.

The statement language is rather minimal. We have three forms of statements: variable assignments, while loops, and sequences.

infix 1 :=

data Stmt
  = Var := Exp      -- assignment
  | While Exp Stmt  -- loop
  | Seq [Stmt]      -- sequence

例如,用于交换"变量 xy 的值的语句序列可以用 Seq ["tmp" := 表示V "x", "x" := V "y", "y" := V "tmp"].

For example, a sequence of statements for "swapping" the values of the variables x and y can be represented by Seq ["tmp" := V "x", "x" := V "y", "y" := V "tmp"].

程序只是一个语句.

type Prog = Stmt

商店

现在,让我们转向实际的解释器.在运行程序时,我们需要跟踪分配给程序中不同变量的值.这些值只是整数,作为我们记忆"的表示,我们只使用由变量和值组成的对列表.

Stores

Now, let us move to the actual interpreter. While running a program, we need to keep track of the values assigned to the different variables in the programs. These values are just integers and as a representation of our "memory" we just use lists of pairs consisting of a variable and a value.

type Val = Int
type Store = [(Var, Val)]

评估表达式

通过将常量映射到它们的值、在存储中查找变量的值以及将算术运算映射到它们的 Haskell 对应项来评估表达式.

Evaluating expressions

Expressions are evaluated by mapping constants to their value, looking up the values of variables in the store, and mapping arithmetic operations to their Haskell counterparts.

eval :: Exp -> Store -> Val
eval (C n) r       = n
eval (V x) r       = case lookup x r of
                       Nothing -> error ("unbound variable `" ++ x ++ "'")
                       Just v  -> v
eval (e1 :+: e2) r = eval e1 r + eval e2 r
eval (e1 :-: e2) r = eval e1 r - eval e2 r
eval (e1 :*: e2) r = eval e1 r * eval e2 r
eval (e1 :/: e2) r = eval e1 r `div` eval e2 r

请注意,如果商店包含一个变量的多个绑定,lookup 会选择商店中最先出现的绑定.

Note that if the store contains multiple bindings for a variable, lookup selects the bindings that comes first in the store.

虽然表达式的计算不能改变存储的内容,但执行语句实际上可能会导致存储的更新.因此,执行语句的函数将存储作为参数并生成可能更新的存储.

While the evaluation of an expression cannot alter the contents of the store, executing a statement may in fact result in an update of the store. Hence, the function for executing a statement takes a store as an argument and produces a possibly updated store.

exec :: Stmt -> Store -> Store
exec (x := e) r                    = (x, eval e r) : r
exec (While e s) r | eval e r /= 0 = exec (Seq [s, While e s]) r
                   | otherwise     = r
exec (Seq []) r                    = r
exec (Seq (s : ss)) r              = exec (Seq ss) (exec s r)

请注意,在赋值的情况下,我们只是将更新变量的新绑定推送到存储中,从而有效地掩盖了该变量之前的所有绑定.

Note that, in the case of assignments, we simply push a new binding for the updated variable to the store, effectively shadowing any previous bindings for that variable.

运行程序简化为在初始存储的上下文中执行其顶级语句.

Running a program reduces to executing its top-level statement in the context of an initial store.

run :: Prog -> Store -> Store
run p r = nubBy ((==) `on` fst) (exec p r)

执行该语句后,我们会清除所有隐藏的绑定,以便我们可以轻松读取最终存储的内容.

After executing the statement we clean up any shadowed bindings, so that we can easily read off the contents of the final store.

例如,考虑以下程序计算存储在变量 n 中的数字的斐波那契数并将其结果存储在变量 x 中.

As an example, consider the following program that computes the Fibonacci number of the number stored in the variable n and stores its result in the variable x.

fib :: Prog
fib = Seq
  [ "x" := C 0
  , "y" := C 1
  , While (V "n") $ Seq
      [ "z" := V "x" :+: V "y"
      , "x" := V "y"
      , "y" := V "z"
      , "n" := V "n" :-: C 1
      ]
  ]

例如,在交互式环境中,我们现在可以使用我们的解释器来计算第 25 个斐波那契数:

For instance, in an interactive environment, we can now use our interpreter to compute the 25th Fibonacci number:

> lookup "x" $ run fib [("n", 25)]
Just 75025

一元解释

当然,在这里,我们正在处理一种非常简单且微小的命令式语言.随着您的语言变得更加复杂,您的解释器的实现也会变得更加复杂.例如,考虑在添加过程时需要添加哪些内容,并需要区分本地(基于堆栈)存储和全局(基于堆)存储.回到问题的那部分,然后您确实可以考虑引入 monad 以稍微简化解释器的实现.

Monadic Interpretation

Of course, here, we are dealing with a very simple and tiny imperative language. As your language gets more complex, so will the implementation of your interpreter. Think for example about what additions you need when you add procedures and need to distinguish between local (stack-based) storage and global (heap-based) storage. Returning to that part of your question, you may then indeed consider the introduction of monads to streamline the implementation of your interpreter a bit.

在上面的示例解释器中,有两个效果"可以被一元结构捕获:

In the example interpreter above, there are two "effects" that are candidates for being captured by a monadic structure:

  1. 商店的流通和更新.
  2. 遇到运行时错误时中止运行程序.(在上面的实现中,当发生此类错误时,解释器只会崩溃.)

第一个效果通常由状态 monad 捕获,第二个由错误 monad 捕获.让我们简要研究一下如何为我们的口译员做到这一点.

The first effect is typically captured by a state monad, the second by an error monad. Let us briefly investigate how to do this for our interpreter.

我们通过从标准库中再导入一个模块来准备.

We prepare by importing just one more module from the standard libraries.

import Control.Monad

我们可以使用 monad 转换器通过组合基本状态 monad 和基本错误 monad 来为我们的两个效果构建一个复合 monad.然而,在这里,我们只是一次构建复合单子.

We can use monad transformers to construct a composite monad for our two effects by combining a basic state monad and a basic error monad. Here, however, we simply construct the composite monad in one go.

newtype Interp a = Interp { runInterp :: Store -> Either String (a, Store) }

instance Monad Interp where
  return x = Interp $ 
 -> Right (x, r)
  i >>= k  = Interp $ 
 -> case runInterp i r of
               Left msg      -> Left msg
               Right (x, r') -> runInterp (k x) r'
  fail msg = Interp $ \_ -> Left msg

编辑 2018:Applicative Monad 提案

由于 Applicative Monad Proposal (AMP),每个 Monad 也必须是 Functor 和 Applicative 的实例.为此,我们可以添加

Since the Applicative Monad Proposal (AMP) every Monad must also be an instance of Functor and Applicative. To do this we can add

import Control.Applicative -- Otherwise you can't do the Applicative instance.

导入并像这样使 Interp 成为 Functor 和 Applicative 的实例

to the imports and make Interp an instance of Functor and Applicative like this

instance Functor Interp where
  fmap = liftM -- imported from Control.Monad

instance Applicative Interp where
  pure  = return
  (<*>) = ap -- imported from Control.Monad

编辑 2018 年底

对于store的读写,我们引入了有效的函数rdwr:

For reading from and writing to the store, we introduce effectful functions rd and wr:

rd :: Var -> Interp Val
rd x = Interp $ 
 -> case lookup x r of
         Nothing -> Left ("unbound variable `" ++ x ++ "'")
         Just v  -> Right (v, r)

wr :: Var -> Val -> Interp ()
wr x v = Interp $ 
 -> Right ((), (x, v) : r)

请注意,如果变量查找失败,rd 会生成一个 Left 包装的错误消息.

Note that rd produces a Left-wrapped error message if a variable lookup fails.

表达式求值器的 monadic 版本现在读取

The monadic version of the expression evaluator now reads

eval :: Exp -> Interp Val
eval (C n)       = do return n
eval (V x)       = do rd x
eval (e1 :+: e2) = do v1 <- eval e1
                      v2 <- eval e2
                      return (v1 + v2)
eval (e1 :-: e2) = do v1 <- eval e1
                      v2 <- eval e2
                      return (v1 - v2)
eval (e1 :*: e2) = do v1 <- eval e1
                      v2 <- eval e2
                      return (v1 * v2)
eval (e1 :/: e2) = do v1 <- eval e1
                      v2 <- eval e2
                      if v2 == 0
                        then fail "division by zero"
                        else return (v1 `div` v2)

对于 :/:,除以零会导致通过 Monad-method fail 产生错误消息,其中,对于 Interp,简化为将消息包装在 Left-value 中.

In the case for :/:, division by zero results in an error message being produced through the Monad-method fail, which, for Interp, reduces to wrapping the message in a Left-value.

对于语句的执行,我们有

For the execution of statements we have

exec :: Stmt -> Interp ()
exec (x := e)       = do v <- eval e
                         wr x v
exec (While e s)    = do v <- eval e
                         when (v /= 0) (exec (Seq [s, While e s]))
exec (Seq [])       = do return ()
exec (Seq (s : ss)) = do exec s
                         exec (Seq ss)

exec 的类型表明,语句不会产生值,而只会因为它们对存储的影响或它们可能触发的运行时错误而被执行.

The type of exec conveys that statements do not result in values but are executed only for their effects on the store or the run-time errors they may trigger.

最后,在函数 run 中,我们执行一元计算并处理其效果.

Finally, in the function run we perform a monadic computation and process its effects.

run :: Prog -> Store -> Either String Store
run p r = case runInterp (exec p) r of
            Left msg      -> Left msg
            Right (_, r') -> Right (nubBy ((==) `on` fst) r')

在交互环境中,我们现在可以重新审视我们的示例程序的解释:

In the interactive environment, we can now revisit the interpretation of our example program:

> lookup "x" `fmap` run fib [("n", 25)]
Right (Just 75025)

> lookup "x" `fmap` run fib []
Left "unbound variable `n'"

这篇关于在 Haskell 中实现语言解释器的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持!

09-03 04:53