In Haskell, you can throw an exception from purely functional code, but you can only catch exceptions in IO code.
- Why?
- Can you catch in other contexts or only the IO monad?
- How do other purely functional languages handle it?
Because throwing an exception inside a function doesn't make that function's result dependent on anything but the argument values and the definition of the function; the function remains pure. OTOH catching an exception inside a function does (or at least can) make that function no longer a pure function.
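The asymmetry is visible in the types of the standard Control.Exception functions: throw produces a value of any type, so it can appear in pure code, while catch takes and returns IO actions. A minimal sketch (risky and handler are made-up names for illustration; evaluate forces the pure value inside IO so the exception surfaces where a handler can see it):

```haskell
import Control.Exception

-- Throwing is pure: risky's result still depends only on its argument.
risky :: Int -> Int
risky 0 = throw DivideByZero
risky n = 100 `div` n

-- Catching requires IO: catch :: Exception e => IO a -> (e -> IO a) -> IO a
handler :: ArithException -> IO Int
handler e = do
  putStrLn ("caught: " ++ show e)  -- prints "caught: divide by zero"
  return (-1)

main :: IO ()
main = do
  -- evaluate forces the pure value to WHNF inside IO, so the exception
  -- is raised here, where the handler can observe it.
  r <- evaluate (risky 0) `catch` handler
  print r  -- -1
```

There is no pure counterpart with a type like a -> (e -> a) -> a; as the rest of the answer explains, such a function could not be pure.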
I'm going to look at two kinds of exception. The first is nondeterministic; such exceptions arise unpredictably at runtime, and include things like out of memory errors. The existence of these exceptions is not included in the meaning of the functions that might generate them. They're just an unpleasant fact of life we have to deal with because we have actual physical machines in the real world, which don't always match up to the abstractions we're using to help us program them.
If a function throws such an exception, it means that that one particular attempt to evaluate the function failed to produce a value. It doesn't necessarily mean that the function's result is undefined (on the arguments it was invoked on this time), but the system was unable to produce the result.
If you could catch such an exception within a pure caller, you could do things like have a function that returns one (non-bottom) value when a sub-computation completes successfully, and another when it runs out of memory. This doesn't make sense as a pure function; the value computed by a function call should be uniquely determined by the values of its arguments and the definition of the function. Being able to return something different depending on whether the sub-computation ran out of memory makes the return value dependent on something else (how much memory is available on the physical machine, what other programs are running, the operating system and its policies, etc etc); by definition a function which can behave this way is not pure and can't (normally) exist in Haskell.
Because of purely operational failures, we do have to allow that evaluating a function may produce bottom instead of the value it "should" have produced. That doesn't completely ruin our semantic interpretation of Haskell programs, because we know the bottom will cause all the callers to produce bottom as well (unless they didn't need the value that was supposed to be computed, but in that case non-strict evaluation implies that the system never would have tried to evaluate this function and failed). That sounds bad, but once we get out to a computation in the IO
monad, we can safely catch such exceptions. Values in the IO
monad are allowed to depend on things "outside" the program; in fact they can change their value dependent on anything in the world (this is why one common interpretation of IO
values is that they are as if they were passed a representation of the entire universe). So it's perfectly okay for an IO
value to have one result if a pure sub-computation runs out of memory and another result if it doesn't.
But what about deterministic exceptions? Here I'm talking about exceptions that are always thrown when evaluating a particular function on a particular set of arguments. Such exceptions include divide-by-zero errors, as well as any exception explicitly thrown from a pure function (since its result can only depend on its arguments and its definition, if it evaluates to a throw once it will always evaluate to the same throw for the same arguments[1]).
It might seem like this class of exceptions should be catchable in pure code. After all, the value of 1 / 0
just is a divide-by-zero error. If a function can have a different result depending on whether a sub-computation evaluates to a divide-by-zero error by checking whether it's passing in a zero, why can't it do this by checking whether the result is a divide-by-zero error?
Here we get back to the point larsmans made in a comment. If a pure function can observe which exception it gets from throw ex1 + throw ex2
, then its result becomes dependent on the order of execution. But that's up to the runtime system, and it could conceivably even change between two different executions of the same system. Maybe we've got some advanced auto-parallelising implementation which tries different parallelisation strategies on each execution in order to try to converge on the best strategy over multiple runs. This would make the result of the exception-catching function depend on the strategy being used, the number of CPUs in the machine, the load on the machine, the operating system and its scheduling policies, etc etc.
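GHC's documentation calls these "imprecise exceptions": which exception you observe from an expression like throw ex1 + throw ex2 is deliberately left unspecified, and only IO code may look. A hedged sketch (ambiguous is a made-up name; the comment deliberately does not promise which message appears):

```haskell
import Control.Exception

-- Two distinct exceptions racing inside one pure expression. Which one a
-- handler sees depends on the evaluation order the runtime happens to pick,
-- so the language only lets us observe it from IO.
ambiguous :: Int
ambiguous = throw (ErrorCall "ex1") + throw (ErrorCall "ex2")

main :: IO ()
main = do
  r <- try (evaluate ambiguous) :: IO (Either ErrorCall Int)
  case r of
    Left (ErrorCall msg) -> putStrLn ("observed " ++ msg)  -- "ex1" or "ex2"
    Right n              -> print n                        -- never reached
```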
Again, the definition of a pure function is that only information which comes into a function through its arguments (and its definition) should affect its result. In the case of non-IO
functions, the information affecting which exception gets thrown doesn't come into the function through its arguments or definition, so it can't have an effect on the result. But computations in the IO
monad implicitly are allowed to depend on any detail of the entire universe, so catching such exceptions is fine there.
As for your second dot point, no, other monads wouldn't work for catching exceptions. All the same arguments apply; computations producing Maybe x
or [y]
aren't supposed to depend on anything outside their arguments, and catching any kind of exception "leaks" all sorts of details about things which aren't included in those functions' arguments.
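Note the contrast with failures that are part of the value: Maybe and Either can model failure purely, precisely because the failure lives in the result itself rather than leaking in from the runtime. A sketch (safeDiv and recover are illustrative names, not library functions):

```haskell
-- Pure, "catchable" failure: encode it in the return type instead of throwing.
safeDiv :: Int -> Int -> Either String Int
safeDiv _ 0 = Left "divide by zero"
safeDiv x y = Right (x `div` y)

-- Recovery is an ordinary pattern match. Nothing about the machine or the
-- scheduler can leak in, because the failure is part of the value itself.
recover :: Either String Int -> Int
recover (Left _)  = -1
recover (Right n) = n

main :: IO ()
main = do
  print (recover (safeDiv 10 2))  -- 5
  print (recover (safeDiv 10 0))  -- -1
```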
Remember, there's nothing particularly special about monads. They don't work any differently than other parts of Haskell. The monad typeclass is defined in ordinary Haskell code, as are almost all monad implementations. All the same rules that apply to ordinary Haskell code apply to all monads. It's IO
itself that is special, not the fact that it's a monad.
As for how other pure languages handle exception catching, the only other language with enforced purity that I have experience with is Mercury.[2] Mercury does it a little differently from Haskell, and you can catch exceptions in pure code.
Mercury is a logic programming language, so rather than being built on functions Mercury programs are built from predicates; a call to a predicate can have zero, one, or more solutions (if you're familiar with programming in the list monad to get nondeterminism, it's a little bit like the entire language is in the list monad). Operationally, Mercury execution uses backtracking to recursively enumerate all possible solutions to a predicate, but the semantics of a nondeterministic predicate is that it simply has a set of solutions for each set of its input arguments, as opposed to a Haskell function which calculates a single result value for each set of its input arguments. Like Haskell, Mercury is pure (including I/O, though it uses a slightly different mechanism), so each call to a predicate must uniquely determine a single solution set, which depends only on the arguments and the definition of the predicate.
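The list-monad analogy from the paragraph above can be sketched in Haskell (a loose model only: real Mercury uses modes and backtracking, not lists, and divisors is a made-up example):

```haskell
-- A "predicate" with zero, one, or more solutions, modeled in the list
-- monad. Each call denotes its whole solution set, like a Mercury predicate.
divisors :: Int -> [Int]
divisors n = do
  d <- [1 .. n]             -- nondeterministic choice (a backtracking point)
  True <- [n `mod` d == 0]  -- a failing match prunes, like a failed goal
  return d

main :: IO ()
main = do
  print (divisors 12)  -- [1,2,3,4,6,12]  ("multi": several solutions)
  print (divisors 0)   -- []              (no solutions at all)
```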
Mercury tracks the "determinism" of each predicate. Predicates which always result in exactly one solution are called det
(short for deterministic). Those which generate at least one solution are called multi
. There are a few other determinism classes as well, but they're not relevant here.
Catching an exception with a try
block (or by explicitly calling the higher-order predicates which implement it) has determinism cc_multi
. The cc stands for "committed choice". It means "this computation has at least one solution, and operationally the program is only going to get one of them". This is because running the sub-computation and seeing whether it produced an exception has a solution set which is the union of the sub-computation's "normal" solutions plus the set of all possible exceptions it could throw. Since "all possible exceptions" includes every possible runtime failure, most of which will never actually happen, this solution set can't be fully realised. There's no possible way the execution engine could actually backtrack through every possible solution to the try
block, so instead it just gives you a solution (either a normal one, or an indication that all possibilities were explored and there was no solution or exception, or the first exception that happened to arise).
Because the compiler keeps track of the determinism, it will not allow you to call try
in a context where the complete solution set matters. You can't use it to generate all solutions which don't encounter an exception, for example, because the compiler will complain that it needs all solutions to a cc_multi
call, which is only going to produce one. However you also can't call it from a det
predicate, because the compiler will complain that a det
predicate (which is supposed to have exactly one solution) is making a cc_multi
call, which will have multiple solutions (we're just only going to know what one of them is).
So how on earth is this useful? Well, you can have main
(and other things it calls, if that's useful) declared as cc_multi
, and they can call try
with no problems. This means that the entire program has multiple "solutions" in theory, but running it will generate a solution. This allows you to write a program that behaves differently when it happens to run out of memory at some point. But it doesn't spoil the declarative semantics because the "real" result it would have computed with more memory available is still in the solution set (just as the out-of-memory exception is still in the solution set when the program actually does compute a value), it's just that we only end up with one arbitrary solution.
It's important that det
(there is exactly one solution) is treated differently from cc_multi
(there are multiple solutions, but you can only have one of them). Similarly to the reasoning about catching exceptions in Haskell, exception catching can't be allowed to happen in a non-"committed choice" context, or you could get pure predicates producing different solution sets depending on information from the real world that they shouldn't have access to. The cc_multi
determinism of try
allows us to write programs as if they produced an infinite solution set (mostly full of minor variants of unlikely exceptions), and prevents us from writing programs that actually need more than one solution from the set.[3]
[1] Unless evaluating it encounters a nondeterministic error first. Real life's a pain.
[2] Languages which merely encourage the programmer to use purity without enforcing it (such as Scala) tend to just let you catch exceptions wherever you want, same as they allow you to do I/O wherever you want.
[3] Note that the "committed choice" concept is not how Mercury handles pure I/O. For that, Mercury uses unique types, which is orthogonal to the "committed choice" determinism class.