问题描述
Haskell(使用GHC
编译器)是比您预期的要快 .正确使用它可以接近低级语言. (Haskellers最喜欢做的事情是尝试获得5%的C(甚至击败它),但这意味着您使用的是效率低下的C程序,因为GHC将Haskell编译为C.)我的问题是,为什么?
Haskell (with the GHC
compiler) is a lot faster than you'd expect. Used correctly, it can get close-ish to low-level languages. (A favorite thing for Haskellers to do is to try and get within 5% of C (or even beat it, but that means you are using an inefficient C program, since GHC compiles Haskell to C).) My question is, why?
Haskell是声明性的,并且基于lambda演算.机器体系结构显然是必不可少的,它大致基于图灵机.实际上,Haskell甚至没有特定的评估顺序.另外,不用处理机器数据类型,而是始终创建代数数据类型.
Haskell is declarative and based on lambda calculus. Machine architectures are clearly imperative, being based on turing machines, roughly. Indeed, Haskell doesn't even have a specific evaluation order. Also, instead of dealing with machine data types, you make algebraic data types all the time.
最奇怪的是高阶函数.您可能会认为,动态创建函数并将其扔掉会使程序变慢. 但使用实际上,更高阶的函数可以使Haskell更快.的确,为了优化Haskell代码,您需要使其更优雅,更抽象,而不是像机器一样.如果不能改善Haskell的更高级功能,它们似乎都不会影响.
Weirdest of all though is higher order functions. You would think that creating functions on the fly, and throwing them around, would make a program slower. But using higher order functions actually makes Haskell faster. Indeed, it seems that, to optimize Haskell code, you need to make it more elegant and abstract instead of more machine-like. None of Haskell's more advanced features seem to even affect its performance, if they don't improve it.
很抱歉,这听起来是否很可靠,但这是我的问题:考虑到Haskell(与GHC一起编译)为什么如此之快,考虑到它的抽象性质和与物理机器的区别,为什么?
Sorry if this is sounding ranty, but here is my question: Why is Haskell (compiled with GHC) so fast, considering its abstract nature and differences from physical machines?
注意:我说C和其他命令式语言在某种程度上类似于Turing Machines(但在某种程度上来说,Haskell与Lambda微积分相似)的原因是,在命令式语言中,您具有有限数量的状态(又称状态")行号)以及磁带(滑枕),以便状态和当前磁带确定对磁带执行的操作.请参阅Wikipedia条目图灵机等效物,以了解从图灵机到计算机的过渡.
Note: The reason I say C and other imperative languages are somewhat similar to Turing Machines (but not to the extent that Haskell is similar to Lambda Calculus) is that in an imperative language, you have a finite number of states (a.k.a. line number), along with a Tape (the ram), such that the state and the current tape determine what to do to the tape. See the Wikipedia entry, Turing machine equivalents, for the transition from Turing Machines to computers.
推荐答案
我同意Dietrich Epp的观点:它是几件事的组合,可以使GHC更快.
I agree with Dietrich Epp: it's a combination of several things that make GHC fast.
首先,Haskell是非常高级的.这样,编译器就可以执行积极的优化而不会破坏您的代码.
First and foremost, Haskell is very high-level. This enables the compiler to perform aggressive optimisations without breaking your code.
考虑一下SQL.现在,当我编写SELECT
语句时,它可能看起来像一个命令式循环,但不是.看起来 会遍历该表中的所有行,以查找与指定条件匹配的行,但实际上 编译器"(数据库引擎)可能是而是进行索引查找—具有完全不同的性能特征.但是由于SQL如此高级,编译器"可以替代完全不同的算法,透明地应用多个处理器或I/O通道或整个服务器,等等.
Think about SQL. Now, when I write a SELECT
statement, it might look like an imperative loop, but it isn't. It might look like it loops over all rows in that table trying to find the one that matches the specified conditions, but actually the "compiler" (the DB engine) could be doing an index lookup instead — which has completely different performance characteristics. But because SQL is so high-level, the "compiler" can substitute totally different algorithms, apply multiple processors or I/O channels or entire servers transparently, and more.
我认为Haskell是一样的.您可能认为,您只是要求Haskell将输入列表映射到第二个列表,将第二个列表过滤到第三个列表,然后计算产生的项目数.但是您看不到GHC在后台应用流融合重写规则,而是将整个过程转换成一个紧密的机器代码循环,该循环可以在没有分配的情况下一次通过数据来完成整个工作-那种用手工编写会很乏味,容易出错且无法维护的东西.因为代码中缺少底层细节,所以这真的是可能的.
I think of Haskell as being the same. You might think you just asked Haskell to map the input list to a second list, filter the second list into a third list, and then count how many items resulted. But you didn't see GHC apply stream-fusion rewrite rules behind the scenes, transforming the entire thing into a single tight machine code loop that does the whole job in a single pass over the data with no allocation — the kind of thing that would be tedious, error-prone and non-maintainable to write by hand. That's only really possible because of the lack of low-level details in the code.
另一种看待它的方式可能是…为什么不应该 Haskell快?它会做什么使它变慢?
Another way to look at it might be… why shouldn't Haskell be fast? What does it do that should make it slow?
它不是Perl或JavaScript这样的解释性语言.它甚至不是像Java或C#这样的虚拟机系统.它一直编译到本机代码,因此没有任何开销.
It's not an interpreted language like Perl or JavaScript. It's not even a virtual machine system like Java or C#. It compiles all the way down to native machine code, so no overhead there.
与OO语言[Java,C#,JavaScript…]不同,Haskell具有完整类型擦除(例如C,C ++,Pascal…].所有类型检查仅在编译时进行.因此,也没有运行时类型检查来减慢您的速度. (对此,没有空指针检查.例如,在Java中,JVM必须检查空指针并在引用时引用一个异常.Haskell不必为此进行检查.)
Unlike OO languages [Java, C#, JavaScript…], Haskell has full type erasure [like C, C++, Pascal…]. All type checking happens at compile-time only. So there's no run-time type-checking to slow you down either. (No null-pointer checks, for that matter. In, say, Java, the JVM must check for null pointers and throw an exception if you deference one. Haskell doesn't have to bother with that check.)
您说在运行时动态创建函数"听起来很慢,但是如果仔细看,实际上并没有这样做.您可能会看起来像,但事实并非如此.如果您说(+5)
,那么这已硬编码到您的源代码中.它不能在运行时更改.因此,它并不是真正的动态功能.即使是咖喱函数也实际上只是将参数保存到数据块中.实际上,所有可执行代码都在编译时存在.没有运行时解释. (与其他具有评估函数"的语言不同.)
You say it sounds slow to "create functions on the fly at run-time", but if you look very carefully, you don't actually do that. It might look like you do, but you don't. If you say (+5)
, well, that's hard-coded into your source code. It cannot change at run-time. So it's not really a dynamic function. Even curried functions are really just saving parameters into a data block. All the executable code actually exists at compile-time; there is no run-time interpretation. (Unlike some other languages that have an "eval function".)
想想Pascal.它很旧,没有人真正使用它,但是没有人会抱怨Pascal 慢.关于它,有很多令人不快的事情,但慢速并不是其中之一. Haskell实际上并没有做与Pascal不同的事情,除了具有垃圾回收而不是手动内存管理.不变的数据允许对GC引擎进行多项优化(这种惰性评估随后会使情况有些复杂).
Think about Pascal. It's old and nobody really uses it any more, but nobody would complain that Pascal is slow. There are plenty of things to dislike about it, but slowness is not really one of them. Haskell isn't really doing that much that's different to Pascal, other than having garbage collection rather than manual memory management. And immutable data allows several optimisations to the GC engine [which lazy evaluation then complicates somewhat].
我认为问题是Haskell 看起来先进,精巧和高级,每个人都认为哦,哇,这真的很强大,它一定很慢! >但事实并非如此.或者至少,它不是您期望的那样.是的,它有一个了不起的类型系统.但是你知道吗?这一切都是在编译时发生的.在运行时,它已经消失了.是的,它允许您使用一行代码来构造复杂的ADT.但是你知道吗? ADT只是struct
的普通普通C union
.没什么.
I think the thing is that Haskell looks advanced and sophisticated and high-level, and everybody thinks "oh wow, this is really powerful, it must be amazingly slow!" But it isn't. Or at least, it isn't in the way you'd expect. Yes, it's got an amazing type system. But you know what? That all happens at compile-time. By run-time, it's gone. Yes, it allows you to construct complicated ADTs with a line of code. But you know what? An ADT is just a plain ordinary C union
of struct
s. Nothing more.
真正的杀手is是懒惰的评价.当您正确地执行了代码的严格性/惰性时,您可以编写出愚蠢而又快速又优雅又美丽的代码.但是,如果您弄错了这些东西,您的程序就会变慢数千倍,而这种情况发生的原因真的不是很明显.
The real killer is lazy evaluation. When you get the strictness / laziness of your code right, you can write stupidly fast code that is still elegant and beautiful. But if you get this stuff wrong, your program goes thousands of times slower, and it's really non-obvious why this is happening.
例如,我编写了一个简单的小程序来计算每个字节出现在文件中的次数.对于25KB的输入文件,该程序运行了 20分钟,并吞没了 6 GB 的RAM!太荒谬了!但是后来我意识到了问题所在,添加了一个爆炸样式,运行时间降至 0.02秒.
For example, I wrote a trivial little program to count how many times each byte appears in a file. For a 25KB input file, the program took 20 minutes to run and swallowed 6 gigabytes of RAM! That's absurd!! But then I realized what the problem was, added a single bang-pattern, and the run-time dropped to 0.02 seconds.
这是Haskell出人意料地缓慢前进的地方.而且要花一些时间才能习惯它.但是随着时间的流逝,编写真正快速的代码变得更加容易.
This is where Haskell goes unexpectedly slowly. And it sure takes a while to get used to it. But over time, it gets easier to write really fast code.
是什么让Haskell如此之快?纯度.静态类型.懒惰.但最重要的是,它具有足够高的层次,可以使编译器从根本上改变实现而不会破坏代码的期望.
What makes Haskell so fast? Purity. Static types. Laziness. But above all, being sufficiently high-level that the compiler can radically change the implementation without breaking your code's expectations.
但是我想那只是我的看法...
But I guess that's just my opinion...
这篇关于为什么Haskell(GHC)这么快?的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持!