Problem description
Consider the following program.
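import Control.Exception (evaluate)  -- evaluate is exported from Control.Exception
import Control.Parallel
import Control.Parallel.Strategies

main = do
  r <- evaluate (runEval test)
  print r

test = do
  a <- rpar (fib 36)
  b <- rpar (fib 37)
  return (a,b)

-- Deliberately naive, exponential-time Fibonacci.
fib :: Int -> Integer
fib 0 = 0
fib 1 = 1
fib n = fib (n-1) + fib (n-2)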
Note that we deliberately use an inefficient Fibonacci implementation so that the computations take some time. Now we compile this program with GHC:
$ ghc rpar.hs -threaded
The program takes 3.281s with the +RTS -N1 flag and 2.126s with the +RTS -N2 flag. Now we replace print r with print "hello" in the main function and compile the modified program in the same way. The new program takes 0.003s with +RTS -N1 and 0.004s with +RTS -N2. It seems that a and b in the test function are not computed in the new program.
Then we modify the test function in rpar/rseq style:
test = do
  a <- rpar (fib 36)
  b <- rseq (fib 37)
  return (a,b)
We conduct the same experiment on this program. (1) With print r in the main function, the program takes 3.283s and 2.138s for the +RTS -N1 and +RTS -N2 flags respectively. (2) With print "hello" in the main function, the program takes 1.956s and 2.025s for +RTS -N1 and +RTS -N2 respectively. Evidently, at least one of a or b in test is computed in this case.
There are two questions in this example:
(1) When are rpar and rseq expressions actually computed in a program? It seems that (fib 36) is not computed immediately when a <- rpar (fib 36) is evaluated.
(2) If a machine has enough CPU cores and we specify the +RTS -N2 flag when running the program, are the computations of a and b guaranteed to start simultaneously (or almost simultaneously)?
Recommended answer
The bigger question is: when are things computed in Haskell?
Since Haskell is a lazy language, a thing is computed only when it is demanded. Specifically, in your first program, if you never demand r (evaluate only evaluates it as far as the (,) constructor), the fib calls are never computed and the parallelism is only overhead.
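To make the point about evaluate concrete, here is a minimal sketch using only base. The name lazyPair is hypothetical and not part of the question; its components would crash if anything demanded them, yet the program finishes normally because evaluate stops at the pair constructor.

```haskell
import Control.Exception (evaluate)

-- Hypothetical example value: both components fail if they are ever demanded.
lazyPair :: (Int, Int)
lazyPair = (error "first component demanded", error "second component demanded")

main :: IO ()
main = do
  -- evaluate forces its argument only to weak head normal form,
  -- which for a pair is the (,) constructor; the components stay unevaluated.
  _ <- evaluate lazyPair
  putStrLn "pair constructor forced, components untouched"
```

The same thing happens to (a,b) in the first program: the tuple is built, but nothing ever asks for the fib results inside it.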
In your second program, rseq demands the result of the computation it wraps, and therefore that result is computed regardless of whether you print it or not.
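If the goal is for both results to be computed whether or not they are printed, one option is to spark both computations and then wait for each with rseq. The sketch below assumes the parallel package, as the question already does; testBoth is a hypothetical name, and this is one possible arrangement rather than the only one.

```haskell
import Control.Exception (evaluate)
import Control.Parallel.Strategies (rpar, rseq, runEval)

fib :: Int -> Integer
fib 0 = 0
fib 1 = 1
fib n = fib (n-1) + fib (n-2)

-- Hypothetical helper: spark both computations, then demand both results.
testBoth :: Int -> Int -> (Integer, Integer)
testBoth n m = runEval $ do
  a <- rpar (fib n)   -- create a spark for fib n
  b <- rpar (fib m)   -- create a spark for fib m
  _ <- rseq a         -- wait until a has been evaluated
  _ <- rseq b         -- wait until b has been evaluated
  return (a, b)

main :: IO ()
main = do
  _ <- evaluate (testBoth 36 37)
  print "hello"
```

Here both fib calls are forced before return, so the running time no longer depends on whether main prints r.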
Running the program with +RTS -N2 makes the runtime use 2 Haskell Execution Contexts (HECs). The computations a and b are added to a spark pool and are available for either HEC to compute. What happens next is up to the GHC runtime scheduler, so nothing is guaranteed, but if the computations are not too simple you can assume that each HEC will take one spark and compute it.
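As a rough way to observe this, the sketch below asks the runtime how many HECs it is using and how many sparks remain in the current HEC's pool right after sparking. It assumes the parallel package and a build with -threaded; the smaller fib arguments are chosen only to keep the run short. The spark count is not deterministic, since another HEC may already have taken a spark by the time it is read.

```haskell
import Control.Concurrent (getNumCapabilities)
import Control.Exception (evaluate)
import Control.Parallel.Strategies (rpar, runEval)
import GHC.Conc (numSparks)

fib :: Int -> Integer
fib 0 = 0
fib 1 = 1
fib n = fib (n-1) + fib (n-2)

main :: IO ()
main = do
  -- Number of HECs, as set by +RTS -N<k>.
  hecs <- getNumCapabilities
  putStrLn ("capabilities (HECs): " ++ show hecs)
  -- Spark two computations without demanding their results.
  _ <- evaluate $ runEval $ do
    a <- rpar (fib 30)
    b <- rpar (fib 31)
    return (a, b)
  -- Sparks still waiting in this HEC's local pool; the value varies between runs.
  pending <- numSparks
  putStrLn ("sparks still in the local pool: " ++ show pending)
```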
For more reading on the RTS, check out this collection of papers: https://ghc.haskell.org/trac/ghc/wiki/ReadingList#DataParallelHaskellandconcurrency (and the run time section), and possibly Simon Marlow's book at http://chimera.labs.oreilly.com/books/1230000000929