Problem description
I'm currently looking into Clojure and Incanter as an alternative to R. (Not that I dislike R; it is just interesting to try out new languages.) I like Incanter and find the syntax appealing, but vectorized operations are quite slow compared to, e.g., R or Python.
As an example, I wanted to get the first-order difference of a vector using Incanter vector operations, Clojure map, and R. Below are the code and timings for all versions. As you can see, R is clearly faster.
Incanter and Clojure:
(use '(incanter core stats))
(def x (doall (sample-normal 1e7)))
(time (def y (doall (minus (rest x) (butlast x)))))
"Elapsed time: 16481.337 msecs"
(time (def y (doall (map - (rest x) (butlast x)))))
"Elapsed time: 16457.850 msecs"
R:
rdiff <- function(x){
  n = length(x)
  x[2:n] - x[1:(n-1)]}
x = rnorm(1e7)
system.time(rdiff(x))
   user  system elapsed
  1.504   0.900   2.561
So I was wondering: is there a way to speed up vector operations in Incanter/Clojure? Solutions involving loops, Java arrays, and/or libraries usable from Clojure are also welcome.
I have also posted this question to the Incanter Google group, with no responses so far.
UPDATE: I have marked Jouni's answer as accepted; see below for my own answer, where I have cleaned up his code a bit and added some benchmarks.
Solution
Here's a Java arrays implementation that is faster on my system than your R code (YMMV). Note three things: the reflection warnings are enabled, which is essential when optimizing for performance; the type hint on y is repeated (the one on the def didn't seem to help for the aset); and everything is cast to primitive double values (the dotimes makes sure that i is a primitive int).
(set! *warn-on-reflection* true)
(use 'incanter.stats)
(def ^"[D" x (double-array (sample-normal 1e7)))
(time
 (do
   (def ^"[D" y (double-array (dec (count x))))
   (dotimes [i (dec (count x))]
     (aset ^"[D" y
           i
           (double (- (double (aget x (inc i)))
                      (double (aget x i))))))))
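The same idea can be wrapped in a reusable function. What follows is only a minimal sketch, not part of the code above: the name first-diff is hypothetical, and it uses the ^doubles hint, which is an alias for ^"[D". Binding the arrays as hinted locals rather than through def should let the compiler resolve aget and aset without reflection, so the explicit double casts are left out here. It needs only clojure.core, so it can be tried without Incanter.

```clojure
(set! *warn-on-reflection* true)

;; first-diff is a hypothetical helper name, not an Incanter or clojure.core function.
(defn first-diff
  "Returns a new double array holding xs[i+1] - xs[i] for each adjacent pair."
  ^doubles [^doubles xs]
  (let [n (max 0 (dec (alength xs)))   ; empty and one-element inputs give an empty result
        ^doubles out (double-array n)]
    (dotimes [i n]
      ;; both arrays are hinted locals, so this compiles to primitive array access
      (aset out i (- (aget xs (inc i)) (aget xs i))))
    out))

;; Small check on a hand-computable input.
(vec (first-diff (double-array [1.0 4.0 9.0 16.0])))
;; => [3.0 5.0 7.0]
```

To time it on the data from the question, something like (time (first-diff (double-array (sample-normal 1e7)))) should work, keeping in mind that this also measures the conversion to a double array.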