Problem description
I have been writing an R function that computes an integral with respect to certain distributions; see the code below.
EVofPsi = function(psi, probabilityMeasure, eps=0.01, ...){
  distFun = function(u){
    probabilityMeasure(u, ...)
  }
  xx = yy = seq(0,1,length=1/eps+1)
  summand=0
  for(i in 1:(length(xx)-1)){
    for(j in 1:(length(yy)-1)){
      signPlus = distFun(c(xx[i+1],yy[j+1]))+distFun(c(xx[i],yy[j]))
      signMinus = distFun(c(xx[i+1],yy[j]))+distFun(c(xx[i],yy[j+1]))
      summand = c(summand, psi(c(xx[i],yy[j]))*(signPlus-signMinus))
    }
  }
  sum(summand)
}
It works fine, but it is pretty slow. One often hears that rewriting a function in a compiled language such as C++ speeds it up, especially when the R code involves a double loop as above. So I did that, using Rcpp:
#include <Rcpp.h>
using namespace Rcpp;

// [[Rcpp::export]]
double EVofPsiCPP(Function distFun, Function psi, int n, double eps) {
  NumericVector xx(n+1);
  NumericVector yy(n+1);
  xx[0] = 0;
  yy[0] = 0;
  // discretize [0,1]^2
  for(int i = 1; i < n+1; i++) {
    xx[i] = xx[i-1] + eps;
    yy[i] = yy[i-1] + eps;
  }
  Function psiCPP(psi);
  Function distFunCPP(distFun);
  double signPlus;
  double signMinus;
  double summand = 0;
  NumericVector topRight(2);
  NumericVector bottomLeft(2);
  NumericVector bottomRight(2);
  NumericVector topLeft(2);
  // compute the integral
  for(int i=0; i<n; i++){
    //printf("i:%d \n",i);
    for(int j=0; j<n; j++){
      //printf("j:%d \n",j);
      topRight[0] = xx[i+1];
      topRight[1] = yy[j+1];
      bottomLeft[0] = xx[i];
      bottomLeft[1] = yy[j];
      bottomRight[0] = xx[i+1];
      bottomRight[1] = yy[j];
      topLeft[0] = xx[i];
      topLeft[1] = yy[j+1];
      signPlus = NumericVector(distFunCPP(topRight))[0] + NumericVector(distFunCPP(bottomLeft))[0];
      signMinus = NumericVector(distFunCPP(bottomRight))[0] + NumericVector(distFunCPP(topLeft))[0];
      summand = summand + NumericVector(psiCPP(bottomLeft))[0]*(signPlus-signMinus);
      //printf("summand:%f \n",summand);
    }
  }
  return summand;
}
I was pretty happy, since this C++ function works fine. However, when I timed both functions, the C++ one ran SLOWER:
library(Rcpp)
sourceCpp("EVofPsiCPP.cpp")
pFGM = function(u,theta){
  u[1]*u[2] + theta*u[1]*u[2]*(1-u[1])*(1-u[2])
}
psi = function(u){
  u[1]*u[2]
}
print(system.time(
  for(i in 1:10){
    test = EVofPsi(psi, pFGM, 1/100, 0.2)
  }
))
test
print(system.time(
  for(i in 1:10){
    # arguments follow the C++ signature: distFun first, then psi
    test = EVofPsiCPP(function(u){pFGM(u,0.2)}, psi, 100, 1/100)
  }
))
So, is there a kind expert around willing to explain this to me? Did I code like a monkey, and is there a way to speed up that function? I also have a second question: I could have replaced the return type double with SEXP, and the argument types Function with SEXP as well, and it doesn't seem to change anything. So what is the difference?
Thank you very much, Gildas
Recommended answer
Others have already answered in the comments, so I'll just emphasize the point: calling back into R functions is expensive because we need to be extra cautious about error handling. Just having the loop in C++ while still calling R functions is not rewriting your code in C++. Try rewriting psi and pFGM as C++ functions and report back here what happens.
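The suggestion to rewrite psi and pFGM in C++ might look roughly like the sketch below. It is plain C++ so that it can be compiled on its own; the name EVofPsiNative and the choice to pass theta and n directly are assumptions made for this illustration, not part of the original code. To call it from R, one would presumably add `#include <Rcpp.h>` and a `// [[Rcpp::export]]` attribute above the entry point.

```cpp
// psi from the question: psi(u) = u1 * u2
inline double psi(double u, double v) {
    return u * v;
}

// FGM distribution function from the question, with parameter theta
inline double pFGM(double u, double v, double theta) {
    return u * v + theta * u * v * (1.0 - u) * (1.0 - v);
}

// Hypothetical entry point: the same double sum as EVofPsiCPP, but every
// function evaluation stays in C++ instead of calling back into R.
double EVofPsiNative(double theta, int n) {
    const double eps = 1.0 / n;  // grid step on [0,1]
    double total = 0.0;
    for (int i = 0; i < n; ++i) {
        const double x0 = i * eps, x1 = (i + 1) * eps;
        for (int j = 0; j < n; ++j) {
            const double y0 = j * eps, y1 = (j + 1) * eps;
            // probability mass of the cell [x0,x1] x [y0,y1]
            const double mass = pFGM(x1, y1, theta) + pFGM(x0, y0, theta)
                              - pFGM(x1, y0, theta) - pFGM(x0, y1, theta);
            // psi is evaluated at the bottom-left corner, as in the question
            total += psi(x0, y0) * mass;
        }
    }
    return total;
}
```

With no R callbacks inside the loops, the 3 x n x n calls to R closures per evaluation (and the temporary NumericVector objects built around them) disappear, which is where the time was going.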
You might argue that you lose some flexibility, since you can no longer use an arbitrary R function. For situations like this, I'd advise some sort of hybrid solution in which the most common cases are implemented in C++ and everything else falls back to an R solution.
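One way such a hybrid could be organized is sketched below, again in plain C++ so it compiles on its own. The dispatcher EVofPsiHybrid, the family label "fgm", and the use of std::function as a stand-in for a user-supplied R function are all assumptions made for this illustration; in an Rcpp setting the fallback would be an R function passed in by the caller.

```cpp
#include <functional>
#include <stdexcept>
#include <string>

// Stand-in for an arbitrary user-supplied distribution function
// (assumption: in Rcpp this role would be played by an R function).
using Cdf2 = std::function<double(double, double)>;

// Natively implemented common case: the FGM distribution from the question.
inline double fgmCdf(double u, double v, double theta) {
    return u * v + theta * u * v * (1.0 - u) * (1.0 - v);
}

// The double sum from the question, generic over how the CDF is evaluated.
template <class Cdf>
double integratePsi(const Cdf& cdf, int n) {
    const double eps = 1.0 / n;
    double total = 0.0;
    for (int i = 0; i < n; ++i) {
        const double x0 = i * eps, x1 = (i + 1) * eps;
        for (int j = 0; j < n; ++j) {
            const double y0 = j * eps, y1 = (j + 1) * eps;
            const double mass = cdf(x1, y1) + cdf(x0, y0)
                              - cdf(x1, y0) - cdf(x0, y1);
            total += (x0 * y0) * mass;  // psi(u) = u1 * u2, as in the question
        }
    }
    return total;
}

// Hypothetical dispatcher: use the native fast path for known families,
// otherwise fall back to the (slower) user-supplied callback.
double EVofPsiHybrid(const std::string& family, double theta,
                     const Cdf2& fallback, int n) {
    if (family == "fgm") {
        return integratePsi(
            [theta](double u, double v) { return fgmCdf(u, v, theta); }, n);
    }
    if (!fallback) {
        throw std::invalid_argument("unknown family and no fallback supplied");
    }
    return integratePsi(fallback, n);
}
```

The fast path is a lambda the compiler can inline, while the fallback goes through an indirect call on every evaluation; with a real R callback that indirect call would be far more expensive still.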
As for the other question, a SEXP is an R object. It is part of the R API, and it can be anything. When you create a Function from it (as is done implicitly for you when you create a function that takes a Function argument), you are guaranteed that it is indeed an R function. The overhead is very small, but the gain in expressiveness of your code is huge.