Question
...besides the fact that Rscript is invoked with #!/usr/bin/env Rscript and littler with #!/usr/local/bin/r (on my system) in the first line of the script file. I've found certain differences in execution speed (littler seems to be a bit slower).
I've created two dummy scripts, ran each 1000 times, and compared the average execution times.
Here's the Rscript file:
#!/usr/bin/env Rscript
btime <- proc.time()
x <- rnorm(100)
print(x)
print(plot(x))
etime <- proc.time()
tm <- etime - btime
sink(file = "rscript.r.out", append = TRUE)
cat(paste(tm[1:3], collapse = ";"), "\n")
sink()
print(tm)
And here's the littler file:
#!/usr/local/bin/r
btime <- proc.time()
x <- rnorm(100)
print(x)
print(plot(x))
etime <- proc.time()
tm <- etime - btime
sink(file = "little.r.out", append = TRUE)
cat(paste(tm[1:3], collapse = ";"), "\n")
sink()
print(tm)
As you can see, they are almost identical (only the first line and the sink file argument differ). The output is sink()ed to a text file and then imported into R with read.table. I've created a bash script to execute each script 1000 times, then calculated the averages.
Here's the bash script:
for i in `seq 1000`
do
./$1
echo "####################"
echo "Iteration #$i"
echo "####################"
done
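The import-and-average step described above might look roughly like the following sketch. It assumes each line of the output file has the form "user;system;elapsed", as written by the cat(paste(tm[1:3], collapse = ";"), "\n") call in the scripts; the helper names read_timings and summarise_timings are hypothetical, and the demo numbers are made up purely so the snippet runs without the real output files.

```r
# Sketch: import timings written via sink()/cat() and summarise them.
# read_timings() and summarise_timings() are illustrative names, not part of
# the original post.
read_timings <- function(path) {
  # Fields are separated by ";"; read.table strips the trailing blank that
  # cat() leaves after the last numeric field.
  read.table(path, sep = ";", header = FALSE,
             col.names = c("user", "system", "elapsed"))
}

summarise_timings <- function(timings) {
  # colMeans() is used here because mean() on a data frame no longer
  # returns per-column means in current R versions.
  rbind(mean = colMeans(timings), median = sapply(timings, median))
}

# Self-contained demo with made-up numbers (not real measurements)
demo_file <- tempfile(fileext = ".out")
writeLines(c("0.20;0.01;0.25 ", "0.24;0.01;0.29 ", "0.22;0.01;0.27 "), demo_file)
demo <- read_timings(demo_file)
print(summarise_timings(demo))
```

With the real files, the same two calls would be applied to "rscript.r.out" and "little.r.out".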
The results are:
# littler script
> mean(lit)
    user   system  elapsed
0.489327 0.035458 0.588647
> sapply(lit, median)
   L1    L2    L3
0.490 0.036 0.609
# Rscript
> mean(rsc)
    user   system  elapsed
0.219334 0.008042 0.274017
> sapply(rsc, median)
   R1    R2    R3
0.220 0.007 0.258
Long story short: besides the (obvious) execution-time difference, are there any other differences? The more important question is: why should or shouldn't you prefer littler over Rscript (or vice versa)?
Recommended answer
A few quick comments:
The path /usr/local/bin/r is arbitrary; you can use /usr/bin/env r as well, as we do in some examples. As I recall, this limits what other arguments you can give to r, as it takes only one when invoked via env.
I don't understand your benchmark, or why you'd do it that way. We do have timing comparisons in the sources; see tests/timing.sh and tests/timing2.sh. Maybe you want to split the test between startup and graph creation, or whatever it is you are after.
Whenever we ran those tests, littler won. (It still won when I re-ran them just now.) That made sense to us because, if you look at the sources of Rscript.exe, it works differently: it sets up the environment and a command string before eventually calling execv(cmd, av). littler can start a little quicker.
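To look at startup cost on its own, separately from the printing and plotting work, one option is to time each front end running an empty expression. The sketch below is an illustration only, not the project's tests/timing.sh; it assumes both front ends accept -e with an expression, that littler is installed as r on the PATH, and time_startup is a hypothetical helper name.

```r
# Sketch: measure interpreter startup only, by evaluating an empty expression.
# time_startup() is an illustrative helper, not part of littler or R.
time_startup <- function(cmd, n = 20) {
  # Return NA if the front end is not installed on this machine
  if (!nzchar(Sys.which(cmd))) return(NA_real_)
  elapsed <- replicate(n, system.time(
    system2(cmd, c("-e", shQuote("invisible(NULL)")),
            stdout = FALSE, stderr = FALSE)
  )[["elapsed"]])
  # Trimmed mean to dampen the effect of outliers
  mean(elapsed, trim = 0.05)
}

# Assumption: littler's binary is called "r"; adjust the name if your
# installation uses a different one.
startup <- sapply(c(Rscript = "Rscript", littler = "r"), time_startup)
print(startup)
```

Any remaining difference in a full script run can then be attributed to the work inside the script rather than to the front end.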
The main price is portability. The way littler is built, it won't make it to Windows, or at least not easily. OTOH, we have ported RInside, so if someone really wanted to...
littler came first, in September 2006, whereas Rscript arrived with R 2.5.0 in April 2007.
Rscript is now available everywhere R is. That is a big advantage.
Command-line options are a little more sensible for littler, in my view.
Both work with the CRAN packages getopt and optparse for option parsing.
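As a rough illustration of option parsing with optparse, a script could be structured as below. This is a minimal sketch, assuming the optparse package is installed; the --count and --quiet options are made up for the example, and the shebang line can be swapped for a littler one.

```r
#!/usr/bin/env Rscript
# Sketch: command-line option parsing with the CRAN package optparse.
# The options below are hypothetical and exist only for illustration.
suppressPackageStartupMessages(library(optparse))

option_list <- list(
  make_option(c("-n", "--count"), type = "integer", default = 100L,
              help = "number of random draws [default %default]"),
  make_option(c("-q", "--quiet"), action = "store_true", default = FALSE,
              help = "do not print the summary")
)
parser <- OptionParser(option_list = option_list)

# Parse whatever was passed after the script name
opt <- parse_args(parser, args = commandArgs(trailingOnly = TRUE))

x <- rnorm(opt$count)
if (!opt$quiet) print(summary(x))
```

Saved as an executable file, it could be called with, for example, -n 1000 or --quiet.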
So it's a personal preference. I co-wrote littler, learned a lot doing that (e.g. for RInside), and still find it useful, so I use it dozens of times each day. It drives CRANberries. It drives cran2deb. Your mileage may, as they say, vary.
Disclaimer: littler is one of my projects.
Postscript: I would have written the test as
fun <- function() { x <- rnorm(100); print(x); print(plot(x)) }
replicate(N, system.time(fun())["elapsed"])
or even
mean(replicate(N, system.time(fun())["elapsed"]), trim = 0.05)
to get rid of the outliers. Moreover, you are essentially only measuring I/O (a print and a plot), which both front ends get from the same R library, so I would expect little difference.