Problem description
The data below is used for a comparative analysis. I wrote the code using apply() and while(), and although it works as expected, I haven't managed to optimize it further. On larger data sets the current run time is more than a couple of hours.
Following is a small example data set:
data_1
A B C D
2 1 3 2.5
data_2
P Q R S
3 2 4 5.5
Data
A B C D
1.0 0.5 1.3 1.5
1.5 1.2 5.5 3.5
1.1 0.5 1.3 1.5
1.5 1.2 5.5 3.5
1.5 1.2 5.5 3.5
1.1 0.5 1.3 1.5
1.5 1.2 5.5 3.5
1.0 0.5 1.3 1.5
Code
# Row counter
rowLine <<- 0
# Set current column to first one
columnLine <<- 1
# Preserve column header and dimensions for final data
finalData <- Data
# Check one column per call; apply() below calls this once per column
findThreshold <- function () {
  if ( columnLine <= ncol(Data) ){
    # Reset row counter to the first row
    rowLine <<- 1
    # Navigate through rows
    while (rowLine <= nrow(Data)){
      # If outside threshold
      if ( (Data[rowLine, columnLine] < data_1[columnLine]) |
           (Data[rowLine, columnLine] > data_2[columnLine])){
        finalData[rowLine, columnLine] <<- 1
      } else {
        finalData[rowLine, columnLine] <<- 0
      }
      # Increment row counter
      rowLine <<- rowLine + 1
    }
  }
  # Increment column counter
  columnLine <<- columnLine + 1
}
# Apply
apply(Data, 2, function(x) findThreshold())
I also understand that using <<- is a big no-no when it is combined with loops and with repeated calls through functions like apply().
Please suggest how I can improve this logic further, thanks.
Recommended answer
Sounds like a simple Map exercise:
data.frame(Map(function(d,l,h) d < l | d > h, Data, data_1, data_2))
# A B C D
#1 TRUE TRUE TRUE TRUE
#2 TRUE FALSE TRUE FALSE
#3 TRUE TRUE TRUE TRUE
#4 TRUE FALSE TRUE FALSE
#5 TRUE FALSE TRUE FALSE
#6 TRUE TRUE TRUE TRUE
#7 TRUE FALSE TRUE FALSE
#8 TRUE TRUE TRUE TRUE
Just wrap the logical comparison in as.integer if you want a 0/1 output instead:
data.frame(Map(function(d,l,h) as.integer(d < l | d > h), Data, data_1, data_2))
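For reference, here is a self-contained sketch of the Map approach on the example data. The data frame construction is an assumption (the question only shows printed tables), and the helper name `outside` is made up for illustration. Map passes one whole column of Data together with its matching lower and upper threshold on each call, so the comparison is vectorized over rows and no counters or `<<-` are needed.

```r
# Example data from the question, rebuilt as data frames (assumed layout).
Data <- data.frame(
  A = c(1.0, 1.5, 1.1, 1.5, 1.5, 1.1, 1.5, 1.0),
  B = c(0.5, 1.2, 0.5, 1.2, 1.2, 0.5, 1.2, 0.5),
  C = c(1.3, 5.5, 1.3, 5.5, 5.5, 1.3, 5.5, 1.3),
  D = c(1.5, 3.5, 1.5, 3.5, 3.5, 1.5, 3.5, 1.5)
)
data_1 <- data.frame(A = 2, B = 1, C = 3, D = 2.5)  # lower thresholds
data_2 <- data.frame(P = 3, Q = 2, R = 4, S = 5.5)  # upper thresholds

# Hypothetical helper: 1 if a value lies outside [l, h], otherwise 0.
# d is a full column of Data; l and h are the thresholds for that column.
outside <- function(d, l, h) as.integer(d < l | d > h)

# Map() walks the columns of the three data frames in parallel.
finalData <- data.frame(Map(outside, Data, data_1, data_2))

print(finalData)
```

The result keeps the column names of Data because Map takes its names from the first data argument.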
If your data are matrix objects to start with, you could use sweep:
sweep(Data, 2, data_1, FUN=`<`) | sweep(Data, 2, data_2, FUN=`>`)
# A B C D
#[1,] TRUE TRUE TRUE TRUE
#[2,] TRUE FALSE TRUE FALSE
#[3,] TRUE TRUE TRUE TRUE
#[4,] TRUE FALSE TRUE FALSE
#[5,] TRUE FALSE TRUE FALSE
#[6,] TRUE TRUE TRUE TRUE
#[7,] TRUE FALSE TRUE FALSE
#[8,] TRUE TRUE TRUE TRUE
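The sweep variant can be tried in the same way. The sketch below assumes Data is a numeric matrix and that the thresholds are plain numeric vectors with one entry per column; that layout is an assumption, since the question does not say how the objects are stored. The final `storage.mode` step is an optional extra that turns TRUE/FALSE into 1/0 while keeping the matrix shape.

```r
# Example data from the question as a numeric matrix (assumed layout).
Data <- matrix(
  c(1.0, 0.5, 1.3, 1.5,
    1.5, 1.2, 5.5, 3.5,
    1.1, 0.5, 1.3, 1.5,
    1.5, 1.2, 5.5, 3.5,
    1.5, 1.2, 5.5, 3.5,
    1.1, 0.5, 1.3, 1.5,
    1.5, 1.2, 5.5, 3.5,
    1.0, 0.5, 1.3, 1.5),
  ncol = 4, byrow = TRUE,
  dimnames = list(NULL, c("A", "B", "C", "D"))
)
data_1 <- c(2, 1, 3, 2.5)  # lower thresholds, one per column
data_2 <- c(3, 2, 4, 5.5)  # upper thresholds, one per column

# sweep() with MARGIN = 2 compares every column against its own threshold.
outside <- sweep(Data, 2, data_1, FUN = `<`) | sweep(Data, 2, data_2, FUN = `>`)

# Optional: convert the logical matrix to 0/1 without losing its dimensions.
storage.mode(outside) <- "integer"

print(outside)
```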