I have the following dataset:
observation <- c(1:10)
pop.d.rank <- c(1:10)
cost.1 <- c(101:110)
cost.2 <- c(102:111)
cost.3 <- c(103:112)
all <- data.frame(observation,pop.d.rank,cost.1, cost.2, cost.3)
I want to allocate the following annual investment over three years:
annual.investment <- 500
I can do this for the first year with the following script:
library(dplyr)
all <- all %>%
mutate(capital_allocated.5G = diff(c(0, pmin(cumsum(cost), annual.investment)))) %>%
mutate(capital_percentage.5G = capital_allocated.5G / cost * 100) %>%
mutate(year = ifelse(capital_percentage.5G >= 50, "Year.1",0))
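To see why the diff/pmin/cumsum expression in this script spreads a budget down the rows in order, it may help to run it on a small vector. This is only a minimal sketch: the names cost and budget and the values used are illustrative stand-ins for the cost column and annual.investment, not data from the question.

```r
# Illustrative values only: three sites and a budget that cannot cover all of them
cost <- c(101, 102, 103)
budget <- 250

# Running total of what it would cost to fully fund rows 1..i: 101, 203, 306
running.cost <- cumsum(cost)

# Cap the running total at the budget: 101, 203, 250
capped <- pmin(running.cost, budget)

# Differences between consecutive capped totals are the amounts each row receives
allocated <- diff(c(0, capped))
print(allocated)
# allocated is 101, 102, 47: the first two rows are fully funded,
# the third gets whatever is left of the budget
```

Rows after the point where the budget runs out receive 0, because the capped running total stops growing.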
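However, when I try to do this for the second year, taking the previous year's investment into account, the code does not work. Here is my attempt at putting an ifelse statement inside mutate so that it does not overwrite the amount allocated in the previous year: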
all <- all %>%
mutate(capital_allocated.5G = ifelse(year == 0, diff(c(0, pmin(cumsum(cost), annual.investment))), 0) %>%
mutate(capital_percentage.5G = capital_allocated.5G / cost * 100) %>%
mutate(year = ifelse(capital_percentage.5G >= 50, "Year.2",0))
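I would like the data to look like the following, where the allocated amount goes first to any rows that were not 100% complete in the previous year.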
capital_allocated.5G <- c(101, 102, 103, 104, 105, 106, 107, 108, 109, 55)
capital_percentage.5G <- c(100, 100, 100, 100, 100, 100, 100, 100, 100, 50)
year <- c("Year.1", "Year.1","Year.1", "Year.1","Year.1", "Year.2", "Year.2","Year.2", "Year.2","Year.2")
example.output <- data.frame(observation,pop.d.rank,cost, capital_allocated.5G, capital_percentage.5G, year)
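Edit: cost.1 is the cost variable for year 1, cost.2 is the cost variable for year 2, and cost.3 is the cost variable for year 3.
Edit: problem with the previously accepted answer
I have realised that this ends up assigning more than 100 to the capital_percentage.5G variable. I have created a reproducible example. I think it has to do with the fact that some costs decrease over time while others increase.
The logic behind this is that the cost of deploying a 5G mobile network is specific to the year in which the investment is made, which is what the cost column for that point in time represents. Once that investment has been completed within a year, I want the function to give a capital_percentage.5G of 100% and then not allocate any more capital to that row in later years.
How can I cap the percentage at 100 and make sure no further capital is allocated afterwards?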
observation <- c(1:10)
pop.d.rank <- c(1:10)
cost.1 <- c(101:110)
cost.2 <- c(110:101)
cost.3 <- c(100:91)
all <- data.frame(observation,pop.d.rank,cost.1, cost.2, cost.3)
capital_allocated.5G <- rep(0,10) ## initialize to zero
capital_percentage.5G <- rep(0,10) ## initialize to zero
year <- rep(NA,10) ## initialize to NA
all <- data.frame(observation,pop.d.rank,cost.1, cost.2, cost.3, capital_allocated.5G,capital_percentage.5G,year)
alloc.invest <- function(df, ann.invest, y) {
df %>% mutate_(cost=paste0("cost.",y)) %>%
mutate(capital_percentage.5G = capital_allocated.5G / cost * 100,
year = ifelse(capital_percentage.5G < 50, NA, year),
not.yet.alloc = ifelse(capital_percentage.5G < 100,cost-capital_allocated.5G,0),
capital_allocated.5G = capital_allocated.5G + ifelse(capital_percentage.5G < 100,diff(c(0, pmin(cumsum(not.yet.alloc), ann.invest))), 0),
capital_percentage.5G = capital_allocated.5G / cost * 100,
year = ifelse(is.na(year) & capital_percentage.5G >= 50, paste0("Year.",y), year)) %>%
select(-cost,-not.yet.alloc)
}
annual.investment <- 500
all <- alloc.invest(all,annual.investment,1)
print(all)
all <- alloc.invest(all,annual.investment,2)
print(all)
all <- alloc.invest(all,annual.investment,3)
print(all)
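In year 3, in the final investment allocation here, capital_percentage.5G suddenly jumps to 110%.
Best answer
Updated for year-on-year costs that may increase or decrease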
For costs that may decrease or increase from year to year, we simply do not need to check whether capital_percentage.5G exceeds 100% when updating not.yet.alloc and capital_allocated.5G:
library(dplyr)
alloc.invest <- function(df, ann.invest, y) {
df %>% mutate_(cost=paste0("cost.",y)) %>%
mutate(capital_percentage.5G = capital_allocated.5G / cost * 100,
year = ifelse(capital_percentage.5G < 50, NA, year),
not.yet.alloc = cost-capital_allocated.5G,
capital_allocated.5G = capital_allocated.5G + diff(c(0, pmin(cumsum(not.yet.alloc), ann.invest))),
capital_percentage.5G = capital_allocated.5G / cost * 100,
year = ifelse(is.na(year) & capital_percentage.5G >= 50, paste0("Year.",y), year)) %>%
select(-cost,-not.yet.alloc)
}
With the new cost data:
observation <- c(1:10)
pop.d.rank <- c(1:10)
cost.1 <- c(101:110)
cost.2 <- c(110:101)
cost.3 <- c(100:91)
Augmented with the same initial-value columns as before:
capital_allocated.5G <- rep(0,10) ## initialize to zero
capital_percentage.5G <- rep(0,10) ## initialize to zero
year <- rep(NA,10) ## initialize to NA
all <- data.frame(observation,pop.d.rank,cost.1, cost.2, cost.3, capital_allocated.5G,capital_percentage.5G,year)
Year 1:
annual.investment <- 500
all <- alloc.invest(all,annual.investment,1)
print(all)
## observation pop.d.rank cost.1 cost.2 cost.3 capital_allocated.5G capital_percentage.5G year
##1 1 1 101 110 100 101 100.00000 Year.1
##2 2 2 102 109 99 102 100.00000 Year.1
##3 3 3 103 108 98 103 100.00000 Year.1
##4 4 4 104 107 97 104 100.00000 Year.1
##5 5 5 105 106 96 90 85.71429 Year.1
##6 6 6 106 105 95 0 0.00000 <NA>
##7 7 7 107 104 94 0 0.00000 <NA>
##8 8 8 108 103 93 0 0.00000 <NA>
##9 9 9 109 102 92 0 0.00000 <NA>
##10 10 10 110 101 91 0 0.00000 <NA>
Year 2:
all <- alloc.invest(all,annual.investment,2)
print(all)
## observation pop.d.rank cost.1 cost.2 cost.3 capital_allocated.5G capital_percentage.5G year
##1 1 1 101 110 100 110 100.00000 Year.1
##2 2 2 102 109 99 109 100.00000 Year.1
##3 3 3 103 108 98 108 100.00000 Year.1
##4 4 4 104 107 97 107 100.00000 Year.1
##5 5 5 105 106 96 106 100.00000 Year.1
##6 6 6 106 105 95 105 100.00000 Year.2
##7 7 7 107 104 94 104 100.00000 Year.2
##8 8 8 108 103 93 103 100.00000 Year.2
##9 9 9 109 102 92 102 100.00000 Year.2
##10 10 10 110 101 91 46 45.54455 <NA>
Year 3:
all <- alloc.invest(all,annual.investment,3)
print(all)
## observation pop.d.rank cost.1 cost.2 cost.3 capital_allocated.5G capital_percentage.5G year
##1 1 1 101 110 100 100 100 Year.1
##2 2 2 102 109 99 99 100 Year.1
##3 3 3 103 108 98 98 100 Year.1
##4 4 4 104 107 97 97 100 Year.1
##5 5 5 105 106 96 96 100 Year.1
##6 6 6 106 105 95 95 100 Year.2
##7 7 7 107 104 94 94 100 Year.2
##8 8 8 108 103 93 93 100 Year.2
##9 9 9 109 102 92 92 100 Year.2
##10 10 10 110 101 91 91 100 Year.3
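The three calls above can also be driven from a loop. The sketch below repeats the same logic so that it runs on its own, with two assumptions that are not part of the answer: the data frame is named sites (a hypothetical name, chosen to avoid shadowing base R's all()), and the deprecated mutate_ is swapped for the .data pronoun from current dplyr.

```r
library(dplyr)

# Same steps as alloc.invest above; only the column lookup differs (.data[[...]])
alloc.invest <- function(df, ann.invest, y) {
  cost.col <- paste0("cost.", y)
  df %>%
    mutate(cost = .data[[cost.col]],
           # re-express last year's allocation against this year's cost
           capital_percentage.5G = capital_allocated.5G / cost * 100,
           year = ifelse(capital_percentage.5G < 50, NA, year),
           # may be negative when this year's cost is below what was already allocated
           not.yet.alloc = cost - capital_allocated.5G,
           capital_allocated.5G = capital_allocated.5G +
             diff(c(0, pmin(cumsum(not.yet.alloc), ann.invest))),
           capital_percentage.5G = capital_allocated.5G / cost * 100,
           year = ifelse(is.na(year) & capital_percentage.5G >= 50,
                         paste0("Year.", y), year)) %>%
    select(-cost, -not.yet.alloc)
}

sites <- data.frame(observation = 1:10, pop.d.rank = 1:10,
                    cost.1 = 101:110, cost.2 = 110:101, cost.3 = 100:91,
                    capital_allocated.5G = rep(0, 10),
                    capital_percentage.5G = rep(0, 10),
                    year = rep(NA, 10))

for (y in 1:3) {
  sites <- alloc.invest(sites, 500, y)
  # Sanity check: no row is funded beyond 100% of that year's cost
  stopifnot(all(sites$capital_percentage.5G <= 100 + 1e-9))
}
print(sites)
```

One consequence of dropping the 100% check is worth keeping in mind: when a cost falls, not.yet.alloc is negative, so the surplus is taken back from rows that were already funded and becomes available to later rows in the same year.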
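The original issue with your code is that ifelse only switches the output based on the condition; it does not restrict the input cost used in the TRUE branch of the ifelse. Therefore, cumsum(cost) is computed over all of cost, not just over the portion of cost that belongs to the TRUE branch of the ifelse. To fix this, we can define the following function and then execute it for each year in turn.
library(dplyr)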
alloc.invest <- function(df, ann.invest, y) {
df %>% mutate(not.yet.alloc = ifelse(capital_percentage.5G < 100,cost-capital_allocated.5G,0),
capital_allocated.5G = capital_allocated.5G + ifelse(capital_percentage.5G < 100,diff(c(0, pmin(cumsum(not.yet.alloc), ann.invest))), 0),
capital_percentage.5G = capital_allocated.5G / cost * 100,
year = ifelse(is.na(year) & capital_percentage.5G >= 50, paste0("Year.",y), year)) %>%
select(-not.yet.alloc)
}
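The ifelse behaviour described before the function can be reproduced with a small standalone example. The vectors x and funded are hypothetical and exist only for this illustration.

```r
x <- c(10, 20, 30, 40)
funded <- c(TRUE, TRUE, FALSE, FALSE)  # pretend rows 1-2 were funded in an earlier year

# cumsum(x) is evaluated over all four values before ifelse picks elements from it
wrong <- ifelse(!funded, cumsum(x), 0)
print(wrong)
# wrong is 0, 0, 60, 100: the funded rows still count toward the running total

# Zero out the funded rows first, then accumulate
remaining <- ifelse(!funded, x, 0)
right <- cumsum(remaining)
print(right)
# right is 0, 0, 30, 70
```

This is the reason the function builds not.yet.alloc first and only then takes the cumulative sum.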
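Notes:
A new temporary column not.yet.alloc is created, from which the cumsum for the current year's allocation is computed.
A single mutate statement is enough; there is no need to chain several.
We also need to check is.na(year) before setting year. Otherwise, a year that was already marked would be overwritten.
To use this function, we must first augment the input data with some initial values for capital_allocated.5G, capital_percentage.5G and year:
capital_allocated.5G <- rep(0,10) ## initialize to zero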
capital_percentage.5G <- rep(0,10) ## initialize to zero
year <- rep(NA,10) ## initialize to NA
all <- data.frame(observation,pop.d.rank,cost,capital_allocated.5G,capital_percentage.5G,year)
Then for year 1:
annual.investment <- 500
all <- alloc.invest(all,annual.investment,1)
print(all)
## observation pop.d.rank cost capital_allocated.5G capital_percentage.5G year
##1 1 1 101 101 100.00000 Year.1
##2 2 2 102 102 100.00000 Year.1
##3 3 3 103 103 100.00000 Year.1
##4 4 4 104 104 100.00000 Year.1
##5 5 5 105 90 85.71429 Year.1
##6 6 6 106 0 0.00000 <NA>
##7 7 7 107 0 0.00000 <NA>
##8 8 8 108 0 0.00000 <NA>
##9 9 9 109 0 0.00000 <NA>
##10 10 10 110 0 0.00000 <NA>
Year 2:
all <- alloc.invest(all,annual.investment,2)
print(all)
## observation pop.d.rank cost capital_allocated.5G capital_percentage.5G year
##1 1 1 101 101 100 Year.1
##2 2 2 102 102 100 Year.1
##3 3 3 103 103 100 Year.1
##4 4 4 104 104 100 Year.1
##5 5 5 105 105 100 Year.1
##6 6 6 106 106 100 Year.2
##7 7 7 107 107 100 Year.2
##8 8 8 108 108 100 Year.2
##9 9 9 109 109 100 Year.2
##10 10 10 110 55 50 Year.2
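Update for the new requirement of costs that change each year
If the cost differs from year to year, the function first needs to readjust capital_percentage.5G, and possibly the year column as well:
library(dplyr)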
alloc.invest <- function(df, ann.invest, y) {
df %>% mutate_(cost=paste0("cost.",y)) %>%
mutate(capital_percentage.5G = capital_allocated.5G / cost * 100,
year = ifelse(capital_percentage.5G < 50, NA, year),
not.yet.alloc = ifelse(capital_percentage.5G < 100,cost-capital_allocated.5G,0),
capital_allocated.5G = capital_allocated.5G + ifelse(capital_percentage.5G < 100,diff(c(0, pmin(cumsum(not.yet.alloc), ann.invest))), 0),
capital_percentage.5G = capital_allocated.5G / cost * 100,
year = ifelse(is.na(year) & capital_percentage.5G >= 50, paste0("Year.",y), year)) %>%
select(-cost,-not.yet.alloc)
}
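The mutate_ call in this function selects the cost column by a name built at run time. The underscored verbs have since been deprecated in dplyr, so here is a hedged sketch of the same lookup with the .data pronoun. The helper with.cost and the two-row data frame are hypothetical and are not part of the answer.

```r
library(dplyr)

# Small illustrative data frame with one cost column per year
df <- data.frame(cost.1 = c(101, 102), cost.2 = c(110, 109))

# Hypothetical helper: copy the cost column for year y into a temporary `cost` column
with.cost <- function(df, y) {
  cost.col <- paste0("cost.", y)           # e.g. "cost.2"
  df %>% mutate(cost = .data[[cost.col]])  # look the column up by its name
}

print(with.cost(df, 2)$cost)
# prints the values of cost.2: 110 and 109
```

The rest of the function can then refer to cost directly, which is the convenience the temporary column is meant to provide.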
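Note that creating another temporary column cost with mutate_ is just for convenience, because the cost column has to be selected dynamically based on the input y (otherwise we would need to use mutate_ for all of the computations, which would be somewhat messy).
With the updated data similarly augmented by initial values for capital_allocated.5G, capital_percentage.5G and year, for year 1:
annual.investment <- 500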
all <- alloc.invest(all,annual.investment,1)
print(all)
## observation pop.d.rank cost.1 cost.2 cost.3 capital_allocated.5G capital_percentage.5G year
##1 1 1 101 102 103 101 100.00000 Year.1
##2 2 2 102 103 104 102 100.00000 Year.1
##3 3 3 103 104 105 103 100.00000 Year.1
##4 4 4 104 105 106 104 100.00000 Year.1
##5 5 5 105 106 107 90 85.71429 Year.1
##6 6 6 106 107 108 0 0.00000 <NA>
##7 7 7 107 108 109 0 0.00000 <NA>
##8 8 8 108 109 110 0 0.00000 <NA>
##9 9 9 109 110 111 0 0.00000 <NA>
##10 10 10 110 111 112 0 0.00000 <NA>
Year 2: note that the last asset is allocated less than 50% of its cost, so its year is still NA.
all <- alloc.invest(all,annual.investment,2)
print(all)
## observation pop.d.rank cost.1 cost.2 cost.3 capital_allocated.5G capital_percentage.5G year
##1 1 1 101 102 103 102 100.00000 Year.1
##2 2 2 102 103 104 103 100.00000 Year.1
##3 3 3 103 104 105 104 100.00000 Year.1
##4 4 4 104 105 106 105 100.00000 Year.1
##5 5 5 105 106 107 106 100.00000 Year.1
##6 6 6 106 107 108 107 100.00000 Year.2
##7 7 7 107 108 109 108 100.00000 Year.2
##8 8 8 108 109 110 109 100.00000 Year.2
##9 9 9 109 110 111 110 100.00000 Year.2
##10 10 10 110 111 112 46 41.44144 <NA>
Year 3:
all <- alloc.invest(all,annual.investment,3)
print(all)
## observation pop.d.rank cost.1 cost.2 cost.3 capital_allocated.5G capital_percentage.5G year
##1 1 1 101 102 103 103 100 Year.1
##2 2 2 102 103 104 104 100 Year.1
##3 3 3 103 104 105 105 100 Year.1
##4 4 4 104 105 106 106 100 Year.1
##5 5 5 105 106 107 107 100 Year.1
##6 6 6 106 107 108 108 100 Year.2
##7 7 7 107 108 109 109 100 Year.2
##8 8 8 108 109 110 110 100 Year.2
##9 9 9 109 110 111 111 100 Year.2
##10 10 10 110 111 112 112 100 Year.3