问题描述
我似乎被困在一个非常基本的问题上,我知道这很简单,但我无法弄清楚。
I seemed to be stuck at a very basic problem, I know its easy but I am not able to figure out.
所以我的数据包含HireDate和TermDate。 TermDate是任何员工的最后一天。
So My data has HireDate and TermDate. TermDate is the last day of any employee.
我要按照以下步骤操作:
I want to do as follow:
Leavers =从TermDate中获取的当月计数
特定月份的营业额=当月离职人数/ AVG(上个月和当月的行计数)
复制数据
结构( list(HireDate = structure(c(17702,13242,16895,17167,
12335,13879,12303,13745,14789,16785,15390,17167,12886,
13472,15569,13796,16811, 11484、13062、17592、16113、13437,
15614、17167、17167、16251、17623、13312、14165、17167、17167,
10695、15764、13749、16801、17167、13594、13874, 17167,17167,
13157,17167,12501,13243,12192,12287,12965,13328,17167,
13343,17167,17167,11839,17167,13262,13326,14124,16161,
17167、17226、12786、13823、13822、13255、17704、17653、12258,
12769、13727、10712、17400、13952、14048、14333、17233、17690,
13108、13383、13517、13829、17213、13696、16741、17167、17241,
12198、14018、12902、16801、17167、17591、12843、13627、14553,
15593, 16097,16801,13075,13529,17167),class = Date),TermDate = structure(c(NA,
13439,17712,NA,12880,15408,12877,16493,17135,16944,17135,
NA,14054,15670,17531,14327,NA,13889,NA,NA,16741,17135,
17620,17620,17354,17316,NA,13312,17166,NA,NA,15705,
NA,15112,NA,NA,15705,13970,17655,NA,13612,NA,15418,
15917,15705,NA,14274,13449,NA,13559,17417,NA,14400,
NA,NA,14334、14813、16343、17703,NA,12824、15711、15411,
14484,NA,NA,NA,15309、16493、17197,NA,14911、16957、15882,
NA,NA,14435、13768、13517、14907,NA,17284,NA,NA,NA,12772,
17166,NA,16881、17439,NA,14944,NA,15028、16581, 16778,
NA,13788,14064,17620),class = Date)),row.names = 14296:14395,class = data.frame)
推荐答案
有点冗长,但可以使用:
A bit lengthy but it would work:
library(data.table)
df_leavers <- setDT(df)[, `:=` (TermDate = as.Date(as.character(TermDate)),
HireDate = as.Date(as.character(HireDate)))]
df_presences <- copy(df_leavers)
df_leavers <- df_leavers[, TermDate := format(TermDate, "%Y-%m")][!is.na(TermDate), (Leavers = .N), , by = TermDate]
df_presences <- df_presences[, maxTerm := max(TermDate, na.rm = T)][
is.na(TermDate), TermDate := maxTerm][
, .(YearMonth = format(seq(HireDate, TermDate, by = "month"), "%Y-%m")), by = 1:nrow(df)][
, (Presences = .N), by = YearMonth]
df_final <- df_leavers[df_presences, on = .(TermDate = YearMonth)]
setnames(df_final, c("YearMonth", "Leavers", "Presences"))
df_final <- df_final[is.na(Leavers), Leavers := 0][order(YearMonth),][, previousMonth := shift(Presences)][
is.na(previousMonth), previousMonth := 0][, AvgPresences := (Presences + previousMonth) / 2][
, Turnover := round(Leavers / AvgPresences, 2)][, "previousMonth" := NULL]
输出(数据集的开头和结尾):
Output (beginning and end of dataset):
YearMonth Leavers Presences AvgPresences Turnover
1: 1999-04 0 1 0.5 0.00
2: 1999-05 0 2 1.5 0.00
3: 1999-06 0 2 2.0 0.00
4: 1999-07 0 2 2.0 0.00
5: 1999-08 0 2 2.0 0.00
---
227: 2018-02 0 32 32.5 0.00
228: 2018-03 3 36 34.0 0.09
229: 2018-04 0 33 34.5 0.00
230: 2018-05 1 34 33.5 0.03
231: 2018-06 2 36 35.0 0.06
这篇关于在月营业额上寻找月份的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持!