Generating a new variable based on the aging of another variable
Problem description
I have a dataset like the one below:
ID. Invoice. Date of Invoice. paid or not.
1 1 10/31/2019 yes
1 1 10/31/2019 yes
1 2 11/30/2019 no
1 3 12/31/2019 no
2 1 09/30/2019 no
2 2 10/30/2019 no
2 3 11/30/2019 yes
3 1 7/31/2019 no
3 2 9/30/2019 yes
3 3 12/31/2019 no
4 1 7/31/2019 yes
4 2 9/30/2019 no
4 3 12/31/2019 yes
I would like to gauge each customer's willingness to pay. As long as a customer has paid a newer invoice while an older invoice is still unpaid, I give that customer a good score. So in the data above, customers 2, 3 and 4 get "good", and customer 1 gets "bad".
The final data will therefore have one more column, with the values "good" and "bad":
ID. Invoice. Date of Invoice. paid or not. Bad or good
1 1 10/31/2019 yes bad
1 1 10/31/2019 yes bad
1 2 11/30/2019 no bad
1 3 12/31/2019 no bad
2 1 09/30/2019 no good
2 2 10/30/2019 no good
2 3 11/30/2019 yes good
3 1 7/31/2019 no good
3 2 9/30/2019 yes good
3 3 12/31/2019 no good
4 1 7/31/2019 yes good
4 2 9/30/2019 no good
4 3 12/31/2019 yes good
Recommended answer
Assuming your `Date of Invoice.` column is already ordered within each ID, here is a base R solution using ave:
# "good" if any paid invoice comes after an unpaid one within the same ID
df$`good or bad.` <- ave(df$`paid or not.`, df$ID.,
  FUN = function(v) if (any(v == "yes" & cumsum(v == "no") > 0)) "good" else "bad")
> df
ID. Invoice. Date of Invoice. paid or not. good or bad.
1 1 1 09/30/2019 no good
2 1 2 10/30/2019 no good
3 1 3 11/30/2019 yes good
4 2 1 10/31/2019 yes bad
5 2 2 11/30/2019 no bad
6 2 3 12/31/2019 no bad
7 3 1 7/31/2019 no good
8 3 2 9/30/2019 yes good
9 3 3 12/31/2019 no good
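The answer relies on the rows being in date order within each ID. If that is not guaranteed, one possible preparation step is to parse the date strings and sort before scoring. The sketch below is only an illustration: the small unordered data frame and the names `invoice_date` and `df_sorted` are made up for this example, and it assumes the dates are always written as month/day/year. Sorting the raw strings would not work, since "12/31/2019" sorts before "7/31/2019" as text.

```r
# Hypothetical unordered input; column names follow the answer's DATA block.
df <- data.frame(
  ID. = c(1L, 1L, 1L),
  Invoice. = c(3L, 1L, 2L),
  `Date of Invoice.` = c("11/30/2019", "09/30/2019", "10/30/2019"),
  `paid or not.` = c("yes", "no", "no"),
  check.names = FALSE,
  stringsAsFactors = FALSE
)

# Parse month/day/year strings so sorting is chronological, not alphabetical.
invoice_date <- as.Date(df$`Date of Invoice.`, format = "%m/%d/%Y")

# Order by customer first, then by invoice date within each customer.
df_sorted <- df[order(df$ID., invoice_date), ]
rownames(df_sorted) <- NULL
```

After this step, `df_sorted` can be passed to the per-ID scoring in place of `df`.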
DATA
df <- structure(list(ID. = c(1L, 1L, 1L, 2L, 2L, 2L, 3L, 3L, 3L), Invoice. = c(1L,
2L, 3L, 1L, 2L, 3L, 1L, 2L, 3L), `Date of Invoice.` = c("09/30/2019",
"10/30/2019", "11/30/2019", "10/31/2019", "11/30/2019", "12/31/2019",
"7/31/2019", "9/30/2019", "12/31/2019"), `paid or not.` = c("no",
"no", "yes", "yes", "no", "no", "no", "yes", "no")), class = "data.frame", row.names = c(NA,
-9L))
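The DATA above has exactly one paid invoice per customer, whereas the question's example also contains customers with several paid invoices and a repeated row. As a rough check of the rule stated in the question (a paid invoice that comes after an unpaid one means "good"), the sketch below applies it to the question's four customers. The helper name `score_customer` and the data frame `q` are hypothetical, and the rows are assumed to be in date order within each ID.

```r
# The question's example data, including the repeated row for customer 1.
q <- data.frame(
  ID. = c(1L, 1L, 1L, 1L, 2L, 2L, 2L, 3L, 3L, 3L, 4L, 4L, 4L),
  `paid or not.` = c("yes", "yes", "no", "no",
                     "no", "no", "yes",
                     "no", "yes", "no",
                     "yes", "no", "yes"),
  check.names = FALSE,
  stringsAsFactors = FALSE
)

# Hypothetical helper: "good" if any paid invoice follows an unpaid one.
score_customer <- function(v) {
  # TRUE from the first unpaid invoice onward
  unpaid_before <- cumsum(v == "no") > 0
  if (any(v == "yes" & unpaid_before)) "good" else "bad"
}

# ave() recycles the single label over every row of the customer.
q$`good or bad.` <- ave(q$`paid or not.`, q$ID., FUN = score_customer)
```

Returning a single label per group keeps the function well defined for customers with no paid invoices at all, who are scored "bad" under this reading of the rule.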