This article covers how to generate a new variable based on the aging of another variable.

Problem description

I have a dataset like the one below:

ID. Invoice. Date of Invoice.  paid or not.

1    1         10/31/2019       yes
1    1         10/31/2019       yes
1    2         11/30/2019       no
1    3         12/31/2019       no

2    1         09/30/2019       no
2    2         10/30/2019       no
2    3         11/30/2019       yes

3    1         7/31/2019        no
3    2         9/30/2019        yes
3    3         12/31/2019       no

4    1         7/31/2019        yes
4    2         9/30/2019        no
4    3         12/31/2019       yes

I would like to gauge each customer's willingness to pay. As long as a customer has paid a newer invoice while an older invoice is still unpaid, I give that customer a good score. So customers 2, 3 and 4 get "good", and customer 1 gets "bad".

So the final data will have one more column, with the values good and bad.

ID. Invoice. Date of Invoice. paid or not. Bad or good

1    1         10/31/2019       yes          bad
1    1         10/31/2019       yes          bad
1    2         11/30/2019       no           bad
1    3         12/31/2019       no           bad

2    1         09/30/2019       no           good
2    2         10/30/2019       no           good
2    3         11/30/2019       yes          good

3    1         7/31/2019        no           good
3    2         9/30/2019        yes          good
3    3         12/31/2019       no           good

4    1         7/31/2019        yes          good
4    2         9/30/2019        no           good
4    3         12/31/2019       yes          good


Recommended answer

Assuming your Date of Invoice. column is already in ascending order within each ID, here is a base R solution using ave:

# Per ID: "bad" if the paid invoice is the first (oldest) one, otherwise "good"
df$`good or bad.` <- ave(df$`paid or not.`,df$ID., FUN = function(v) ifelse(which(v=="yes")==1,"bad","good"))
> df
  ID. Invoice. Date of Invoice. paid or not. good or bad.
1   1        1       09/30/2019           no         good
2   1        2       10/30/2019           no         good
3   1        3       11/30/2019          yes         good
4   2        1       10/31/2019          yes          bad
5   2        2       11/30/2019           no          bad
6   2        3       12/31/2019           no          bad
7   3        1        7/31/2019           no         good
8   3        2        9/30/2019          yes         good
9   3        3       12/31/2019           no         good

DATA

df <- structure(list(ID. = c(1L, 1L, 1L, 2L, 2L, 2L, 3L, 3L, 3L), Invoice. = c(1L,
2L, 3L, 1L, 2L, 3L, 1L, 2L, 3L), `Date of Invoice.` = c("09/30/2019",
"10/30/2019", "11/30/2019", "10/31/2019", "11/30/2019", "12/31/2019",
"7/31/2019", "9/30/2019", "12/31/2019"), `paid or not.` = c("no",
"no", "yes", "yes", "no", "no", "no", "yes", "no")), class = "data.frame", row.names = c(NA,
-9L))
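
The one-liner above appears to rely on every ID having exactly one "yes" row, which holds for this DATA but not for the question's sample (customer 1 has a duplicated paid invoice, customer 4 has two paid invoices). The following is a minimal sketch of a variant, not part of the original answer, that sorts by parsed date first and scores a customer "good" when any paid invoice is newer than an unpaid one. The data frame name `inv`, its column names, and the helper `score_customer` are hypothetical, and the dates are assumed to be in month/day/year format.

```r
# Question's sample data; column names are assumed for illustration.
inv <- data.frame(
  ID      = c(1, 1, 1, 1, 2, 2, 2, 3, 3, 3, 4, 4, 4),
  Invoice = c(1, 1, 2, 3, 1, 2, 3, 1, 2, 3, 1, 2, 3),
  Date    = c("10/31/2019", "10/31/2019", "11/30/2019", "12/31/2019",
              "09/30/2019", "10/30/2019", "11/30/2019",
              "7/31/2019", "9/30/2019", "12/31/2019",
              "7/31/2019", "9/30/2019", "12/31/2019"),
  Paid    = c("yes", "yes", "no", "no",
              "no", "no", "yes",
              "no", "yes", "no",
              "yes", "no", "yes"),
  stringsAsFactors = FALSE
)

# Sort by customer, then by parsed invoice date (assumed m/d/Y),
# so that row position within an ID reflects invoice age.
inv <- inv[order(inv$ID, as.Date(inv$Date, format = "%m/%d/%Y")), ]

# Hypothetical helper: "good" if any paid invoice comes after the
# oldest unpaid invoice, otherwise "bad". Returns a single string.
score_customer <- function(paid) {
  first_unpaid <- match("no", paid)  # NA when every invoice is paid
  paid_later <- !is.na(first_unpaid) &&
    any(which(paid == "yes") > first_unpaid)
  if (paid_later) "good" else "bad"
}

# ave() recycles the single label across all rows of each ID.
inv$Score <- ave(inv$Paid, inv$ID, FUN = score_customer)
print(inv)
```

Under this rule a customer with nothing paid, or with everything paid, is scored "bad"; whether that is the desired treatment depends on the business rule and is an assumption here.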

