Problem description

I have a dataset that reads something like this:


  record_id <- c(1, 1, 1, 1, 1, 2, 2, 2, 2, 2, 3, 3, 3, 3, 3, 3, 3)

  voucher_number <- c("app1", "00000", "11111", "22222", "11111", "app2", "33333", "44444", "33333",
                      "33333", "app", "55555", "66666", "55555", "66666", "55555", "77777")

  ds <- data.frame(record_id, voucher_number, stringsAsFactors = FALSE)


   record_id voucher_number
1          1
2          1          00000
3          1          11111
4          1          22222
5          1          11111
6          2
7          2          33333
8          2          44444
9          2          33333
10         2          33333
11         3
12         3          55555
13         3          66666
14         3          55555
15         3          66666
16         3          55555
17         3          77777

I want to write a function that, after grouping by record_id, creates a new variable, say ice. I want ice to be "app" if voucher_number is missing. Otherwise, I want to index voucher_number as 1, 2, 3 and so forth each time the same voucher_number repeats within a record_id. If a voucher_number is new for that record_id and has not been repeated, it should get 1.

Something like the following:

   record_id voucher_number ice
1          1           app1 app
2          1          00000   1
3          1          11111   1
4          1          22222   1
5          1          11111   2
6          2           app2 app
7          2          33333   1
8          2          44444   1
9          2          33333   2
10         2          33333   3
11         3            app app
12         3          55555   1
13         3          66666   1
14         3          55555   2
15         3          66666   2
16         3          55555   3
17         3          77777   1

Ultimately, I want the dataset to be ordered by record_id and voucher_number.

Thanks a lot!

Recommended answer

We can create a row number within each combination of record_id and voucher_number, then replace the ice value with "app" wherever voucher_number contains "app".

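library(dplyr)

ds %>%
  # one group per distinct voucher_number within each record_id
  group_by(record_id, voucher_number) %>%
  # number the rows within each group, then overwrite the index with "app"
  # on rows whose voucher_number contains "app" (ice becomes character)
  mutate(ice = row_number(),
         ice = replace(ice, grep('app', voucher_number), 'app'))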

#   record_id voucher_number ice
#       <dbl> <chr>          <chr>
# 1         1 app1           app
# 2         1 00000          1
# 3         1 11111          1
# 4         1 22222          1
# 5         1 11111          2
# 6         2 app2           app
# 7         2 33333          1
# 8         2 44444          1
# 9         2 33333          2
#10         2 33333          3
#11         3 app            app
#12         3 55555          1
#13         3 66666          1
#14         3 55555          2
#15         3 66666          2
#16         3 55555          3
#17         3 77777          1
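
The question also asks for the result to be ordered by record_id and voucher_number, which the pipeline above does not do. As a rough sketch, assuming you would rather avoid the dplyr dependency, the same index can be built in base R with ave() and then sorted with order(). The names idx and ds_sorted are only illustrative and are not part of the original answer.

```r
# Same sample data as in the question
record_id <- c(1, 1, 1, 1, 1, 2, 2, 2, 2, 2, 3, 3, 3, 3, 3, 3, 3)
voucher_number <- c("app1", "00000", "11111", "22222", "11111", "app2", "33333", "44444", "33333",
                    "33333", "app", "55555", "66666", "55555", "66666", "55555", "77777")
ds <- data.frame(record_id, voucher_number, stringsAsFactors = FALSE)

# Running count of each voucher_number within its record_id
# (idx is a hypothetical helper name)
idx <- ave(seq_along(ds$voucher_number), ds$record_id, ds$voucher_number,
           FUN = seq_along)

# Rows whose voucher_number contains "app" get the label "app";
# every other row keeps its running count, stored as text
ds$ice <- ifelse(grepl("app", ds$voucher_number), "app", as.character(idx))

# Order by record_id, then by voucher_number as a string.
# Where the "app" rows land within a record depends on the
# locale's string collation.
ds_sorted <- ds[order(ds$record_id, ds$voucher_number), ]
```

Because ice mixes the label "app" with counts, it has to be a character column. If the "app" row should always come first within a record, one option is to add !grepl("app", ds$voucher_number) as a sort key between the two existing ones.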

