I am working with census data. The dataset looks like this:

Household-Id    Member-Type    Education    Birth
1               Father         12           1955
1               Mother         16           1963
1               Child          16           1986
1               Child          12           1995
2               Father         12           1950
2               Mother         9            1955
2               Child          18           1982
2               Child          14           1985
2               Child          16           1975
3               Father         16           1962
3               Mother         14           1965
3               Child          16           1990


I would like it to look like this:

Household-Id    Member-Type    Education    Birth    Mother-Education    Birth-Order
1               Father         12           1955
1               Mother         16           1963
1               Child          16           1986     16                  1
1               Child          12           1995     16                  2
2               Father         12           1950
2               Mother         9            1955
2               Child          18           1982     9                   1
2               Child          14           1985     9                   2
2               Child          16           1975     9                   3
3               Father         16           1962
3               Mother         14           1965
3               Child          16           1990     14                  1


As far as I know, R does not support loops the way languages such as Java or C do, and I honestly have no idea how to go about this!

Best answer

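Here is a dplyr approach. It ranks children by birth year within each household, so in household 2 the child born in 1975 gets Birth_Order 1, whereas the desired output in the question numbers that child 3: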

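library(dplyr)

dat <- dat %>%
  # Sort explicitly: recent dplyr versions ignore grouping in arrange()
  arrange(Household.Id, Member.Type, Birth) %>%
  group_by(Household.Id, Member.Type) %>%
  # Number the rows within each household/member type, oldest first
  mutate(Birth_Order = row_number(),
         Birth_Order = ifelse(Member.Type == "Child", Birth_Order, NA_integer_)) %>%
  group_by(Household.Id) %>%
  # Copy the mother's education onto the child rows of the same household
  mutate(Mother_Education = ifelse(Member.Type == "Child",
                                   Education[Member.Type == "Mother"][1], NA)) %>%
  ungroup()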

   Household.Id Member.Type Education Birth Birth_Order Mother_Education
1             1       Child        16  1986           1               16
2             1       Child        12  1995           2               16
3             1      Father        12  1955          NA               NA
4             1      Mother        16  1963          NA               NA
5             2       Child        16  1975           1                9
6             2       Child        18  1982           2                9
7             2       Child        14  1985           3                9
8             2      Father        12  1950          NA               NA
9             2      Mother         9  1955          NA               NA
10            3       Child        16  1990           1               14
11            3      Father        16  1962          NA               NA
12            3      Mother        14  1965          NA               NA
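
For comparison, the same two columns can also be sketched in base R, without dplyr. This is only an illustrative sketch and not part of the original answer. It assumes the data is read with read.table(), which turns the hyphenated headers into Household.Id and Member.Type, and that each household has at most one Mother row. The helper names is_child and mothers are made up for this example.

```r
# Rebuild the example data; read.table() converts "Household-Id" to "Household.Id"
dat <- read.table(header = TRUE, stringsAsFactors = FALSE, text = "
Household-Id Member-Type Education Birth
1 Father 12 1955
1 Mother 16 1963
1 Child 16 1986
1 Child 12 1995
2 Father 12 1950
2 Mother 9 1955
2 Child 18 1982
2 Child 14 1985
2 Child 16 1975
3 Father 16 1962
3 Mother 14 1965
3 Child 16 1990
")

# Hypothetical helper: logical flag for the child rows
is_child <- dat$Member.Type == "Child"

# Rank children by birth year within each household (1 = oldest);
# non-child rows stay NA
dat$Birth_Order <- NA_integer_
dat$Birth_Order[is_child] <- ave(dat$Birth[is_child],
                                 dat$Household.Id[is_child],
                                 FUN = function(x) rank(x, ties.method = "first"))

# Hypothetical helper: one row per household holding the mother's record
mothers <- dat[dat$Member.Type == "Mother", ]

# Look up the mother's education by household and keep it only on child rows;
# households without a Mother row get NA
dat$Mother_Education <- ifelse(
  is_child,
  mothers$Education[match(dat$Household.Id, mothers$Household.Id)],
  NA
)

print(dat)
```

Unlike the dplyr version, this keeps the rows in their original order. Birth_Order is still based on birth year, so the household 2 child born in 1975 is ranked first even though that row comes last among the children.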

Regarding "r - working with multiple rows that share the same Id (key) column value in R", we found a similar question on Stack Overflow: https://stackoverflow.com/questions/32341424/

10-12 16:49