Problem description
I have the following type of data:
Person <- c("A", "B", "C", "AB", "BC", "AC", "D", "E")
Father <- c(NA, NA, NA, "A", "B", "C", NA, "D")
Mother <- c(NA, NA, NA, "B", "C", "A", "C", NA)
var1 <- c( 1, 2, 3, 4, 2, 1, 6, 9)
var2 <- c(1.4, 2.3, 4.3, 3.4, 4.2, 6.1, 2.6, 8.2)
myd <- data.frame (Person, Father, Mother, var1, var2)
  Person Father Mother var1 var2
1      A   <NA>   <NA>    1  1.4
2      B   <NA>   <NA>    2  2.3
3      C   <NA>   <NA>    3  4.3
4     AB      A      B    4  3.4
5     BC      B      C    2  4.2
6     AC      C      A    1  6.1
7      D   <NA>      C    6  2.6
8      E      D   <NA>    9  8.2
Here <NA> marks a missing (unknown) parent. I want to reorganize the data into trios (an individual plus its father and mother). For example, the trio for individual AB includes the data from its father A and its mother B.
  Person Father Mother var1 var2
1      A   <NA>   <NA>    1  1.4
2      B   <NA>   <NA>    2  2.3
4     AB      A      B    4  3.4
A, B, and C cannot form trios of their own because they have no known parents. In some cases only one parent is known: E has only a known father, D, and D has only a known mother, C. In that case the trio has just two members.
  Person Father Mother var1 var2
7      D   <NA>      C    6  2.6
3      C   <NA>   <NA>    3  4.3
If a mother or father appears in more than one trio, the same values are reused in each.
Thus the expected complete output would be:
   Person Father Mother var1 var2 Trio
1       A   <NA>   <NA>    1  1.4    1
2       B   <NA>   <NA>    2  2.3    1
4      AB      A      B    4  3.4    1
2       B   <NA>   <NA>    2  2.3    2
3       C   <NA>   <NA>    3  4.3    2
5      BC      B      C    2  4.2    2
1       A   <NA>   <NA>    1  1.4    3
3       C   <NA>   <NA>    3  4.3    3
6      AC      C      A    1  6.1    3
NA   <NA>   <NA>   <NA>   NA   NA    4
3       C   <NA>   <NA>    3  4.3    4
7       D   <NA>      C    6  2.6    4
NA   <NA>   <NA>   <NA>   NA   NA    5
7       D   <NA>      C    6  2.6    5
8       E      D   <NA>    9  8.2    5
Recommended answer
This may be roughly what you want:
Person <- c("A", "B", "C", "AB", "BC", "AC", "D", "E")
Father <- c(NA, NA, NA, "A", "B", "C", NA, "D")
Mother <- c(NA, NA, NA, "B", "C", "A", "C", NA)
var1 <- c( 1, 2, 3, 4, 2, 1, 6, 9)
var2 <- c(1.4, 2.3, 4.3, 3.4, 4.2, 6.1, 2.6, 8.2)
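myd <- data.frame(Person, Father, Mother, var1, var2, stringsAsFactors = FALSE)
Note that the definition of myd has changed slightly: it now sets stringsAsFactors = FALSE, so the name columns are plain character vectors.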
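To make the effect of that flag concrete, here is a minimal sketch on a hypothetical two-row data frame (the names `as_factor` and `as_string` are illustrative and not part of the answer). It only shows how the column type differs; since R 4.0.0 the default is already `stringsAsFactors = FALSE`, so on a recent R the flag is mostly a safeguard.

```r
# Hypothetical two-row data, used only to show the effect of stringsAsFactors
as_factor <- data.frame(Person = c("A", "AB"), Father = c(NA, "A"),
                        stringsAsFactors = TRUE)
as_string <- data.frame(Person = c("A", "AB"), Father = c(NA, "A"),
                        stringsAsFactors = FALSE)

class(as_factor$Father)  # factor: integer codes plus a set of levels
class(as_string$Father)  # character: plain strings

# With character columns, looking up a parent by name is a plain
# string comparison against the Person column
father_of_AB <- as_string[as_string$Person == as_string$Father[2], ]
```

With character columns the parent name taken from one row can be compared against Person directly, which is what the lookup in the function below relies on.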
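# Return the row of individual x (a row index) followed by the rows
# of whichever parents are known
parentage <- function(x, myd) {
  y <- myd[x, ]
  p1 <- y$Father
  p2 <- y$Mother
  out <- y
  if (!is.na(p1)) {
    out <- rbind(out, myd[myd$Person == p1, ])
  }
  if (!is.na(p2)) {
    out <- rbind(out, myd[myd$Person == p2, ])
  }
  out$Trio <- x  # label the group with the individual's row index
  out
}
ans <- lapply(seq_along(myd$Person), parentage, myd)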
> ans
[[1]]
  Person Father Mother var1 var2 Trio
1      A   <NA>   <NA>    1  1.4    1

[[2]]
  Person Father Mother var1 var2 Trio
2      B   <NA>   <NA>    2  2.3    2

[[3]]
  Person Father Mother var1 var2 Trio
3      C   <NA>   <NA>    3  4.3    3

[[4]]
  Person Father Mother var1 var2 Trio
4     AB      A      B    4  3.4    4
1      A   <NA>   <NA>    1  1.4    4
2      B   <NA>   <NA>    2  2.3    4

[[5]]
  Person Father Mother var1 var2 Trio
5     BC      B      C    2  4.2    5
2      B   <NA>   <NA>    2  2.3    5
3      C   <NA>   <NA>    3  4.3    5

[[6]]
  Person Father Mother var1 var2 Trio
6     AC      C      A    1  6.1    6
3      C   <NA>   <NA>    3  4.3    6
1      A   <NA>   <NA>    1  1.4    6

[[7]]
  Person Father Mother var1 var2 Trio
7      D   <NA>      C    6  2.6    7
3      C   <NA>   <NA>    3  4.3    7

[[8]]
  Person Father Mother var1 var2 Trio
8      E      D   <NA>    9  8.2    8
7      D   <NA>      C    6  2.6    8
If you want a single data frame instead of a list, you can use the plyr package:
library(plyr)
ans<-adply(seq_along(myd$Person),1,parentage,myd)
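The result above numbers every individual as its own group and omits rows for unknown parents, so it differs from the layout requested in the question. The following is a minimal base R sketch of one way to get closer to that layout, not part of the original answer: `build_trio`, `children` and `trios` are hypothetical names. It assumes that only individuals with at least one known parent form a trio, that an unknown parent becomes an all-NA placeholder row, and that rows within a trio appear in table order with the placeholder first, as in the question's expected output.

```r
# Sketch only: build_trio(), children and trios are illustrative names
myd <- data.frame(
  Person = c("A", "B", "C", "AB", "BC", "AC", "D", "E"),
  Father = c(NA, NA, NA, "A", "B", "C", NA, "D"),
  Mother = c(NA, NA, NA, "B", "C", "A", "C", NA),
  var1   = c(1, 2, 3, 4, 2, 1, 6, 9),
  var2   = c(1.4, 2.3, 4.3, 3.4, 4.2, 6.1, 2.6, 8.2),
  stringsAsFactors = FALSE
)

# Row indices of individuals with at least one known parent
children <- which(!is.na(myd$Father) | !is.na(myd$Mother))

build_trio <- function(k) {
  child <- myd[children[k], ]
  # match() gives NA for an unknown parent; indexing a data frame
  # with an NA row index yields an all-NA placeholder row
  rows <- c(match(child$Father, myd$Person),
            match(child$Mother, myd$Person),
            children[k])
  # Assumption: table order with placeholders first, as in the question
  out <- myd[sort(rows, na.last = FALSE), ]
  out$Trio <- k  # consecutive trio number
  out
}

# Stack the per-trio data frames and drop the inherited row names
trios <- do.call(rbind, lapply(seq_along(children), build_trio))
rownames(trios) <- NULL
```

Here `do.call(rbind, ...)` plays the role of the plyr call, so no extra package is needed. If the father, mother, child order matters more than table order, the `sort()` call can be dropped and `myd[rows, ]` used instead.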