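I have a data frame named mydf. The Sample column contains duplicate samples. For each unique sample, I want to extract the row with the largest total_reads, which should give the result shown below.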
mydf<-structure(list(Sample = c("AOGC-02-0188", "AOGC-02-0191", "AOGC-02-0191",
"AOGC-02-0191", "AOGC-02-0194", "AOGC-02-0194", "AOGC-02-0194"
), total_reads = c(27392583, 19206920, 34462563, 53669483, 24731988,
43419826, 68151814), Lane = c("4", "5", "4", "4;5", "5", "4",
"4;5")), .Names = c("Sample", "total_reads", "Lane"), row.names = c("166",
"169", "170", "171", "173", "174", "175"), class = "data.frame")
Result:
Sample total_reads Lane
AOGC-02-0188 27392583 4
AOGC-02-0191 53669483 4;5
AOGC-02-0194 68151814 4;5
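Best answer
You can first aggregate to find the maximum total_reads per Sample, then merge that back with mydf to recover the remaining columns: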
merge(aggregate(total_reads ~ Sample, mydf, max), mydf)
# Sample total_reads Lane
#1 AOGC-02-0188 27392583 4
#2 AOGC-02-0191 53669483 4;5
#3 AOGC-02-0194 68151814 4;5
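One thing to keep in mind: merge joins here on both Sample and total_reads, so if two rows of the same sample shared the same maximum, both would be returned. As a rough alternative sketch (not part of the original answer), sorting and then dropping repeated samples with duplicated keeps exactly one row per sample. The variable names ordered and result are chosen here for illustration only, and the data is rebuilt with data.frame so the block runs on its own.

```r
# Same sample data as in the question, rebuilt so this block is self-contained.
mydf <- data.frame(
  Sample = c("AOGC-02-0188", "AOGC-02-0191", "AOGC-02-0191", "AOGC-02-0191",
             "AOGC-02-0194", "AOGC-02-0194", "AOGC-02-0194"),
  total_reads = c(27392583, 19206920, 34462563, 53669483,
                  24731988, 43419826, 68151814),
  Lane = c("4", "5", "4", "4;5", "5", "4", "4;5"),
  stringsAsFactors = FALSE
)

# Sort by Sample, then by total_reads in descending order,
# so the largest row of each sample comes first within its group.
ordered <- mydf[order(mydf$Sample, -mydf$total_reads), ]

# duplicated() marks every occurrence of a Sample after the first;
# dropping those leaves one row (the maximum) per sample.
result <- ordered[!duplicated(ordered$Sample), ]
rownames(result) <- NULL

print(result)
```

If ties should be preserved rather than collapsed, the aggregate and merge version above is the better fit.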