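I have a data frame named mydf. The Sample column contains duplicated samples. For each unique sample, I want to extract the row with the largest total_reads and get result.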

mydf<-structure(list(Sample = c("AOGC-02-0188", "AOGC-02-0191", "AOGC-02-0191",
"AOGC-02-0191", "AOGC-02-0194", "AOGC-02-0194", "AOGC-02-0194"
), total_reads = c(27392583, 19206920, 34462563, 53669483, 24731988,
43419826, 68151814), Lane = c("4", "5", "4", "4;5", "5", "4",
"4;5")), .Names = c("Sample", "total_reads", "Lane"), row.names = c("166",
"169", "170", "171", "173", "174", "175"), class = "data.frame")

result
  Sample        total_reads  Lane
 AOGC-02-0188    27392583    4
 AOGC-02-0191    53669483  4;5
 AOGC-02-0194    68151814  4;5

Best answer

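You can aggregate first and then merge. aggregate finds the maximum total_reads for each Sample, and merge joins that result back to mydf on the columns they share (Sample and total_reads), which recovers the matching Lane: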

merge(aggregate(total_reads ~ Sample, mydf, max), mydf)
#        Sample total_reads Lane
#1 AOGC-02-0188    27392583    4
#2 AOGC-02-0191    53669483  4;5
#3 AOGC-02-0194    68151814  4;5
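
One caveat with the merge approach: if two rows of the same Sample tie for the maximum total_reads, both rows come back. Below is a minimal base R sketch of an alternative that keeps exactly one row per Sample, by sorting and then dropping duplicates. The names sorted and top_rows are made up for illustration, and the data frame is rebuilt without the original row names; on the example data it should give the same three rows as above.

```r
# Rebuild the example data from the question (original row names omitted).
mydf <- data.frame(
  Sample = c("AOGC-02-0188", "AOGC-02-0191", "AOGC-02-0191", "AOGC-02-0191",
             "AOGC-02-0194", "AOGC-02-0194", "AOGC-02-0194"),
  total_reads = c(27392583, 19206920, 34462563, 53669483,
                  24731988, 43419826, 68151814),
  Lane = c("4", "5", "4", "4;5", "5", "4", "4;5"),
  stringsAsFactors = FALSE
)

# Sort by Sample, and within each Sample by total_reads in descending order.
sorted <- mydf[order(mydf$Sample, -mydf$total_reads), ]

# duplicated() flags every occurrence of a Sample after the first one,
# so negating it keeps only the top row of each Sample.
top_rows <- sorted[!duplicated(sorted$Sample), ]

# Reset row names so they run 1..n like the merge() output.
rownames(top_rows) <- NULL
top_rows
```

When ties occur, this keeps whichever tied row order() places first, so use the merge version if you need every tied row.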
