This article covers how to combine duplicate rows in R and add a new column containing the duplicate IDs.

Problem description

I have a data frame that looks like this:

    Chr   start    stop     ref  alt  Hom/het  ID
    chr1  5179574  5183384  ref  Del  Het      719
    chr1  5179574  5184738  ref  Del  Het      915
    chr1  5179574  5184738  ref  Del  Het      951
    chr1  5336806  5358384  ref  Del  Het      376
    chr1  5347979  5358384  ref  Del  Het      228

I would like to merge any duplicate rows, combining the last ID column so that all IDs are in one row/column, like this:

    Chr   start    stop     ref  alt  Hom/het  ID
    chr1  5179574  5183384  ref  Del  Het      719
    chr1  5179574  5184738  ref  Del  Het      915, 951
    chr1  5336806  5358384  ref  Del  Het      376
    chr1  5347979  5358384  ref  Del  Het      228

I have found examples of people removing duplicates and summing a column, but I just want to combine all IDs with duplicate regions in a list in a single column.
Solution

Some call to aggregate() should do the trick. In the calls below, df[7] is the ID column to be combined and df[-7] is every other column, which together serve as the grouping keys.

Here's an option that collects the IDs in a list object:

    (df1 <- aggregate(df[7], df[-7], unique))
    #    Chr   start    stop ref alt Hom.het       ID
    # 1 chr1 5179574 5183384 ref Del     Het      719
    # 2 chr1 5179574 5184738 ref Del     Het 915, 951
    # 3 chr1 5336806 5358384 ref Del     Het      376
    # 4 chr1 5347979 5358384 ref Del     Het      228

And here's one that collects them in a character vector:

    df2 <- aggregate(df[7], df[-7], FUN = function(X) paste(unique(X), collapse=", "))

Comparing the results of the two options:

    str(df1$ID)
    # List of 4
    #  $ 0: int 719
    #  $ 3: int [1:2] 915 951
    #  $ 7: int 376
    #  $ 8: int 228

    str(df2$ID)
    #  chr [1:4] "719" "915, 951" "376" "228"
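The answer never shows how df is built, so here is a minimal reproducible sketch of both aggregate() calls. The data frame construction is an assumption: it copies the values from the question and uses the column name Hom.het, as shown in the answer's output. Selecting the ID column by name rather than by position 7 is a stylistic variation, not part of the original answer.

```r
# Rebuild the example data from the question (assumed construction;
# the original post only shows the printed data frame).
df <- data.frame(
  Chr     = rep("chr1", 5),
  start   = c(5179574L, 5179574L, 5179574L, 5336806L, 5347979L),
  stop    = c(5183384L, 5184738L, 5184738L, 5358384L, 5358384L),
  ref     = rep("ref", 5),
  alt     = rep("Del", 5),
  Hom.het = rep("Het", 5),
  ID      = c(719L, 915L, 951L, 376L, 228L),
  stringsAsFactors = FALSE
)

# Every column except ID acts as a grouping key.
keys <- df[names(df) != "ID"]

# Option 1: keep the IDs of each group as a list column.
df1 <- aggregate(df["ID"], keys, FUN = unique)

# Option 2: collapse the IDs of each group into one comma-separated string.
df2 <- aggregate(df["ID"], keys,
                 FUN = function(x) paste(unique(x), collapse = ", "))

print(df2)
```

The list column in df1 keeps the IDs as integers, which is convenient for further computation. The character column in df2 is easier to print or to write out with write.table().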