Finding all duplicated rows, including the "elements with smaller subscripts"

Problem description

R's duplicated returns a logical vector showing whether each element of a vector (or each row of a data frame) is a duplicate of an element with a smaller subscript. So if rows 3, 4, and 5 of a 5-row data frame are the same, duplicated will give me the vector

FALSE, FALSE, FALSE, TRUE, TRUE

But in this case I actually want to get

FALSE, FALSE, TRUE, TRUE, TRUE

That is, I also want to know whether a row is duplicated by a row with a larger subscript.

Recommended answer

duplicated has a fromLast argument. The "Examples" section of ?duplicated shows how to use it. Just call duplicated twice, once with fromLast=FALSE and once with fromLast=TRUE, and take the rows where either result is TRUE.

Late edit: You didn't provide a reproducible example, so here's an illustration kindly contributed by @jbaums:

vec <- c("a", "b", "c","c","c") 
vec[duplicated(vec) | duplicated(vec, fromLast=TRUE)]
## [1] "c" "c" "c"
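
The example above returns the matching elements. To get the logical vector asked for in the question, the same two calls can be combined without subsetting. A minimal sketch, assuming a hypothetical 5-row data frame (df5, with made-up columns x and y) whose rows 3, 4, and 5 are identical:

```r
# Hypothetical 5-row data frame; rows 3, 4 and 5 are identical.
df5 <- data.frame(x = c("a", "b", "c", "c", "c"),
                  y = c(1, 2, 3, 3, 3))

# Default scan: flags a row only if an identical row appears earlier.
forward <- duplicated(df5)

# Reverse scan: flags a row only if an identical row appears later.
backward <- duplicated(df5, fromLast = TRUE)

# A row belongs to a duplicate group if either scan flags it.
mask <- forward | backward
print(mask)
## [1] FALSE FALSE  TRUE  TRUE  TRUE
```

The forward scan misses the first copy in each group and the reverse scan misses the last one, so their element-wise OR covers every copy.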


And an example for the case of a data frame:

df <- data.frame(rbind(c("a","a"),c("b","b"),c("c","c"),c("c","c")))
df[duplicated(df) | duplicated(df, fromLast=TRUE), ]
##   X1 X2
## 3  c  c
## 4  c  c
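
If the pattern is needed in several places, it could be wrapped in a small helper. This is only a sketch: all_duplicated is a hypothetical name chosen here, not a base R function.

```r
# Hypothetical helper (not part of base R): TRUE for every member of a
# duplicate group, whether the matching copy comes before or after it.
all_duplicated <- function(x) {
  duplicated(x) | duplicated(x, fromLast = TRUE)
}

df <- data.frame(rbind(c("a", "a"), c("b", "b"), c("c", "c"), c("c", "c")))

# Rows that occur more than once (all copies are kept).
dup_rows <- df[all_duplicated(df), ]

# Rows that occur exactly once.
unique_rows <- df[!all_duplicated(df), ]
```

Negating the mask, as in the last line, keeps only rows that appear exactly once. That differs from unique(df), which keeps one copy of every row.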

