This post covers how to get gtools::mixedsort-style ordering, or an alternative to it, when sorting with dplyr::arrange.

Problem description

I am trying to order a data frame using dplyr::arrange. The issue is that the column I want to sort on contains a fixed string followed by a number, as generated, for instance, by the dummy code below.

  dummydf <- data.frame(values = rnorm(100), sortcol = paste0("ABC", sample(1:100, 100, replace = FALSE)))

By default, dummydf %>% arrange(sortcol) produces a data frame sorted alphabetically, character by character, which is of course not the desired result:

values sortcol
0.708081720    ABC1
0.041348322   ABC10
1.730962886  ABC100
0.423480861   ABC11
-1.545837266   ABC12
-1.345539947   ABC13
-0.078998792   ABC14
0.088712174   ABC15
0.670583024   ABC16
1.238837680   ABC17
-1.459044293   ABC18
-2.028535223   ABC19
0.779514385    ABC2
1.360509910   ABC20
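
To make the cause of this ordering concrete, here is a minimal base R sketch. The vector ids is a hypothetical toy example, not the random data from the question, and method = "radix" is chosen only so that the result does not depend on the locale.

```r
# Hypothetical toy vector, not taken from the question's random data.
ids <- c("ABC1", "ABC2", "ABC10", "ABC100")

# Character sorting compares strings position by position, so "ABC10"
# sorts before "ABC2" because "1" < "2" at the fourth character.
# method = "radix" always uses C-locale ordering for character vectors.
sorted_ids <- sort(ids, method = "radix")
print(sorted_ids)
```

Every string that starts with "ABC1" therefore lands ahead of "ABC2", which matches the output shown above.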

In this example, I would like to sort the column as gtools::mixedsort would, making sure that ABC2 directly follows ABC1 and is not preceded by ABC10 through ABC19 and ABC100. mixedsort(as.character(dummydf$sortcol)) does the trick.

Now, I am aware I could do this by using sub in my arrange argument: dummydf %>% arrange(as.numeric(sub("ABC", "", sortcol))), but that works mainly because the string prefix is fixed (although I suppose a regex could be used to capture the trailing digits after any string).
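
The regex idea mentioned above could look roughly like the following sketch. The helper name trailing_number is hypothetical, and it assumes every value ends in a run of digits. Values without trailing digits would become NA with a warning. It also ignores the prefix entirely, so it is not a full replacement for gtools::mixedsort.

```r
# Hypothetical helper: extract the trailing run of digits as a number.
# Assumes every element ends in at least one digit.
trailing_number <- function(x) {
  as.numeric(sub("^.*?([0-9]+)$", "\\1", as.character(x), perl = TRUE))
}

# Toy data standing in for the sortcol column.
ids <- c("ABC10", "ABC2", "ABC100", "ABC1")

# Base R equivalent of arranging by the extracted number.
sorted_ids <- ids[order(trailing_number(ids))]
print(sorted_ids)
```

With dplyr loaded, the same key could presumably be passed to arrange as dummydf %>% arrange(trailing_number(sortcol)).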

I am just wondering: is there a more "elegant" and generic way to get this done with dplyr::arrange, in the same fashion as gtools::mixedsort?

Kind regards,

FM

Recommended answer

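Here is a functional solution that uses the identity order(order(x)) == rank(x), which holds exactly when x has no ties (with ties, it matches rank(x, ties.method = "first")). arrange sorts by the values of the expression it is given, so it needs a rank for each row rather than the permutation that gtools::mixedorder returns: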

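library(dplyr)  # provides the %>% pipe used below
# order() of the permutation from mixedorder() gives each element's mixed-sort rank
mixedrank <- function(x) order(gtools::mixedorder(x))
dummydf %>% dplyr::arrange(mixedrank(sortcol))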
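
The identity behind mixedrank can be checked in base R alone. The sketch below uses a hypothetical tie-free vector x and plain order() in place of gtools::mixedorder, so it only illustrates the permutation-to-rank step and not mixed sorting itself.

```r
# Hypothetical key vector without ties.
x <- c(30, 10, 20)

perm <- order(x)     # permutation that sorts x: 2 3 1
rnk  <- order(perm)  # inverse permutation, i.e. the rank of each element: 3 1 2

# For a vector without ties, order(order(x)) equals rank(x).
same <- all(rnk == rank(x))

# Sorting rows by the rank vector gives the same row order as indexing by perm.
df <- data.frame(x = x)
by_rank <- df[order(rnk), , drop = FALSE]
print(by_rank$x)
```

This is why a function like mixedrank can be passed to arrange: the rank vector is an ordinary sort key with one value per row, whereas the raw output of an order-style function is a set of row positions.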


10-29 02:37