Using multiple columns in a dplyr window function?

Problem description

Coming from SQL, I would expect to be able to do something like the following in dplyr. Is this possible?

    # R
    tbl %>% mutate(n = dense_rank(Name, Email))

    -- SQL
    SELECT Name, Email, DENSE_RANK() OVER (ORDER BY Name, Email) AS n FROM tbl

Also, is there an equivalent of PARTITION BY?

Recommended answer

I struggled with this problem too, and here is my solution.

If you can't find a function that supports ordering by multiple variables, I suggest concatenating the variables with paste(), in priority order from left to right.

Here is a code sample:

    tbl %>%
      mutate(n = dense_rank(paste(Name, Email))) %>%
      arrange(Name, Email) %>%
      view()

Moreover, I guess group_by is the equivalent of PARTITION BY in SQL.

The shortcoming of this solution is that all of the variables (two or more) must be ordered in the same direction. If you need to order by multiple columns in different directions, say one ascending and one descending, I suggest you try this: Calculate rank with ties based on more than one variable
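For the mixed-direction case, one possible approach is sketched below: rank each column on its own with dense_rank() (wrapping a column in desc() for descending order), fold the per-column ranks into a single numeric key, and rank that key. The sample data, the Dept column, and the names name_rank, email_rank, ranked, and partitioned are made up for illustration, and this is not necessarily the method from the linked answer. The sketch also avoids a limitation of paste(), which compares the combined values as text, so a numeric column would be ordered lexically ("10" before "9").

```r
library(dplyr)

# Hypothetical sample data; Name and Email follow the question,
# Dept is an invented column used only to illustrate partitioning.
tbl <- tibble(
  Dept  = c("A", "A", "B", "B", "B"),
  Name  = c("Bob", "Ann", "Ann", "Bob", "Ann"),
  Email = c("b@x.org", "a@x.org", "z@x.org", "b@x.org", "a@x.org")
)

# Roughly: DENSE_RANK() OVER (ORDER BY Name ASC, Email DESC)
# Assumes neither column contains missing values.
ranked <- tbl %>%
  mutate(
    name_rank  = dense_rank(Name),         # ascending
    email_rank = dense_rank(desc(Email)),  # descending
    # Name has priority: scale its rank so it always dominates email_rank.
    n = dense_rank(name_rank * (max(email_rank) + 1) + email_rank)
  )

ranked$n
# 3 2 1 3 2

# Roughly: DENSE_RANK() OVER (PARTITION BY Dept ORDER BY Name ASC, Email DESC)
# group_by() makes every ranking call restart within each Dept.
partitioned <- tbl %>%
  group_by(Dept) %>%
  mutate(
    name_rank  = dense_rank(Name),
    email_rank = dense_rank(desc(Email)),
    n = dense_rank(name_rank * (max(email_rank) + 1) + email_rank)
  ) %>%
  ungroup()

partitioned$n
# 2 1 1 3 2
```

In the first result, identical (Name, Email) pairs share a rank, and within a name the later email sorts first. In the second, the ranks restart for each Dept, which is consistent with the guess that group_by plays the role of PARTITION BY.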
09-02 17:29