Is pandas.DataFrame.groupby guaranteed to be stable?

Question

I've noticed that there are several uses of pd.DataFrame.groupby followed by an apply that implicitly assume groupby is stable: that is, if a and b are instances of the same group and, before grouping, a appeared before b, then a will also appear before b after grouping.
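To make the property concrete, here is a minimal sketch of the behaviour being asked about. The DataFrame and its column names `key` and `val` are made up for illustration; the snippet only shows what is observed on this small input and is not itself a guarantee.

```python
import pandas as pd

# Hypothetical data: rows of groups "a" and "b" are interleaved.
df = pd.DataFrame({
    "key": ["b", "a", "b", "a", "b"],
    "val": [1, 2, 3, 4, 5],
})

# Collect each group's values in the order apply receives them.
per_group = df.groupby("key")["val"].apply(list).to_dict()
print(per_group)
```

Within each group, the values come out in the same relative order as in `df`, which is the "stable" behaviour the question refers to.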

I think several answers clearly rely on this implicitly but, to be concrete, here is one using groupby+cumsum.
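A hypothetical example of this kind of groupby+cumsum usage (not the code from the referenced answer; the data and column names are assumptions) might look like this. The running total is only meaningful if rows within a group are visited in their original order.

```python
import pandas as pd

# Hypothetical data with interleaved groups.
df = pd.DataFrame({
    "key": ["b", "a", "b", "a", "b"],
    "val": [1, 2, 3, 4, 5],
})

# Per-group running total, written back next to the original rows.
df["running"] = df.groupby("key")["val"].cumsum()
print(df)
```

If the rows of group "b" were processed as 5, 3, 1 instead of 1, 3, 5, the running totals would differ, which is why such code depends on order preservation.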

Is there anything actually promising this behavior? The documentation only states:

Also, since pandas has indices, the same functionality could theoretically be achieved without this guarantee (albeit in a more cumbersome way).
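One way the "more cumbersome" index-based route could look is sketched below, assuming the index encodes the intended row order. The helper name `ordered_values` and the shuffled input are hypothetical; the point is only that sorting each group by index makes the result independent of the order in which groupby delivers rows.

```python
import pandas as pd

df = pd.DataFrame({
    "key": ["b", "a", "b", "a", "b"],
    "val": [1, 2, 3, 4, 5],
})

# Deliberately scramble the row order; the index labels 0..4 still
# record where each row originally sat.
shuffled = df.iloc[[4, 1, 0, 3, 2]]

def ordered_values(s):
    # Hypothetical helper: re-sort the group by its index labels
    # before using the values, instead of trusting the delivery order.
    return s.sort_index().tolist()

explicit = shuffled.groupby("key")["val"].apply(ordered_values).to_dict()
print(explicit)
```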

Solution

Although the docs don't state this, internally it uses a stable sort when generating the groups.
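As an illustration of what a stable sort on the group keys implies (a sketch using NumPy's `argsort`, not the actual pandas internals), equal keys keep their original relative positions:

```python
import numpy as np

# Hypothetical group keys, in original row order.
keys = np.array(["b", "a", "b", "a", "b"])

# A stable sort never reorders elements that compare equal.
order = np.argsort(keys, kind="stable")
print(order)        # row positions, grouped by key
print(keys[order])  # keys after sorting
```

Here the positions of the "a" rows (1, 3) and of the "b" rows (0, 2, 4) each stay in ascending order, so rows within a group remain in their original sequence.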

See:

As I mentioned in the comments, this is important if you consider transform, which returns a Series with its index aligned to the original df. If the sorting didn't preserve the order, alignment would have to do additional work, as the Series would need to be sorted prior to assigning. In fact, this is mentioned in the comments:
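As an illustration of the transform alignment described above (a sketch with made-up data and index labels, not an excerpt from the pandas source), the result carries the same index as the original frame and can be assigned back directly:

```python
import pandas as pd

# Hypothetical data with a non-default index to make alignment visible.
df = pd.DataFrame(
    {"key": ["b", "a", "b", "a", "b"], "val": [1, 2, 3, 4, 5]},
    index=[10, 11, 12, 13, 14],
)

# transform broadcasts each group's sum back to that group's rows.
group_sum = df.groupby("key")["val"].transform("sum")
df["group_sum"] = group_sum
print(df)
```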

