Problem description
I have the following column:
column
0 10
1 10
2 8
3 8
4 6
5 6
My goal is to count the unique values (3 in this case) and create a new column that looks like the following:
new_column
0 3
1 3
2 2
3 2
4 1
5 1
The numbering starts from the number of unique values (3), and the same number is repeated while the current row equals the previous row in the original column. The number decreases by one each time the row value changes. Every unique value in the original column has the same number of rows (2 rows per unique value in this case).
My solution was to group by the original column and build a new list like this:
i = 1
new_time = []
for j, v in df.groupby('column'):
    new_time.append([i] * 2)
    i = i + 1
I would then flatten the list and sort it in descending order. Is there a simpler solution?
Thanks.
Recommended answer
Use GroupBy.ngroup with ascending=False:
df.groupby('column', sort=False).ngroup(ascending=False)+1
0 3
1 3
2 2
3 2
4 1
5 1
dtype: int64
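For reference, here is a minimal self-contained sketch of the one-liner above, applied to the data from the question. It assumes the DataFrame is called df and that the result should be stored in a column named new_column, as in the question; adjust both names to your own data.

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({'column': [10, 10, 8, 8, 6, 6]})

# sort=False numbers the groups in order of first appearance
# (10 -> 0, 8 -> 1, 6 -> 2); ascending=False reverses that numbering,
# and adding 1 makes it start at 1 instead of 0.
df['new_column'] = df.groupby('column', sort=False).ngroup(ascending=False) + 1

print(df['new_column'].tolist())  # [3, 3, 2, 2, 1, 1]
```

Note that sort=False matters here: without it, groups are numbered by sorted key rather than by order of appearance, which only gives the same result when the column happens to be in descending order.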
For a DataFrame that looks like this,
df = pd.DataFrame({'column': [10, 10, 8, 8, 10, 10]})
. . . where only consecutive equal values should be grouped together, you'll need to modify your grouper:
(df.groupby(df['column'].ne(df['column'].shift()).cumsum(), sort=False)
   .ngroup(ascending=False)
   .add(1))
0 3
1 3
2 2
3 2
4 1
5 1
dtype: int64
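To see why the modified grouper is needed, the sketch below compares both approaches on the same data. The variable names by_value, run_id and by_run are illustrative only and do not come from the original answer.

```python
import pandas as pd

df = pd.DataFrame({'column': [10, 10, 8, 8, 10, 10]})

# Grouping by value alone merges both runs of 10 into a single group.
by_value = df.groupby('column', sort=False).ngroup(ascending=False) + 1

# A new run starts wherever a value differs from the previous row;
# the cumulative sum of those change flags gives each run its own key.
run_id = df['column'].ne(df['column'].shift()).cumsum()
by_run = df.groupby(run_id, sort=False).ngroup(ascending=False) + 1

print(by_value.tolist())  # [2, 2, 1, 1, 2, 2]
print(by_run.tolist())    # [3, 3, 2, 2, 1, 1]
```

The first result treats the trailing 10s as the same group as the leading ones, while the run-based key keeps them apart.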