Reordering a Pandas Series by Day of the Week


Question

Using pandas, I have read in a CSV file and then created a Series from the data to find out which days of the week have the most crashes:

crashes_by_day = bc['DAY_OF_WEEK'].value_counts()

I then plotted it, but of course the bars appear in the same ranked order as the Series, that is, sorted by count:

crashes_by_day.plot(kind='bar')

What is the most efficient way to reorder these as Mon, Tue, Wed, Thu, Fri, Sat, Sun?

Do I have to break it out into a list? Thanks.

Recommended answer

You can convert the column to an ordered Categorical and then call sort_index:

import pandas as pd

# bc is the DataFrame read from the CSV file
print(bc)
   DAY_OF_WEEK    a    b
0       Sunday  0.7  0.5
1       Monday  0.4  0.1
2      Tuesday  0.3  0.2
3    Wednesday  0.4  0.1
4     Thursday  0.3  0.6
5       Friday  0.4  0.9
6     Saturday  0.3  0.2
7       Sunday  0.7  0.5
8       Monday  0.4  0.1
9      Tuesday  0.3  0.2
10   Wednesday  0.4  0.1
11    Thursday  0.3  0.6
12      Friday  0.4  0.9
13    Saturday  0.3  0.2
14      Sunday  0.7  0.5
15      Monday  0.4  0.1
16     Tuesday  0.3  0.2
17   Wednesday  0.4  0.1
18    Thursday  0.3  0.6
19      Friday  0.4  0.9
20    Saturday  0.3  0.2
# Make the column an ordered categorical so that it sorts Monday through Sunday
bc['DAY_OF_WEEK'] = pd.Categorical(bc['DAY_OF_WEEK'], categories=
    ['Monday','Tuesday','Wednesday','Thursday','Friday','Saturday', 'Sunday'],
    ordered=True)

print(bc['DAY_OF_WEEK'])
0        Sunday
1        Monday
2       Tuesday
3     Wednesday
4      Thursday
5        Friday
6      Saturday
7        Sunday
8        Monday
9       Tuesday
10    Wednesday
11     Thursday
12       Friday
13     Saturday
14       Sunday
15       Monday
16      Tuesday
17    Wednesday
18     Thursday
19       Friday
20     Saturday
Name: DAY_OF_WEEK, dtype: category
Categories (7, object): [Monday < Tuesday < Wednesday < Thursday < Friday < Saturday < Sunday]
# Count crashes per day, then sort by the categorical order instead of by count
crashes_by_day = bc['DAY_OF_WEEK'].value_counts()
crashes_by_day = crashes_by_day.sort_index()
print(crashes_by_day)
Monday       3
Tuesday      3
Wednesday    3
Thursday     3
Friday       3
Saturday     3
Sunday       3
dtype: int64

crashes_by_day.plot(kind='bar')
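
The Categorical approach above can be condensed into a self-contained Python 3 sketch. The sample rows below are made up for illustration and stand in for the CSV from the question; the name DAYS is also an assumption, not part of the original answer.

```python
import pandas as pd

# Desired weekday order (hypothetical helper name).
DAYS = ['Monday', 'Tuesday', 'Wednesday', 'Thursday',
        'Friday', 'Saturday', 'Sunday']

# Made-up sample data standing in for the CSV from the question.
bc = pd.DataFrame({'DAY_OF_WEEK': [
    'Sunday', 'Monday', 'Friday', 'Monday', 'Sunday',
    'Sunday', 'Tuesday', 'Wednesday', 'Thursday', 'Saturday',
]})

# Declare the weekday order once, on the column itself.
bc['DAY_OF_WEEK'] = pd.Categorical(bc['DAY_OF_WEEK'],
                                   categories=DAYS, ordered=True)

# value_counts() ranks by frequency; sort_index() then follows the
# category order, giving Monday through Sunday.
crashes_by_day = bc['DAY_OF_WEEK'].value_counts().sort_index()
print(crashes_by_day)
```

Calling crashes_by_day.plot(kind='bar') on this result should then draw the bars in weekday order, because the plot follows the order of the index.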

Another possible solution, without Categorical, is to define the sort order through a mapping:

# Turn the counts into a two-column DataFrame
crashes_by_day = bc['DAY_OF_WEEK'].value_counts().reset_index()
crashes_by_day.columns = ['DAY_OF_WEEK', 'count']
print(crashes_by_day)
  DAY_OF_WEEK  count
0    Thursday      3
1   Wednesday      3
2      Friday      3
3     Tuesday      3
4      Monday      3
5    Saturday      3
6      Sunday      3

# Map each day name to its position in the week and use that as the sort key
days = ['Monday','Tuesday','Wednesday','Thursday','Friday','Saturday', 'Sunday']
mapping = {day: i for i, day in enumerate(days)}
key = crashes_by_day['DAY_OF_WEEK'].map(mapping)
print(key)
0    3
1    2
2    4
3    1
4    0
5    5
6    6
Name: DAY_OF_WEEK, dtype: int64

# Reorder the rows by the sort key, then index by day name for plotting
crashes_by_day = crashes_by_day.iloc[key.argsort()].set_index('DAY_OF_WEEK')
print(crashes_by_day)
             count
DAY_OF_WEEK
Monday           3
Tuesday          3
Wednesday        3
Thursday         3
Friday           3
Saturday         3
Sunday           3

crashes_by_day.plot(kind='bar')
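
The same mapping idea can probably be written more compactly by passing the mapping as a sort key. This is a sketch rather than the original answer's method: it assumes pandas 1.1 or later, where sort_index accepts a key argument, and it uses made-up sample data in place of the CSV.

```python
import pandas as pd

days = ['Monday', 'Tuesday', 'Wednesday', 'Thursday',
        'Friday', 'Saturday', 'Sunday']
# Position of each day in the week, e.g. 'Monday' -> 0, 'Sunday' -> 6.
mapping = {day: i for i, day in enumerate(days)}

# Made-up sample data standing in for the CSV from the question.
bc = pd.DataFrame({'DAY_OF_WEEK': [
    'Sunday', 'Monday', 'Friday', 'Monday', 'Sunday',
    'Sunday', 'Tuesday', 'Wednesday', 'Thursday', 'Saturday',
]})

crashes_by_day = bc['DAY_OF_WEEK'].value_counts()

# The key function receives the index of day names and returns the
# matching week positions; rows are sorted by those positions.
crashes_by_day = crashes_by_day.sort_index(key=lambda idx: idx.map(mapping))
print(crashes_by_day)
```

Unlike the Categorical version, this leaves the column as plain strings, which may be preferable when the weekday order is only needed for a single plot.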

