Problem description
Using pandas, I have read in a CSV file and then created a Series from the data to find out which days of the week have the most crashes:
crashes_by_day = bc['DAY_OF_WEEK'].value_counts()
I then plotted it, but of course the bars appear in the same ranked order as the Series, because value_counts sorts by count:
crashes_by_day.plot(kind='bar')
What is the most efficient way to reorder these as Mon, Tue, Wed, Thu, Fri, Sat, Sun?
Do I have to break it out into a list? Thanks.
Recommended answer
You can convert the column to an ordered Categorical and then call sort_index:
print(bc)
DAY_OF_WEEK a b
0 Sunday 0.7 0.5
1 Monday 0.4 0.1
2 Tuesday 0.3 0.2
3 Wednesday 0.4 0.1
4 Thursday 0.3 0.6
5 Friday 0.4 0.9
6 Saturday 0.3 0.2
7 Sunday 0.7 0.5
8 Monday 0.4 0.1
9 Tuesday 0.3 0.2
10 Wednesday 0.4 0.1
11 Thursday 0.3 0.6
12 Friday 0.4 0.9
13 Saturday 0.3 0.2
14 Sunday 0.7 0.5
15 Monday 0.4 0.1
16 Tuesday 0.3 0.2
17 Wednesday 0.4 0.1
18 Thursday 0.3 0.6
19 Friday 0.4 0.9
20 Saturday 0.3 0.2
# Make the column an ordered categorical so that sorting follows weekday order
bc['DAY_OF_WEEK'] = pd.Categorical(
    bc['DAY_OF_WEEK'],
    categories=['Monday', 'Tuesday', 'Wednesday', 'Thursday', 'Friday', 'Saturday', 'Sunday'],
    ordered=True)
print(bc['DAY_OF_WEEK'])
0 Sunday
1 Monday
2 Tuesday
3 Wednesday
4 Thursday
5 Friday
6 Saturday
7 Sunday
8 Monday
9 Tuesday
10 Wednesday
11 Thursday
12 Friday
13 Saturday
14 Sunday
15 Monday
16 Tuesday
17 Wednesday
18 Thursday
19 Friday
20 Saturday
Name: DAY_OF_WEEK, dtype: category
Categories (7, object): [Monday < Tuesday < Wednesday < Thursday < Friday < Saturday < Sunday]
crashes_by_day = bc['DAY_OF_WEEK'].value_counts()
crashes_by_day = crashes_by_day.sort_index()
print(crashes_by_day)
Monday 3
Tuesday 3
Wednesday 3
Thursday 3
Friday 3
Saturday 3
Sunday 3
dtype: int64
crashes_by_day.plot(kind='bar')
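For reference, the steps above can be condensed into one self-contained sketch. The small `bc` frame here is made-up sample data standing in for the CSV from the question; only the `DAY_OF_WEEK` column name comes from the original post.

```python
import pandas as pd

days = ['Monday', 'Tuesday', 'Wednesday', 'Thursday',
        'Friday', 'Saturday', 'Sunday']

# Hypothetical sample data; the real frame would come from pd.read_csv(...)
bc = pd.DataFrame({'DAY_OF_WEEK': ['Sunday', 'Monday', 'Friday',
                                   'Monday', 'Sunday', 'Sunday']})

# Ordered categorical: the category order defines the sort order
bc['DAY_OF_WEEK'] = pd.Categorical(bc['DAY_OF_WEEK'],
                                   categories=days, ordered=True)

# value_counts ranks by count; sort_index restores Monday..Sunday order
crashes_by_day = bc['DAY_OF_WEEK'].value_counts().sort_index()
print(crashes_by_day)
```

Note that with a categorical column, value_counts also reports days that never occur (with a count of 0), so the plot keeps all seven bars.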
Another possible solution, without Categorical, is to define the sort order with a mapping:
crashes_by_day = bc['DAY_OF_WEEK'].value_counts().reset_index()
crashes_by_day.columns = ['DAY_OF_WEEK', 'count']
print(crashes_by_day)
DAY_OF_WEEK count
0 Thursday 3
1 Wednesday 3
2 Friday 3
3 Tuesday 3
4 Monday 3
5 Saturday 3
6 Sunday 3
days = ['Monday','Tuesday','Wednesday','Thursday','Friday','Saturday', 'Sunday']
mapping = {day: i for i, day in enumerate(days)}
key = crashes_by_day['DAY_OF_WEEK'].map(mapping)
print(key)
0 3
1 2
2 4
3 1
4 0
5 5
6 6
Name: DAY_OF_WEEK, dtype: int64
crashes_by_day = crashes_by_day.iloc[key.argsort()].set_index('DAY_OF_WEEK')
print(crashes_by_day)
count
DAY_OF_WEEK
Monday 3
Tuesday 3
Wednesday 3
Thursday 3
Friday 3
Saturday 3
Sunday 3
crashes_by_day.plot(kind='bar')
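A shorter alternative, not part of the original answer, is to reindex the counts against the weekday list directly. This is only a sketch on made-up sample data; `days` and the small `bc` frame are assumptions for illustration.

```python
import pandas as pd

days = ['Monday', 'Tuesday', 'Wednesday', 'Thursday',
        'Friday', 'Saturday', 'Sunday']

# Hypothetical sample data standing in for the CSV from the question
bc = pd.DataFrame({'DAY_OF_WEEK': ['Sunday', 'Monday', 'Friday',
                                   'Monday', 'Sunday', 'Sunday']})

counts = bc['DAY_OF_WEEK'].value_counts()

# reindex returns the rows in the order given by `days`;
# fill_value=0 covers days that do not appear in the data
crashes_by_day = counts.reindex(days, fill_value=0)
print(crashes_by_day)
```

The result can be passed to `crashes_by_day.plot(kind='bar')` as before. This avoids changing the column's dtype, at the cost of repeating the weekday list wherever the order is needed.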