I have a dataset (dataset1) that looks like this:
Date Company Weekday
2015-01-01 Company1 Monday
2015-01-02 Company1 Tuesday
2015-01-03 Company1 Wednesday
2015-01-04 Company1 Thursday
2015-12-09 Company2 Monday
2015-12-10 Company2 Tuesday
………………………………………………………………………
2016-01-08 Company3 Wednesday
2016-01-09 Company3 Thursday
Then I apply the following code:
dataset2 = dataset1.groupby(['Company','Weekday']).size().sort_values(ascending=False)
After applying the code above, I get the following result:
Index 0
('Company1', Monday) 80
('Company1', Tuesday) 80
('Company1', Wednesday) 79
………………………………………………………………….
('Company3', Tuesday) 34
I am trying to isolate all entries of dataset2 whose count is greater than 50, but I run into various errors when I try the following:
dataset2=dataset2.loc[dataset2[0]>50]
Can anyone offer suggestions?
Best answer
You are working with a Series, so you need:
dataset2 = dataset1.groupby(['Company','Weekday']).size().sort_values(ascending=False)
dataset2 = dataset2[dataset2 > 50]
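To see why the boolean mask works directly on the result, here is a minimal sketch on made-up data. The sample rows and the threshold of 1 are assumptions chosen only to keep the example small; they are not the asker's real dataset. The point is that groupby(...).size() returns a Series indexed by (Company, Weekday), so there is no column named 0 to select, and the comparison is applied to the Series itself.

```python
import pandas as pd

# Hypothetical sample data (not the asker's real dataset).
dataset1 = pd.DataFrame({
    "Company": ["Company1"] * 3 + ["Company2"] * 2 + ["Company3"],
    "Weekday": ["Monday", "Monday", "Tuesday", "Monday", "Monday", "Tuesday"],
})

# size() yields a Series with a (Company, Weekday) MultiIndex.
counts = (dataset1.groupby(["Company", "Weekday"])
                  .size()
                  .sort_values(ascending=False))

# Compare the Series itself to the threshold; the boolean mask keeps matching rows.
threshold = 1  # assumed value for this tiny sample; the question uses 50
filtered = counts[counts > threshold]
print(filtered)
```

The same mask can also be written as counts.loc[counts > threshold]; both forms select by the boolean Series.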
Another solution is to chain Series.reset_index with the parameter name to get a DataFrame, and then filter by the column count:
dataset2 = (dataset1.groupby(['Company','Weekday'])
                    .size()
                    .sort_values(ascending=False)
                    .reset_index(name='count'))
dataset2 = dataset2[dataset2['count'] > 50]
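As a rough illustration of what the reset_index variant produces, the sketch below runs it on the same kind of made-up data (the sample rows and the threshold of 1 are assumptions, not values from the question). The group keys become ordinary columns and the sizes land in a column called count, which is what makes dataset2['count'] a valid selection.

```python
import pandas as pd

# Hypothetical sample data (not the asker's real dataset).
dataset1 = pd.DataFrame({
    "Company": ["Company1"] * 3 + ["Company2"] * 2 + ["Company3"],
    "Weekday": ["Monday", "Monday", "Tuesday", "Monday", "Monday", "Tuesday"],
})

# reset_index(name='count') turns the Series into a DataFrame
# with columns Company, Weekday and count.
counts_df = (dataset1.groupby(["Company", "Weekday"])
                     .size()
                     .sort_values(ascending=False)
                     .reset_index(name="count"))

threshold = 1  # assumed value for this tiny sample; the question uses 50
filtered_df = counts_df[counts_df["count"] > threshold]
print(filtered_df)
```

This form may be more convenient when the filtered result is going to be merged, exported, or plotted, since Company and Weekday are regular columns rather than index levels.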
Source: "python - Using pandas to count value frequencies by date, part 2", a similar question on Stack Overflow: https://stackoverflow.com/questions/53387122/