How to group data by season with pandas
Question
I have a CSV file containing 4 years of data, and I need to group the data by season across those 4 years. Here is a look at my data:
timestamp,heure,lat,lon,impact,type
2006-01-01 00:00:00,13:58:43,33.837,-9.205,10.3,1
2006-01-02 00:00:00,00:07:28,34.5293,-10.2384,17.7,1
2007-02-01 00:00:00,23:01:03,35.0617,-1.435,-17.1,2
2007-02-02 00:00:00,01:14:29,36.5685,0.9043,36.8,1
2008-01-01 00:00:00,05:03:51,34.1919,-12.5061,-48.9,1
2008-01-02 00:00:00,05:03:51,34.1919,-12.5061,-48.9,1
....
2011-12-31 00:00:00,05:03:51,34.1919,-12.5061,-48.9,1
And here is my desired output:
winter (the mean value of impacts)
summer (the mean value of impacts)
autumn ....
spring .....
So I am expecting 4 rows, one per season, each summarizing all the months in that season. I started as follows:
data['impact'] = data['impact'].abs()
yearly = data.groupby(data.index.month)['impact'].mean()
Any ideas?
Recommended answer
With exact dates
import pandas as pd

# Day-of-year ranges for each season (boundaries fall around the 20th of March, June, September and December)
spring = range(80, 172)
summer = range(172, 264)
fall = range(264, 355)

def season(x):
    # Map a day-of-year number to a season name; anything outside the three ranges is winter
    if x in spring:
        return 'Spring'
    if x in summer:
        return 'Summer'
    if x in fall:
        return 'Fall'
    return 'Winter'

# Sample frame: one row per day of 2016 (a leap year, hence 366 rows)
df = pd.DataFrame({'_date': pd.date_range(start='2016-01-01', end='2016-12-31', freq='D'), 'impact': range(366)})
df['SEASON'] = df['_date'].dt.dayofyear.apply(season)
df.groupby('SEASON')['impact'].mean()
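If whole calendar months are precise enough, the same grouping can be sketched with a month-to-season lookup instead of day-of-year ranges. This is not part of the answer above, only a minimal illustration. It assumes meteorological seasons (December to February as winter, and so on), and it assumes the frame is indexed by timestamp, as the `data.index.month` call in the question suggests. The name `MONTH_TO_SEASON` is made up for this example, and the frame reuses the answer's synthetic 2016 data rather than the real CSV.

```python
import pandas as pd

# Synthetic frame indexed by date: one row per day of 2016, as in the answer above.
dates = pd.date_range(start='2016-01-01', end='2016-12-31', freq='D')
data = pd.DataFrame({'impact': range(366)}, index=dates)

# The question takes absolute values before averaging.
data['impact'] = data['impact'].abs()

# Assumption: meteorological seasons, i.e. whole calendar months.
MONTH_TO_SEASON = {
    12: 'Winter', 1: 'Winter', 2: 'Winter',
    3: 'Spring', 4: 'Spring', 5: 'Spring',
    6: 'Summer', 7: 'Summer', 8: 'Summer',
    9: 'Fall', 10: 'Fall', 11: 'Fall',
}

# Translate each row's month number into a season label, then average per label.
data['season'] = data.index.month.map(MONTH_TO_SEASON)
seasonal_mean = data.groupby('season')['impact'].mean()
print(seasonal_mean)
```

Because the grouping key ignores the year, rows from every year in a multi-year file end up in the same four groups, which gives the four-row summary the question asks for.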