This article covers how to use groupby and subplots with a pandas DataFrame.

Problem description


I have a dataframe object with time series data in multiple columns (see below). I am trying to make a graphic with subplots for each of the columns in the dataframe where each subplot has 12 boxplots, one for each month.

I have previously used the following code to make subplots from a dataframe (but for bar plots, not boxplots):

import matplotlib.pyplot as plt
import pandas as pd

labels = df.columns.values
fig, axes = plt.subplots(nrows=3, ncols=4, gridspec_kw=dict(hspace=0.3), figsize=(12, 9), sharex=True, sharey=True)
targets = zip(labels, axes.flatten())
# One bar chart per column, each drawn on its own axes
for i, (col, ax) in enumerate(targets):
    pd.DataFrame(df[col]).plot(kind='bar', ax=ax, color='green')

But it does not work as-is when I use a groupby object in place of the dataframe:

grouped = df.groupby(df.index.month)
labels = df.columns.values
fig, axes = plt.subplots(nrows = 3, ncols = 4)
targets = zip(labels, axes.flatten())
for i, (col,ax) in enumerate(targets):
    grouped[col].boxplot(ax=ax, color = 'green', subplots =False)

The problem is that boxplot cannot be called on a 'SeriesGroupBy'.

But even if I use df.plot.box(by=df.index.month) or df.boxplot(by=df.index.month) directly in the plot loop (instead of creating the grouped object separately first), the grouping doesn't seem to be recognized.

Does anyone have suggestions? Thanks!

EDIT: Example data:

               res01      res02      res03     res04      res05      res06
1981-01-31 -16.571927  -4.051575  -8.865433 -0.858423  41.831455 -14.569453   
1981-02-28 -14.672908  -2.004894  -6.151469 -0.448101 -30.476155 -13.572198   
1981-03-31 -10.588504  -1.079251  -3.057215 -0.897639 -19.407469  -6.936018   
1981-04-30 -18.132814  -1.438858   0.028866  0.388591 -24.435158  -8.880159   
1981-05-31  -8.190266  -2.175105  -4.326701 -1.089722 -13.286928 -13.530322   
1981-06-30  -7.857190  -2.861348  -5.046409 -0.013585 -17.134277 -18.153491   
1981-07-31  -0.882391  -4.497572  -9.914211 -1.115400 -27.628329 -33.412025   
1981-08-31  12.876021  -4.969259 -11.849937 -1.205588 -29.825922 -36.093600   
1981-09-30 -43.434015  -8.681070 -14.143496 -4.701924 -32.357578 -25.945754   
1981-10-31  38.656449   3.055204   3.088694  1.425666  12.881002  -7.261655   
1981-11-30  -3.455937  -2.136963  -4.393510  0.472263  10.560834 -11.224297   
1981-12-31  -2.923868  -2.006733  -1.667986 -0.460742  -8.663085 -12.022059   
1982-01-31  19.625548  -2.127550  -4.044511 -0.447382  27.524403  -8.551865   
1982-02-28 -12.424200  -1.931246  -6.055349 -0.448398 -29.979264 -13.166926   
1982-03-31  35.249772  -2.416680  -6.029210 -0.661215 -47.206552 -24.267880   
1982-04-30 -55.008877  -7.160744  -9.331341 -1.040474 -42.029073 -32.618620   
1982-05-31 -17.349030  -3.067463  -6.511664 -0.892260 -40.803273 -29.355429   
1982-06-30  -5.710025  -2.519162 -15.885825 -1.664557 -36.476341 -43.840351   
1982-07-31 -30.790685  -8.042895 -12.381517 -1.339010 -38.542642 -53.612233   
1982-08-31   4.263036   1.270455 -13.225027 -1.431894 -29.160338 -36.575128   
1982-09-30 -17.206044 -14.336086 -13.276423 -1.316164 -32.316961 -43.796818   
1982-10-31  -5.164960  -6.247522 -12.369959 -1.045498  12.716187 -29.489328   
1982-11-30 -25.543948  -2.648465  -5.598642 -0.554379  12.033847 -12.507718   
1982-12-31  -2.971802  -1.982072  -1.225803 -0.335575  -7.452425 -10.182204   
1983-01-31  29.917477  -3.224031  -7.680435 -0.701457  43.068696 -11.812835   
1983-02-28   4.998955  -3.281333 -12.630952 -0.867328 -47.758882 -30.902821   
1983-03-31 -21.483914  -3.219957  -7.321552 -0.756839 -50.798885 -29.858194   
1983-04-30 -23.288018  -2.411159  -5.212307 -0.626141 -49.477692 -22.813129   
1983-05-31   0.317828  -3.181573  -6.915676 -0.855810 -21.701865 -23.165239   
1983-06-30 -23.914567  -7.788987 -18.696691 -2.082176 -35.968441 -50.015002   
1983-07-31 -21.452370  -6.447321 -14.399266 -1.514856 -35.645412 -49.081801   
1983-08-31 -14.721837  -7.266818 -14.439923 -1.499819 -47.237557 -52.978016   
1983-09-30 -18.532760  -3.905781  -7.398113 -0.729630 -16.512127 -23.390976   
1983-10-31  62.864704  -5.903833 -13.910222 -1.143347  21.336868 -26.468803   
1983-11-30 -11.050188  -5.180171 -12.654286 -1.186503  24.885744 -22.581720   
1983-12-31  -9.576725  -6.114298  -7.761357 -1.048323 -23.590444 -37.646843   
Solution

AFAIK, if you group your DF you have to either apply some aggregate (reducing) function or call .groups, which returns a dict with the group keys and the corresponding index labels for each key.

So if you just want to plot the 12 monthly boxplots, IIUC you may try to do it this way (using the seaborn module):

ax = sns.boxplot(data=df, x=df.index.month, y='res01')

With subplots:

import matplotlib.pyplot as plt
import seaborn as sns

labels = df.columns.values
fig, axes = plt.subplots(nrows=3, ncols=4, gridspec_kw=dict(hspace=0.3), figsize=(12, 9), sharex=True, sharey=True)
targets = zip(labels, axes.flatten())
# Draw the 12 monthly boxplots for each column on its own axes
for i, (col, ax) in enumerate(targets):
    sns.boxplot(data=df, ax=ax, color='green', x=df.index.month, y=col)
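
If seaborn is not available, roughly the same figure can probably be built with pandas and matplotlib alone. The sketch below is not part of the original answer: it swaps sns.boxplot for matplotlib's Axes.boxplot, and it uses made-up random data with month-start dates as a stand-in for the question's dataframe. It also shows what .groups contains after grouping by month. The 2x3 grid is an assumption chosen to match the six example columns.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend, so the sketch runs without a display
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

# Made-up stand-in for the question's data: 36 monthly rows, six columns.
rng = np.random.default_rng(0)
index = pd.date_range("1981-01-01", periods=36, freq="MS")
columns = [f"res{i:02d}" for i in range(1, 7)]
df = pd.DataFrame(rng.normal(size=(36, 6)), index=index, columns=columns)

# Grouping by calendar month gives 12 groups; .groups maps each
# month number to the index labels that fall in that month.
grouped = df.groupby(df.index.month)
month_keys = sorted(grouped.groups)

fig, axes = plt.subplots(nrows=2, ncols=3, figsize=(12, 6), sharex=True, sharey=True)
for col, ax in zip(df.columns, axes.flatten()):
    # One array of values per month, in month order.
    monthly_values = [grouped[col].get_group(m).to_numpy() for m in month_keys]
    ax.boxplot(monthly_values)  # boxes are drawn at x positions 1..12
    ax.set_xticks(range(1, 13))
    ax.set_xticklabels([str(m) for m in month_keys])
    ax.set_title(col)
```

Pulling each month out with get_group works around the fact that boxplot cannot be called on the SeriesGroupBy itself: matplotlib only needs a list of arrays, one per box.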

PS: I'm not sure, though, that I understood your goal correctly.
