Problem description

I'm trying to plot a time series of histograms in Python. There has been a similar question about this, but in R. So, basically, I need the same thing, but I'm really bad at R. There are usually 48 values per day in my dataset, and -9999 represents missing data. Here's a sample of the data.

I started by reading in the data and constructing a pandas DataFrame.

import pandas as pd
df = pd.read_csv('sample.csv', parse_dates=True, index_col=0, na_values='-9999')
print(df)

<class 'pandas.core.frame.DataFrame'>
DatetimeIndex: 336 entries, 2008-07-25 14:00:00 to 2008-08-01 13:30:00
Data columns (total 1 columns):
159.487691046    330  non-null values
dtypes: float64(1)
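
The column label 159.487691046 in the summary above looks like a data value, which suggests that sample.csv has no header row and that its first reading was consumed as the header. Assuming that is the case, a minimal sketch of a read that keeps every row could look like this. The inline CSV text and the column names `timestamp` and `value` are made up for illustration and are not taken from the original file.

```python
import io

import pandas as pd

# Hypothetical file contents: timestamp, value, no header row; -9999 marks a gap.
csv_text = """2008-07-25 14:00:00,159.487691046
2008-07-25 14:30:00,-9999
2008-07-25 15:00:00,161.25
"""

df = pd.read_csv(
    io.StringIO(csv_text),   # stand-in for 'sample.csv'
    header=None,             # assumption: the file has no header line
    names=["timestamp", "value"],
    index_col="timestamp",
    parse_dates=True,        # parse the index column as datetimes
    na_values="-9999",       # treat the sentinel as missing (NaN)
)
print(df)
```

With `header=None` the first reading stays in the data instead of becoming the column name.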

Now I can group the data by day:

daily = df.groupby(lambda x: x.date())
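
As an alternative to the lambda, the grouping can be written with the index's own `normalize()` method, which floors each timestamp to midnight. The sketch below runs on made-up half-hourly data (the column name `value` and the random readings are assumptions, not the asker's dataset) and checks how many non-missing readings each day holds.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for sample.csv: 48 half-hourly readings per day for 3 days.
index = pd.date_range("2008-07-25 00:00", periods=144, freq="30min")
rng = np.random.default_rng(0)
values = rng.uniform(0, 360, size=len(index))
values[::25] = np.nan  # pretend a few readings were -9999 and parsed as NaN
df = pd.DataFrame({"value": values}, index=index)

# Group by calendar day: normalize() sets the time part of each timestamp to 00:00.
daily = df.groupby(df.index.normalize())

# count() skips NaN, so this is the number of usable readings per day.
counts = daily["value"].count()
print(counts)
```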

But then I'm stuck. I don't know how to use this with matplotlib to get my time series of histograms. Any help is appreciated, not necessarily using pandas.

Recommended answer

Compute a histogram for each day and plot the results with matplotlib's pcolor.

We need to bin the groups uniformly, so we make bins manually based on the range of your sample data.

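In [25]: import numpy as np
    ...: from pandas import Series
    ...: from matplotlib.pyplot import pcolor

In [26]: bins = np.linspace(0, 360, 10)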

Apply the histogram to each group.

In [27]: f = lambda x: Series(np.histogram(x, bins=bins)[0], index=bins[:-1])

In [28]: df1 = daily.apply(f)

In [29]: df1
Out[29]:
              0   40   80  120  160  200  240  280  320
2008-07-25    0    0    0    3   18    0    0    0    0
2008-07-26    2    0    0    0   17    6   13    1    8
2008-07-27    4    3   10    0    0    0    0    0   31
2008-07-28    0    7   15    0    0    0    0    6   20
2008-07-29    0    0    0    0    0    0   20   26    0
2008-07-30   10    1    0    0    0    0    1   25    9
2008-07-31   30    4    1    0    0    0    0    0   12
2008-08-01    0    0    0    0    0    0    0   14   14
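
For readers who want to reproduce a table like the one above outside an interactive session, here is a self-contained sketch of the same idea. It builds the per-day counts with an explicit loop over the groups rather than `apply`, and it drops NaN before binning. The data are synthetic (evenly spaced values, one missing reading) and the column name `value` is an assumption.

```python
import numpy as np
import pandas as pd

# Hypothetical data: two days of 48 half-hourly readings, evenly spread over 0..352.5.
index = pd.date_range("2008-07-25", periods=96, freq="30min")
values = np.tile(np.arange(0.0, 360.0, 7.5), 2)
values[60] = np.nan  # one missing reading on the second day
df = pd.DataFrame({"value": values}, index=index)

bins = np.linspace(0, 360, 10)  # 10 edges give 9 bins, each 40 wide
daily = df.groupby(lambda x: x.date())

# One histogram per day; NaN is dropped first so only real readings are counted.
counts = {
    day: np.histogram(group["value"].dropna(), bins=bins)[0]
    for day, group in daily
}

# Rows are days, columns are the left edge of each bin.
df1 = pd.DataFrame(counts, index=bins[:-1]).T
print(df1)
```

Because every day is binned with the same edges, the rows are directly comparable, and each row sums to the number of non-missing readings for that day.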

Following your linked example in R, the horizontal axis should be dates, and the vertical axis should be the range of bins. The histogram counts are drawn as a "heat map."

In [30]: pcolor(df1.T)
Out[30]: <matplotlib.collections.PolyCollection at 0xbb60e2c>

It remains to label the axes. This answer should be of some help.
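
One possible way to label the axes is sketched below. It is not the approach from the linked answer. It uses `pcolormesh` (a close relative of `pcolor`) on a made-up counts table shaped like `df1`, puts date labels at the cell centres, and puts bin edges on the grid lines. The random counts, the axis titles, and the `Agg` backend choice are all assumptions for illustration.

```python
import matplotlib

matplotlib.use("Agg")  # headless backend so the sketch runs without a display

import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

# Hypothetical counts table shaped like df1: rows are days, columns are bin left edges.
bins = np.linspace(0, 360, 10)
days = pd.date_range("2008-07-25", periods=3, freq="D")
rng = np.random.default_rng(0)
df1 = pd.DataFrame(rng.integers(0, 30, size=(3, 9)), index=days, columns=bins[:-1])

fig, ax = plt.subplots()
# After transposing, rows are bins (vertical axis) and columns are days (horizontal axis).
mesh = ax.pcolormesh(df1.T.to_numpy())

# Cell centres sit at 0.5, 1.5, ...; bin edges sit on the integer grid lines.
ax.set_xticks(np.arange(len(df1.index)) + 0.5)
ax.set_xticklabels([d.strftime("%Y-%m-%d") for d in df1.index], rotation=45, ha="right")
ax.set_yticks(np.arange(len(bins)))
ax.set_yticklabels([f"{edge:g}" for edge in bins])
ax.set_xlabel("Date")
ax.set_ylabel("Value bin")
fig.colorbar(mesh, ax=ax, label="Count per day")
fig.tight_layout()
```

Calling `fig.savefig(...)` writes the figure to a file; in an interactive session, drop the `matplotlib.use("Agg")` line and call `plt.show()` instead.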
