Averaging pandas time series using .groupby()

Hi, I have some continuous x/y coordinates from a behavioural experiment that I would like to average within groups using pandas. I'm using a subset of the data here.

    data
    Out[11]:
    <class 'pandas.core.frame.DataFrame'>
    Int64Index: 2036 entries, 0 to 1623
    Data columns (total 9 columns):
    id               2036  non-null values
    subject          2036  non-null values
    code             2036  non-null values
    acc              2036  non-null values
    nx               2036  non-null values
    ny               2036  non-null values
    rx               2036  non-null values
    ry               2036  non-null values
    reaction_time    2036  non-null values
    dtypes: bool(1), int64(3), object(5)

nx and ny hold a series of TimeSeries objects, all of which have the same indices.

    data.nx.iloc[0]
    Out[16]:
    0     0
    1     0
    2     0
    3     0
    4     0
    5     0
    6     0
    7     0
    8     0
    9     0
    10    0
    11    0
    12    0
    13    0
    14    0
    ...
    86     1.019901
    87     1.010000
    88     1.010000
    89     1.005921
    90     1.000000
    91     1.000000
    92     1.000000
    93     1.000000
    94     1.000000
    95     1.000000
    96     1.000000
    97     1.000000
    98     1.000000
    99     1.000000
    100    1.000000
    Length: 101, dtype: float64

These TimeSeries columns can be averaged normally, using data.nx.mean(), and behave as expected, but I hit trouble when I try to group the data: the grouped means silently leave out nx, ny, rx and ry.

    grouped = data.groupby(['code', 'acc'])
    means = grouped.mean()
    print means

                           id          subject  reaction_time
    code   acc
    group1 False  1570.866667  47474992.333333    1506.000000
           True   1337.076152  46022403.623246    1322.116232
    group2 False  1338.180180  48730402.045045    1289.112613
           True   1382.631757  42713592.628378    1294.952703
    group3 False  1488.587156  43202477.623853    1349.568807
           True   1310.415233  47054310.498771    1341.837838
    group4 False  1339.682540  52530349.936508    1540.714286
           True   1343.261176  44606616.407059    1362.174118

Strangely, I can force them to average the TimeSeries data by looping over the groups, and may have to fall back on hacking it this way:

    for name, group in grouped:
        print group.nx.mean()

    0    0.000000
    1    0.000000
    2    0.000000
    3    0.000000
    4    0.000000
    5    0.000667
    6    0.000683
    7    0.001952
    8    0.002000
    9    0.002000
    {etc, 101 values for 6 groups}

Finally, if I try to force the GroupBy object to average them, I get the following:

    grouped.nx.mean()
    ---------------------------------------------------------------------------
    DataError                                 Traceback (most recent call last)
    <ipython-input-25-0b536a966e02> in <module>()
    ----> 1 grouped.nx.mean()

    /usr/local/lib/python2.7/dist-packages/pandas-0.12.0-py2.7-linux-i686.egg/pandas/core/groupby.pyc in mean(self)
        357         """
        358         try:
    --> 359             return self._cython_agg_general('mean')
        360         except GroupByError:
        361             raise

    /usr/local/lib/python2.7/dist-packages/pandas-0.12.0-py2.7-linux-i686.egg/pandas/core/groupby.pyc in _cython_agg_general(self, how, numeric_only)
        462
        463         if len(output) == 0:
    --> 464             raise DataError('No numeric types to aggregate')
        465
        466         return self._wrap_aggregated_output(output, names)

    DataError: No numeric types to aggregate

Has anyone any ideas?

Solution

A Series where each entry is itself a Series is not idiomatic. I think "No numeric types to aggregate" is telling you that pandas is trying to take the average of a list of Series (not the average of the numeric data they contain), which is not defined.

You should organize your data so that nx and ny contain actual numbers. It might be simplest to keep nx, ny (and, I think, rx and ry) in a separate DataFrame, where each column corresponds to one id.
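As a rough illustration of the layout the answer suggests, here is a minimal sketch on synthetic data. Only the column names `id`, `code`, `acc` and `nx` come from the question; the values and the names `trajectories`, `meta` and `group_means` are made up for the example, and it assumes each `id` is unique. The trajectories are stored as plain numbers in a frame with one column per id, and the grouping keys are looked up by id when averaging.

```python
import pandas as pd

# Hypothetical trial metadata, indexed by trial id (values are invented).
meta = pd.DataFrame(
    {
        "code": ["group1", "group1", "group2", "group2"],
        "acc": [True, True, False, False],
    },
    index=pd.Index([1, 2, 3, 4], name="id"),
)

# One short trajectory per id. With the question's frame, a similar mapping
# could presumably be built with dict(zip(data["id"], data["nx"])).
trajectories = {
    1: pd.Series([0.0, 0.5, 1.0]),
    2: pd.Series([0.0, 0.3, 1.0]),
    3: pd.Series([0.0, 0.2, 0.8]),
    4: pd.Series([0.0, 0.4, 1.0]),
}

# Purely numeric frame: rows are time steps, each column corresponds to one id.
nx = pd.DataFrame(trajectories)

# Transpose so each row is one trial, then group the rows by the keys held in
# `meta`. pandas aligns the key Series with the rows through the shared id index.
group_means = nx.T.groupby([meta["code"], meta["acc"]]).mean()
print(group_means)
```

Each row of `group_means` is the mean trajectory for one (code, acc) combination. The same pattern should carry over to ny, rx and ry, each kept in its own numeric frame.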