Problem description
I have an asymmetric (ragged) 2D array in numpy, in that some of the inner arrays are longer than others, such as: [[1, 2], [1, 2, 3], ...]
But numpy doesn't seem to like this:
import numpy as np
foo = np.array([[1], [1, 2]])
foo.mean(axis=1)
Trace:
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/home/tom/.virtualenvs/nlp/lib/python3.5/site-packages/numpy/core/_methods.py", line 56, in _mean
    rcount = _count_reduce_items(arr, axis)
  File "/home/tom/.virtualenvs/nlp/lib/python3.5/site-packages/numpy/core/_methods.py", line 50, in _count_reduce_items
    items *= arr.shape[ax]
IndexError: tuple index out of range
Is there a nice way to do this or should I just do the maths myself?
Recommended answer
We could use an almost vectorized approach based on np.add.reduceat, which takes care of the irregular-length subarrays whose averages we are calculating. After np.concatenate gives a flattened 1D version of the input array, np.add.reduceat sums the elements over those intervals of irregular length. Finally, we divide the sums by the lengths of the subarrays to get the average values.
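To see what np.add.reduceat does with a set of start indices, here is a small sketch on a hand-made flat array. The values and offsets are made up purely for illustration.

```python
import numpy as np

# Flattened data for three hypothetical sub-arrays: [1], [1, 2], [4, 5, 6]
vals = np.array([1, 1, 2, 4, 5, 6])
# Index in vals where each sub-array starts
starts = np.array([0, 1, 3])

# reduceat sums vals[0:1], vals[1:3] and vals[3:], one segment per start index
sums = np.add.reduceat(vals, starts)
print(sums.tolist())  # [1, 3, 15]
```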
Thus, the implementation would look something like this:
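# Length of each sub-array; list(...) is needed on Python 3, where map returns an iterator
lens = np.array(list(map(len, foo)))  # Thanks to @Kasramvd on this!
# 1D flattened version of the input
vals = np.concatenate(foo)
# Start index of each sub-array within vals
shift_idx = np.append(0, lens[:-1].cumsum())
# Per-sub-array sums divided by the sub-array lengths
out = np.add.reduceat(vals, shift_idx) / lens.astype(float)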
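Wrapped up as a self-contained function, the same idea might look like the sketch below. The name ragged_mean is made up for this example, and it assumes the input is a plain list of non-empty lists (np.add.reduceat does not return 0 for an empty segment, so empty rows would give wrong results).

```python
import numpy as np

def ragged_mean(rows):
    """Mean of each sub-list in a ragged list of lists (hypothetical helper)."""
    # Assumption: every sub-list has at least one element
    lens = np.array([len(r) for r in rows])
    vals = np.concatenate(rows)                  # 1D flattened data
    starts = np.append(0, lens[:-1].cumsum())    # start offset of each row in vals
    return np.add.reduceat(vals, starts) / lens.astype(float)

foo = [[1], [1, 2], [4, 5, 6]]
means = ragged_mean(foo)
print(means.tolist())  # [1.0, 1.5, 5.0]

# Cross-check against a plain Python loop
assert np.allclose(means, [sum(r) / len(r) for r in foo])
```

For a handful of short rows the plain loop in the last line is probably just as good; the reduceat version mainly pays off when there are many rows.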