I have a matrix in the following format:
matrix = np.array([[1, 2, 3, np.nan],
                   [1, np.nan, 3, 4],
                   [np.nan, 2, 3, np.nan]])
and coefficients that I want to selectively multiply elementwise with the matrix:
coefficients = np.array([[0.5, np.nan, 0.2, 0.3],
                         [0.3, 0.3, 0.2, np.nan],
                         [np.nan, 0.2, 0.1, np.nan]])
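In this case, I want the first row of matrix to be multiplied by the second row of coefficients, while the second row of matrix is multiplied by the first row of coefficients. In short, I want to select the row of coefficients that matches each row of matrix based on the positions of the np.nan values. The positions of the np.nan values differ for every row of coefficients, because each row describes the coefficients for a different data-availability case. Is there a quick way to do this without writing if statements for all possible cases?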
Best answer
Approach #1
A quick way is to use NumPy broadcasting -
# Masks of NaNs
mask1 = np.isnan(matrix)
mask2 = np.isnan(coefficients)
# Compare each row of mask1 against every row of mask2, leading to a
# 3D array. Look for all-matching ones along the last axis: these show
# the row matches between the two input arrays, matrix and coefficients.
# Then find the corresponding indices, which give the pairs of matching
# rows between those two arrays
r,c = np.nonzero((mask1[:,None] == mask2).all(-1))
# Index into the arrays with those indices and multiply elementwise
out = matrix[r] * coefficients[c]
Output for the given sample data -
In [40]: out
Out[40]:
array([[ 0.3, 0.6, 0.6, nan],
[ 0.5, nan, 0.6, 1.2],
[ nan, 0.4, 0.3, nan]])
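For reference, the steps above can be assembled into a self-contained sketch. The function name multiply_by_nan_pattern is hypothetical (it does not appear in the original answer), and the explicit check reflects an assumption that every row of matrix has exactly one row in coefficients with the same NaN pattern.

```python
import numpy as np

def multiply_by_nan_pattern(matrix, coefficients):
    """Multiply each row of `matrix` by the row of `coefficients`
    whose NaN positions are identical (hypothetical helper name)."""
    mask1 = np.isnan(matrix)        # shape (m, n)
    mask2 = np.isnan(coefficients)  # shape (k, n)
    # (m, 1, n) == (k, n) broadcasts to (m, k, n); all(-1) reduces to (m, k)
    match = (mask1[:, None] == mask2).all(-1)
    # Assumption: exactly one matching coefficient row per matrix row
    if not (match.sum(axis=1) == 1).all():
        raise ValueError("each matrix row needs exactly one matching NaN pattern")
    r, c = np.nonzero(match)
    return matrix[r] * coefficients[c]

matrix = np.array([[1, 2, 3, np.nan],
                   [1, np.nan, 3, 4],
                   [np.nan, 2, 3, np.nan]])
coefficients = np.array([[0.5, np.nan, 0.2, 0.3],
                         [0.3, 0.3, 0.2, np.nan],
                         [np.nan, 0.2, 0.1, np.nan]])
out = multiply_by_nan_pattern(matrix, coefficients)
```

The intermediate comparison array has shape (m, k, n), so memory use grows with the product of the two row counts; for large inputs that may be the main cost of this approach.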
Approach #2
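For performance, reduce each row of the NaN masks to its decimal equivalent (reading the row as a binary number). Then create a storage array indexed by those decimal values, store the rows of matrix in it, and multiply them in place by the rows of coefficients at their own decimal indices -
# Weight of each column when a mask row is read as a binary number
R = 2**np.arange(matrix.shape[1])
# Decimal code of the NaN pattern in each row
idx1 = mask1.dot(R)
idx2 = mask2.dot(R)
# Storage array with one slot per code up to the largest one in matrix.
# Assumes both arrays hold the same set of NaN patterns, each occurring once
A = np.empty((idx1.max()+1, matrix.shape[1]))
A[idx1] = matrix
# Multiply each stored row by the coefficients row with the same code
A[idx2] *= coefficients
out = A[idx1]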
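As a possible variation on the same encoding idea (not part of the original answer), the coefficients alone can be stored in a lookup table and fetched per row of matrix. This is a minimal sketch: the names weights and table are illustrative, and the table is sized for all 2**n possible patterns, which is only practical for a small number of columns.

```python
import numpy as np

matrix = np.array([[1, 2, 3, np.nan],
                   [1, np.nan, 3, 4],
                   [np.nan, 2, 3, np.nan]])
coefficients = np.array([[0.5, np.nan, 0.2, 0.3],
                         [0.3, 0.3, 0.2, np.nan],
                         [np.nan, 0.2, 0.1, np.nan]])

n_cols = matrix.shape[1]
# Encode each row's NaN pattern as an integer: column j contributes 2**j
weights = 2 ** np.arange(n_cols)
idx1 = np.isnan(matrix).dot(weights)        # codes for matrix rows
idx2 = np.isnan(coefficients).dot(weights)  # codes for coefficient rows

# Lookup table with one slot per possible pattern; unused slots stay NaN
table = np.full((2 ** n_cols, n_cols), np.nan)
table[idx2] = coefficients

# Fetch the matching coefficient row for every matrix row and multiply
out = matrix * table[idx1]
```

Because the rows of matrix are never written into the table, several rows of matrix may share one NaN pattern. A row whose pattern has no entry in coefficients comes out as all NaN rather than being left unmultiplied.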
Regarding python - multiplication by pattern matching, we found a similar question on Stack Overflow: https://stackoverflow.com/questions/43921338/