python - Return a subset of a NumPy array based on the first element of each row

I am trying to get a subset x of a given NumPy array alist such that the first element of each row is in the list r.

>>> import numpy
>>> alist = numpy.array([(0, 2), (0, 4), (1, 3), (1, 4), (2, 1), (3, 1), (3, 2), (4, 1), (4, 3), (4, 2)])
>>> alist
array([[0, 2],
       [0, 4],
       [1, 3],
       [1, 4],
       [2, 1],
       [3, 1],
       [3, 2],
       [4, 1],
       [4, 3],
       [4, 2]])
>>> r = [1,3]
>>> x = alist[where first element of each row is in r]  # pseudocode: this is the part I need to figure out
>>> x
array([[1, 3],
       [1, 4],
       [3, 1],
       [3, 2]])

Is there a simple way to do this in Python (without loops, since I have a large dataset)?

Best answer

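Slice the first column out of the input array (that is, take the first element of each row), then pass it to np.in1d with r as the second argument to build a boolean mask of the valid rows, and finally index the rows of the array with that mask to select them. Here np assumes NumPy was imported as import numpy as np.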

So the implementation would look like this:

alist[np.in1d(alist[:,0],r)]

Sample run:

In [258]: alist   # Input array
Out[258]:
array([[0, 2],
       [0, 4],
       [1, 3],
       [1, 4],
       [2, 1],
       [3, 1],
       [3, 2],
       [4, 1],
       [4, 3],
       [4, 2]])

In [259]: r  # Input list to be searched for
Out[259]: [1, 3]

In [260]: np.in1d(alist[:,0],r) # Mask of valid rows
Out[260]: array([False, False,  True,  True, False,  True,  True,
                        False, False, False], dtype=bool)

In [261]: alist[np.in1d(alist[:,0],r)] # Index and select for final o/p
Out[261]:
array([[1, 3],
       [1, 4],
       [3, 1],
       [3, 2]])
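
For anyone who wants to run this end to end, here is a minimal self-contained sketch of the same mask-and-index approach. It swaps in np.isin, which newer NumPy releases recommend over np.in1d (np.in1d is deprecated as of NumPy 2.0); that substitution and the variable name mask are my additions, not part of the original answer.

```python
import numpy as np

alist = np.array([(0, 2), (0, 4), (1, 3), (1, 4), (2, 1),
                  (3, 1), (3, 2), (4, 1), (4, 3), (4, 2)])
r = [1, 3]

# Boolean mask: True where the value in the first column appears in r.
mask = np.isin(alist[:, 0], r)

# Boolean indexing keeps only the rows where the mask is True.
x = alist[mask]
print(x)
```

The membership test runs inside NumPy rather than in a Python-level loop, so it should be a reasonable fit for large arrays.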

For more on python - returning a subset of a NumPy array based on the first element of each row, see the similar question on Stack Overflow: https://stackoverflow.com/questions/41241675/