Question
Basically, I'm doing some data analysis. I read in a dataset as a numpy.ndarray, and some of the values are missing (either absent entirely, NaN, or the string "NA").
I want to remove every row that contains any such entry. How do I do that with a numpy ndarray?
Recommended answer
>>> import numpy as np
>>> a = np.array([[1, 2, 3], [4, 5, np.nan], [7, 8, 9]])
>>> a
array([[ 1.,  2.,  3.],
       [ 4.,  5., nan],
       [ 7.,  8.,  9.]])
>>> a[~np.isnan(a).any(axis=1)]
array([[1., 2., 3.],
       [7., 8., 9.]])
Then reassign the result to a, for example a = a[~np.isnan(a).any(axis=1)].
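The NaN check above covers only one of the three cases in the question. np.isnan raises a TypeError on an array of strings, so "NA" markers need a different test. Here is a minimal sketch, assuming the data was read as an array of strings and that missing entries show up as "NA" or as an empty string. The array raw and the list of markers are made up for illustration.

```python
import numpy as np

# Hypothetical input: every cell is a string, as it might be after
# reading a text file without type conversion.
raw = np.array([["1", "2", "3"],
                ["4", "NA", "6"],
                ["7", "8", ""]])

# Assumed markers for a missing value; adjust to match the real data.
missing_markers = ["NA", ""]

# True wherever a cell equals one of the markers.
missing = np.isin(raw, missing_markers)

# Keep rows with no missing cell, then convert what is left to floats.
clean = raw[~missing.any(axis=1)].astype(float)
print(clean)
```

Filtering before the conversion matters here, because astype(float) would fail on the "NA" and empty strings.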
Explanation: np.isnan(a) returns an array of the same shape as a, with True where the value is NaN and False elsewhere. .any(axis=1) reduces this m×n array to a one-dimensional array of length m by applying a logical or across each row, ~ inverts True and False, and a[ ] selects only those rows of the original array for which the expression inside the brackets is True.
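To make each step visible, the one-liner can be split into named intermediate arrays. This is only an illustrative breakdown of the same expression. The variable names nan_mask, rows_with_nan, keep and cleaned are not part of the original answer.

```python
import numpy as np

a = np.array([[1, 2, 3], [4, 5, np.nan], [7, 8, 9]])

# Step 1: element-wise test, same shape as a (3 x 3).
nan_mask = np.isnan(a)

# Step 2: logical "or" across each row, giving one boolean per row.
rows_with_nan = nan_mask.any(axis=1)

# Step 3: invert, so True marks the rows to keep.
keep = ~rows_with_nan

# Step 4: boolean indexing selects the rows where keep is True.
cleaned = a[keep]
print(cleaned)
```

For this input, rows_with_nan is True only for the second row, so cleaned holds the first and third rows.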