This article covers how to find the longest run of consecutive NaN values in a NumPy array.
Problem description
What is the best way to find the maximum number of consecutive repeated NaN values in a NumPy array?
Example:
from numpy import nan
Input 1: [nan, nan, nan, 0.16, 1, 0.16, 0.9999, 0.0001, 0.16, 0.101, nan, 0.16]
Output 1: 3
Input 2: [nan, nan, 2, 1, 1, nan, nan, nan, nan, 0.101, nan, 0.16]
Output 2: 4
Recommended answer
Here is one approach:
import numpy as np

def max_repeatedNaNs(a):
    # Mask of NaNs, padded with False at both ends so that every NaN run
    # has a detectable start edge and stop edge
    mask = np.concatenate(([False], np.isnan(a), [False]))
    if not mask.any():
        return 0
    else:
        # Count of NaNs in each NaN group. Then, get max count as o/p.
        c = np.flatnonzero(mask[1:] < mask[:-1]) - \
            np.flatnonzero(mask[1:] > mask[:-1])
        return c.max()
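To see why the padded mask works, it can help to trace the intermediate arrays on Input 2. The sketch below is illustrative only; the names `starts`, `stops` and `lengths` are introduced here and are not part of the original answer.

```python
import numpy as np

a = np.array([np.nan, np.nan, 2, 1, 1, np.nan, np.nan, np.nan, np.nan,
              0.101, np.nan, 0.16])

# Pad with False so runs touching either end of the array still have edges
mask = np.concatenate(([False], np.isnan(a), [False]))

# Rising edges (False -> True) mark where each NaN run begins
starts = np.flatnonzero(mask[1:] > mask[:-1])
# Falling edges (True -> False) mark the index just past each NaN run
stops = np.flatnonzero(mask[1:] < mask[:-1])

# Each run length is stop minus start; the answer is the largest one
lengths = stops - starts
print(starts, stops, lengths, lengths.max())
```

For this input the runs start at indices 0, 5 and 10 and have lengths 2, 4 and 1, so the maximum is 4.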
Here is an improved version:
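def max_repeatedNaNs_v2(a):
    mask = np.concatenate(([False], np.isnan(a), [False]))
    if not mask.any():
        return 0
    else:
        # Edges alternate start, stop, start, stop, ... so a single pass
        # over the transitions is enough
        idx = np.nonzero(mask[1:] != mask[:-1])[0]
        return (idx[1::2] - idx[::2]).max()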
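A quick sanity check against the two example inputs from the question might look like the following. The function body is repeated so the snippet runs on its own; the variable names `x1`, `x2` and the NaN-free test case are assumptions added for illustration.

```python
import numpy as np
from numpy import nan

def max_repeatedNaNs_v2(a):
    mask = np.concatenate(([False], np.isnan(a), [False]))
    if not mask.any():
        return 0
    # Transitions alternate between run starts and run stops
    idx = np.nonzero(mask[1:] != mask[:-1])[0]
    return (idx[1::2] - idx[::2]).max()

x1 = np.array([nan, nan, nan, 0.16, 1, 0.16, 0.9999, 0.0001, 0.16, 0.101, nan, 0.16])
x2 = np.array([nan, nan, 2, 1, 1, nan, nan, nan, nan, 0.101, nan, 0.16])

r1 = max_repeatedNaNs_v2(x1)                   # expected 3 per the question
r2 = max_repeatedNaNs_v2(x2)                   # expected 4 per the question
r3 = max_repeatedNaNs_v2(np.array([1.0, 2.0])) # no NaNs: early return of 0
print(r1, r2, r3)
```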
Benchmarking in response to @pltrdy's comment:
In [77]: a = np.random.rand(10000)
In [78]: a[np.random.choice(range(len(a)),size=1000,replace=0)] = np.nan
In [79]: %timeit contiguous_NaN(a) #@pltrdy's solution
100 loops, best of 3: 15.8 ms per loop
In [80]: %timeit max_repeatedNaNs(a)
10000 loops, best of 3: 103 µs per loop
In [81]: %timeit max_repeatedNaNs_v2(a)
10000 loops, best of 3: 86.4 µs per loop
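Outside IPython, a similar comparison can be run with the standard `timeit` module. This is a rough sketch under stated assumptions: the seed, the repetition count and the variable names `t1` and `t2` are chosen here for illustration, @pltrdy's `contiguous_NaN` is left out because its source is not shown in this answer, and timings will differ from machine to machine.

```python
import timeit
import numpy as np

def max_repeatedNaNs(a):
    mask = np.concatenate(([False], np.isnan(a), [False]))
    if not mask.any():
        return 0
    c = np.flatnonzero(mask[1:] < mask[:-1]) - \
        np.flatnonzero(mask[1:] > mask[:-1])
    return c.max()

def max_repeatedNaNs_v2(a):
    mask = np.concatenate(([False], np.isnan(a), [False]))
    if not mask.any():
        return 0
    idx = np.nonzero(mask[1:] != mask[:-1])[0]
    return (idx[1::2] - idx[::2]).max()

# Same setup as the session above: 10000 values, 1000 of them set to NaN
rng = np.random.default_rng(0)  # seed is an arbitrary assumption
a = rng.random(10000)
a[rng.choice(a.size, size=1000, replace=False)] = np.nan

n = 200  # repetitions per measurement (assumed; raise for steadier numbers)
t1 = timeit.timeit(lambda: max_repeatedNaNs(a), number=n) / n
t2 = timeit.timeit(lambda: max_repeatedNaNs_v2(a), number=n) / n
print(f"max_repeatedNaNs:    {t1 * 1e6:.1f} µs per call")
print(f"max_repeatedNaNs_v2: {t2 * 1e6:.1f} µs per call")
```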