It seems that in Pandas you can do either of the following:

age_is_null = pd.isnull(titanic_survival["age"])
age_is_null = titanic_survival["age"].isnull()


Both seem to exist: a function in the Pandas module and a method on the DataFrame class (in another module).

Coming from an Obj-C background, I find this confusing. Why are both needed?

Best answer

pd.isnull works on many kinds of input (a scalar or any array-like, not only Pandas objects), for example:

>>> import pandas as pd
>>> import numpy as np
>>> pd.isnull(np.array([1, 2]))
array([False, False], dtype=bool)
>>> pd.isnull([1, 2])
array([False, False], dtype=bool)


df.isnull is a method bound to a DataFrame object. You would therefore use pd.isnull whenever constructing a DataFrame first would be costly.
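As a minimal sketch of the difference, the snippet below calls the module-level function on a scalar, a plain list, and a Series, and compares it with the method form. The `age` Series is hypothetical sample data standing in for `titanic_survival["age"]`, which is not shown in the question.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data standing in for titanic_survival["age"].
age = pd.Series([22.0, np.nan, 38.0], name="age")

# Module-level function: accepts scalars and array-likes,
# so no pandas object has to be built first.
scalar_result = pd.isnull(np.nan)              # a single boolean
list_result = pd.isnull([1.0, None, np.nan])   # NumPy boolean array

# Method form: only available on an existing Series or DataFrame.
method_result = age.isnull()

# On a Series, both spellings produce the same boolean Series.
function_result = pd.isnull(age)
print(function_result.equals(method_result))
```

If the data already lives in a Series or DataFrame, the two spellings are interchangeable and the choice is mostly a matter of style.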

Timings:

In [30]: %timeit pd.isnull([1,2])
The slowest run took 8.93 times longer than the fastest. This could mean that an intermediate result is being cached.
100000 loops, best of 3: 9.19 µs per loop

In [33]: %timeit pd.DataFrame([1,2]).isnull()
The slowest run took 6.42 times longer than the fastest. This could mean that an intermediate result is being cached.
1000 loops, best of 3: 202 µs per loop
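Outside IPython, a comparable measurement could be taken with the standard-library `timeit` module. This is only a rough sketch: the repetition count `n` is an arbitrary choice, and the absolute numbers will vary with the machine and the installed pandas version.

```python
import timeit

import pandas as pd

data = [1, 2]
n = 1000  # arbitrary number of repetitions, chosen for illustration

# Time the module-level function applied directly to the list.
t_function = timeit.timeit(lambda: pd.isnull(data), number=n)

# Time building a DataFrame first and then calling the method.
t_method = timeit.timeit(lambda: pd.DataFrame(data).isnull(), number=n)

print(f"pd.isnull(data):             {t_function / n * 1e6:.1f} µs per call")
print(f"pd.DataFrame(data).isnull(): {t_method / n * 1e6:.1f} µs per call")
```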
