Problem description
I'm trying to create a derived column based on two conditions being met for values in existing columns. One of the conditions is that the value in one of the columns cannot be a missing datetime value (NaT). I keep receiving an AttributeError saying that the Timestamp object has no attribute isnull, and I cannot work out how to fix it.
I checked that my conditional statement was correct by filtering my DataFrame on the conditions that I'm trying to include, and that was successful.
Here is a sample of what my df contains:
I'm choosing to create a function that I can apply using df.apply() because this is a data cleaning process I'll be doing regularly.
I'm trying to create a new field titled "case_start_time" with the following conditions:
Code used in the function:
def case_start(df):
    if df[(df['procedure_type_zc'] == 'Infusion') & (df['line_start_time'].isnull() )]:
        return df['check_in']
    else:
        return 'Undefined'
And when applying this function to df to create a new field:
df['case_start_time'] = df.apply(case_start, axis = 1)
I receive the following error:
AttributeError: ("'Timestamp' object has no attribute 'isnull'", 'occurred at index 0')
These are the dtypes for the values in my df:
csn int64
line_start_time datetime64[ns]
procedure_type_zc object
dtype: object
After doing some research, I found that I can apply .isnull() to a datetime value in pandas, which is why I'm not sure how to resolve the error.
This is the code that I used to filter the DataFrame for both conditions:
missing_line_time = sample_df[ (sample_df['procedure_type_zc'] == 'Infusion') & (sample_df['line_start_time'].isnull()) ]
Based on the image I attached with the sample_df, this logic is correct.
Recommended answer
I was running into a similar problem. This worked for me:
Instead of using:
(sample_df['line_start_time'].isnull())
Use:
(sample_df['line_start_time'] is pd.NaT)
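One caveat worth illustrating: an identity check against pd.NaT is a test on a single value, whereas .isnull() on a column produces one boolean per row. The sketch below contrasts the two on a hypothetical two-element column (the values are invented for illustration), and also shows pd.isnull() as a scalar-friendly check.

```python
import pandas as pd

# Hypothetical column: one real timestamp and one missing value (NaT).
line_start_time = pd.Series(pd.to_datetime(["2019-01-01 08:00", None]))

# Identity check on the whole Series: a single bool, not a per-row mask.
whole_column_check = line_start_time is pd.NaT

# Element-wise check: one bool per row, suitable for filtering.
mask = line_start_time.isnull()

# Scalar checks on individual cells, which is what a row-wise function sees.
first_is_missing = pd.isnull(line_start_time.iloc[0])
second_is_missing = pd.isnull(line_start_time.iloc[1])

print(whole_column_check, mask.tolist(), first_is_missing, second_is_missing)
```

In other words, the identity check is only meaningful on an individual cell, such as the value a function receives when it is applied with axis=1.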
Hopefully that at least gets rid of your current error.
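For context, the error arises because df.apply(..., axis=1) passes one row at a time to the function, so each cell is a scalar (a Timestamp or NaT) rather than a Series, and scalars have no .isnull() method. A minimal sketch of a row-wise function that avoids the error is shown below. The sample data is invented, the column names follow the question, and the branch logic simply mirrors the question's code rather than a confirmed set of conditions. A vectorized variant using a boolean mask is included as one possible alternative.

```python
import pandas as pd

# Hypothetical sample data; column names follow the question, values are invented.
sample_df = pd.DataFrame({
    "csn": [1, 2, 3],
    "procedure_type_zc": ["Infusion", "Infusion", "Other"],
    "line_start_time": pd.to_datetime(["2019-01-01 09:00", None, None]),
    "check_in": pd.to_datetime(
        ["2019-01-01 08:30", "2019-01-02 10:15", "2019-01-03 11:00"]
    ),
})

def case_start(row):
    # `row` is a single row, so each cell is a scalar: use pd.isnull(value)
    # and plain `and` instead of Series methods and `&`.
    if row["procedure_type_zc"] == "Infusion" and pd.isnull(row["line_start_time"]):
        return row["check_in"]
    return "Undefined"

sample_df["case_start_time"] = sample_df.apply(case_start, axis=1)

# Vectorized alternative: build the mask once on whole columns, then keep
# check_in where the mask is True and fall back to 'Undefined' elsewhere.
mask = (sample_df["procedure_type_zc"] == "Infusion") & sample_df["line_start_time"].isnull()
sample_df["case_start_time_vectorized"] = (
    sample_df["check_in"].astype(object).where(mask, "Undefined")
)

print(sample_df[["csn", "case_start_time", "case_start_time_vectorized"]])
```

Note that mixing timestamps with the string 'Undefined' leaves the new column with object dtype; leaving the non-matching rows as NaT instead would keep the column as datetime64[ns], if that suits the cleaning process better.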