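I am trying to calculate the number of days between two dates stored in two columns, counting only the days that fall between May and August (the growing season for trees), and to put the result in a new column: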

import pandas as pd
import numpy as np
from datetime import datetime

df = pd.DataFrame(columns=['Start','End'],data=[[np.datetime64('2001-01-01'),np.datetime64('2001-07-01')],[np.datetime64('2001-01-01'),np.datetime64('2001-11-01')]])

def vegetation_days(date1, date2):
    startdate=date1.astype(datetime)
    enddate=date2.astype(datetime)
    all_dates = (startdate + datetime.timedelta(days=x) for x in range(0, (enddate-startdate).days))
    return (sum(1 for date in all_dates if (5 <= date.month <=7)))


Then:

df:

       Start        End
0 2001-01-01 2001-07-01
1 2001-01-01 2001-11-01

df['Days'] = vegetation_days(df['Start'],df['End'])


This gives me the error:


  AttributeError: 'Series' object has no attribute 'days'


How can I fix this?

Best answer

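Use DataFrame.apply with axis=1, so that the function is called once per row with scalar Timestamp values instead of once with two whole Series: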

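def vegetation_days(date1, date2):
    # date1 and date2 are scalar Timestamps here, so (date2 - date1) is a Timedelta with a .days attribute
    all_dates = (date1 + pd.Timedelta(days=x) for x in range(0, (date2-date1).days))
    # count the days whose month is 5, 6 or 7 (the end date itself is not included)
    return (sum(1 for date in all_dates if (5 <= date.month <=7)))

# axis=1 passes each row to the lambda
df['Days'] = df.apply(lambda x: vegetation_days(x['Start'], x['End']), axis=1)
print (df)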
       Start        End  Days
0 2001-01-01 2001-07-01    61
1 2001-01-01 2001-11-01    92
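
As a variation on the answer above, here is a minimal self-contained sketch that builds the days with pd.date_range instead of a Python generator. The parameters first_month and last_month are hypothetical names added for illustration; they default to 5 and 7 to match the condition in the code above, and last_month=8 would include August if that is what is wanted.

```python
import pandas as pd

df = pd.DataFrame({
    "Start": pd.to_datetime(["2001-01-01", "2001-01-01"]),
    "End": pd.to_datetime(["2001-07-01", "2001-11-01"]),
})

def vegetation_days(start, end, first_month=5, last_month=7):
    """Count the days in [start, end) whose month lies in first_month..last_month.

    first_month and last_month are illustrative parameters, not part of the
    original answer.
    """
    # Every calendar day from start up to, but not including, end.
    # If end <= start the range is empty and the count is 0.
    days = pd.date_range(start, end - pd.Timedelta(days=1), freq="D")
    in_season = (days.month >= first_month) & (days.month <= last_month)
    return int(in_season.sum())

# Iterating over the two columns yields scalar Timestamps, one pair per row.
df["Days"] = [vegetation_days(s, e) for s, e in zip(df["Start"], df["End"])]
print(df)
```

The function is still called once per row, so this is not a fully vectorized solution; it only replaces the inner day-by-day loop with a DatetimeIndex.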

About "python - creating a new column from a function that counts the days between two other columns": we found a similar question on Stack Overflow: https://stackoverflow.com/questions/47412394/
