Problem description
Ultimately, I want to calculate the number of days from each date in df['start'] to the last day of its month, and populate the 'count' column with the result.
As a first step towards that goal, I am using the calendar.monthrange function, which takes (year, month) arguments and returns a tuple of (weekday of the first day of the month, number of days in the month).
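For reference, here is a minimal sketch of what calendar.monthrange returns for a single month. February 2016 is used only because it appears in the sample data; the variable names are illustrative.

```python
import calendar

# monthrange returns (weekday of the first day, number of days in the month).
# Weekdays are numbered 0 (Monday) through 6 (Sunday).
first_weekday, days_in_month = calendar.monthrange(2016, 2)

print(first_weekday)   # 0, because 1 February 2016 was a Monday
print(days_in_month)   # 29, because 2016 is a leap year
```

Only the second element of the tuple is needed to work out how many days remain in a month.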
I seem to be making a general mistake in how I apply functions to DataFrame or Series objects, and I would like to understand why this isn't working.
import numpy as np
import pandas as pd
import calendar
def last_day(row):
    return calendar.monthrange(row['start'].dt.year, row['start'].dt.month)
The following line raises an AttributeError: "Timestamp object has no attribute 'dt'":
df['count'] = df.apply(last_day, axis=1)
This is what my dataframe looks like:
       start  count
0 2016-02-15    NaN
1 2016-02-20    NaN
2 2016-04-23    NaN
df.dtypes
start    datetime64[ns]
count           float64
dtype: object
Recommended answer
Remove the .dt accessor. It is needed when you work with a whole Series of datetimes, but a single element selected from a row is already a datetime-like object (a pandas Timestamp), which exposes year and month directly:
def last_day(row):
    return calendar.monthrange(row['start'].year, row['start'].month)
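Note that last_day still returns the whole (first weekday, number of days) tuple, not the day count the question ultimately asks for. The following sketch shows one way to get from there to the 'count' column, assuming the sample data from the question. The helper name days_to_month_end and the column count_vectorized are made up for illustration, and the vectorized variant is an alternative technique, not part of the original answer.

```python
import calendar

import pandas as pd

# Sample data taken from the question.
df = pd.DataFrame(
    {"start": pd.to_datetime(["2016-02-15", "2016-02-20", "2016-04-23"])}
)


def days_to_month_end(row):
    """Return the number of days from row['start'] to the end of its month."""
    start = row["start"]  # a single Timestamp, so no .dt accessor is needed
    days_in_month = calendar.monthrange(start.year, start.month)[1]
    return days_in_month - start.day


# Row-wise version, in the spirit of the answer above.
df["count"] = df.apply(days_to_month_end, axis=1)

# Vectorized alternative: here .dt is correct, because df['start'] is a Series.
df["count_vectorized"] = df["start"].dt.days_in_month - df["start"].dt.day

print(df)
```

Both columns should hold 14, 9 and 7 for the three sample dates. The vectorized form avoids calling a Python function once per row, which tends to matter on larger frames.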
Why:
With axis=1, this apply call invokes last_day once per row and passes each row as a Series.
df['count'] = df.apply(last_day, axis=1)
Inside last_day, you then select a single element of that Series, which is a Timestamp rather than a Series:
row['start'].year
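The distinction can be checked directly. This is a small sketch, assuming the same sample dates as in the question; the variable names are illustrative only.

```python
import pandas as pd

df = pd.DataFrame(
    {"start": pd.to_datetime(["2016-02-15", "2016-02-20", "2016-04-23"])}
)

column = df["start"]           # a Series of datetimes
element = df.loc[0, "start"]   # a single pandas Timestamp

# The .dt accessor exists on a datetime Series, but not on a Timestamp.
column_has_dt = hasattr(column, "dt")
element_has_dt = hasattr(element, "dt")

# Equivalent attribute access at each level.
years_from_column = column.dt.year.tolist()
year_from_element = element.year

print(type(column).__name__, column_has_dt)
print(type(element).__name__, element_has_dt)
```

In other words, .dt belongs on the column, while plain attributes such as .year and .month belong on the individual values that apply hands to the function.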