我想找到一种更快的方法来计算下面每个产品的52周前销售额而不使用iterrows或itetuples。有什么建议么?输入将是不含“ 52周前的销售额”列的表,输出将是下面的整个表。
date sales city product sales 52 weeks ago
0 2020-01-01 1.5 c1 p1 0.6
1 2020-01-01 1.2 c1 p2 0.3
2 2019-05-02 0.5 c1 p1 nan
3 2019-01-02 0.3 c1 p2 nan
4 2019-01-02 0.6 c1 p1 nan
5 2019-01-01 1.2 c1 p2 nan
示例itertuples代码,但速度很慢:
for row in df.itertuples(index=True, name='Pandas'):
try:
df.at[row.Index, 'sales 52 weeks ago']=df[(df['date']==row.date-timedelta(weeks=52))&(df['product']==row.product),'sales']
except:
continue
最佳答案
用Timedelta
减去日期后需要合并:
m=df['date'].sub(pd.Timedelta('52W')).to_frame().assign(product=df['product'])
final = df.assign(sales_52_W_ago=m.merge(df,
on=['date','product'],how='left').loc[:,'sales'])
date sales city product sales_52_W_ago
0 2020-01-01 1.5 c1 p1 0.6
1 2020-01-01 1.2 c1 p2 0.3
2 2019-05-02 0.5 c1 p1 NaN
3 2019-01-02 0.3 c1 p2 NaN
4 2019-01-02 0.6 c1 p1 NaN
5 2019-01-01 1.2 c1 p2 NaN
关于python - 通过查找其他行中的值来创建新的pandas数据框列,我们在Stack Overflow上找到一个类似的问题:https://stackoverflow.com/questions/59573335/