Problem description
Given the following DataFrame:
import pandas as pd

df = pd.DataFrame({'A': ['1', '2', '3', '7'],
                   'B': [7, 6, 5, 4],
                   'C': [5, 6, 7, 1],
                   'D': [1, 9, 9, 8]})
df = df.set_index('A')
df
   B  C  D
A
1  7  5  1
2  6  6  9
3  5  7  9
7  4  1  8
I am attempting to calculate the compound annual growth rate (CAGR). I am trying to avoid using the column names. Here's what I came up with:
df['CAGR']=((df[df.columns[-1:]]/df[df.columns[:1]])**(1/len(df.columns)))-1
However, it throws this error:
ValueError: Wrong number of items passed 2, placement implies 1
I tested each part of the formula and it returned the columns I needed, so I'm stumped.
Thanks in advance!
Recommended answer
You are slicing the DataFrame in such a way that the returned object is itself a DataFrame:
df[df.columns[-1:]]
The -1: makes df.columns[-1:] return a one-element Index ([column_name]) instead of the bare label column_name. As a consequence, df[df.columns[-1:]] is a DataFrame rather than a Series. That means that when you do the division, pandas tries to align the indices, columns included. Here the numerator has only column D and the denominator only column B, so the labels never match and the result has two columns, both filled with NaN. A two-column result cannot be assigned to the single column CAGR, hence the error. To get around this, you could have just done:
df[df.columns[-1]]
That is, use -1 instead of -1: so that a single label selects a Series.
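To make the difference concrete, here is a minimal sketch that rebuilds the question's DataFrame and compares the two ways of selecting the first and last columns. The variable names (last_as_frame, series_ratio, and so on) are chosen here for illustration only and are not part of the original post.

```python
import pandas as pd

df = pd.DataFrame({'A': ['1', '2', '3', '7'],
                   'B': [7, 6, 5, 4],
                   'C': [5, 6, 7, 1],
                   'D': [1, 9, 9, 8]}).set_index('A')

# Selecting with a slice of the column index yields a one-column DataFrame.
last_as_frame = df[df.columns[-1:]]
first_as_frame = df[df.columns[:1]]

# Selecting with a single label yields a Series.
last_as_series = df[df.columns[-1]]
first_as_series = df[df.columns[0]]

# DataFrame / DataFrame aligns on column labels; 'D' and 'B' never match,
# so the result has both columns and every value is NaN.
frame_ratio = last_as_frame / first_as_frame

# Series / Series aligns on the row index only, which is what we want here.
series_ratio = last_as_series / first_as_series

print(type(last_as_frame).__name__, type(last_as_series).__name__)
print(list(frame_ratio.columns))
print(series_ratio)
```

The two-column frame_ratio is the object that the original assignment to df['CAGR'] could not fit into a single column.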
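However, I would do it this way, selecting by position with iloc and using the number of periods (one fewer than the number of columns) in the exponent: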
df['CAGR'] = df.iloc[:, -1].div(df.iloc[:, 0]).pow(1./(len(df.columns) - 1)).sub(1)
print(df)
   B  C  D      CAGR
A
1  7  5  1 -0.622036
2  6  6  9  0.224745
3  5  7  9  0.341641
7  4  1  8  0.414214
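If the calculation needs to be repeated, the one-liner above could be wrapped in a small helper. The sketch below is one possible way to do that, not something from the original answer: the function name add_cagr and its out_col parameter are hypothetical. It assumes every column of the input is a value column ordered in time, and it counts the periods before adding the new column so that the result column never affects the exponent.

```python
import pandas as pd


def add_cagr(frame, out_col='CAGR'):
    """Return a copy of frame with a CAGR column built from its first and last columns."""
    periods = frame.shape[1] - 1  # n value columns span n - 1 periods
    if periods < 1:
        raise ValueError("need at least two value columns")
    result = frame.copy()
    # Positional selection returns Series, so the division aligns on rows only.
    growth = frame.iloc[:, -1].div(frame.iloc[:, 0])
    result[out_col] = growth.pow(1.0 / periods).sub(1)
    return result


df = pd.DataFrame({'A': ['1', '2', '3', '7'],
                   'B': [7, 6, 5, 4],
                   'C': [5, 6, 7, 1],
                   'D': [1, 9, 9, 8]}).set_index('A')

out = add_cagr(df)
print(out)
```

Working on a copy leaves the input untouched, so calling the helper twice on the same DataFrame gives the same answer instead of treating a previously added CAGR column as the last value column.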