Problem description
I'm trying to multiply two existing columns in a pandas DataFrame (orders_df), Prices (stock close price) and Amount (stock quantity), and add the result to a new column called 'Value'. For some reason, when I run this code, all the rows under the 'Value' column are positive numbers, while some of the rows should be negative. Under the Action column in the DataFrame there are seven rows with the 'Sell' string and seven with the 'Buy' string.
for i in orders_df.Action:
    if i == 'Sell':
        orders_df['Value'] = orders_df.Prices*orders_df.Amount
    elif i == 'Buy':
        orders_df['Value'] = -orders_df.Prices*orders_df.Amount
Please let me know what I'm doing wrong!
Recommended answer
If we're willing to sacrifice the succinctness of Hayden's solution, we could also do something like this:
In [22]: orders_df['C'] = orders_df.Action.apply(
             lambda x: (1 if x == 'Sell' else -1))

In [23]: orders_df   # New column C represents the sign of the transaction
Out[23]:
   Prices  Amount Action   C
0       3      57   Sell   1
1      89      42   Sell   1
2      45      70    Buy  -1
3       6      43   Sell   1
4      60      47   Sell   1
5      19      16    Buy  -1
6      56      89   Sell   1
7       3      28    Buy  -1
8      56      69   Sell   1
9      90      49    Buy  -1
Now we have eliminated the need for the if statement. Using Series.apply() (orders_df.Action is a Series, so this is Series.apply() rather than DataFrame.apply()), we also do away with the explicit for loop. As Hayden noted, vectorized operations are generally faster than looping over rows in Python.
In [24]: orders_df['Value'] = orders_df.Prices * orders_df.Amount * orders_df.C

In [25]: orders_df   # The resulting dataframe
Out[25]:
   Prices  Amount Action   C  Value
0       3      57   Sell   1    171
1      89      42   Sell   1   3738
2      45      70    Buy  -1  -3150
3       6      43   Sell   1    258
4      60      47   Sell   1   2820
5      19      16    Buy  -1   -304
6      56      89   Sell   1   4984
7       3      28    Buy  -1    -84
8      56      69   Sell   1   3864
9      90      49    Buy  -1  -4410
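For anyone who wants to run the session above outside IPython, here is a minimal self-contained sketch. The ten sample rows are copied from the output shown above; building the frame from a dict is an assumption, since the original does not show how orders_df was created.

```python
import pandas as pd

# Sample data copied from the output above. How the original orders_df
# was built is not shown, so this constructor is an assumption.
orders_df = pd.DataFrame({
    "Prices": [3, 89, 45, 6, 60, 19, 56, 3, 56, 90],
    "Amount": [57, 42, 70, 43, 47, 16, 89, 28, 69, 49],
    "Action": ["Sell", "Sell", "Buy", "Sell", "Sell",
               "Buy", "Sell", "Buy", "Sell", "Buy"],
})

# Step 1: sign of each transaction (+1 for 'Sell', -1 for anything else).
orders_df["C"] = orders_df.Action.apply(lambda x: 1 if x == "Sell" else -1)

# Step 2: signed value, computed column-wise in a single expression.
orders_df["Value"] = orders_df.Prices * orders_df.Amount * orders_df.C

print(orders_df)
```

Note that the lambda treats every Action other than 'Sell' as a buy, so a misspelled or missing label would silently get a sign of -1.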
This solution takes two lines of code instead of one, but is a bit easier to read. I suspect that the computational costs are similar as well.
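It may also help to see why the loop in the question gives every row the same sign. Each pass through the loop assigns the entire 'Value' column, so only the last entry in Action decides the result. The sketch below reproduces that on a three-row subset of the sample data and then computes the sign per row with numpy.where. The names df, looped, sign and fixed are illustrative, and numpy.where is just one possible vectorized approach, not necessarily the one Hayden proposed.

```python
import numpy as np
import pandas as pd

# Three rows taken from the sample data; the last Action is 'Sell'.
df = pd.DataFrame({
    "Prices": [3, 45, 6],
    "Amount": [57, 70, 43],
    "Action": ["Sell", "Buy", "Sell"],
})

# The loop from the question: every pass overwrites the WHOLE column,
# so the final pass ('Sell' here) sets the sign for all rows.
looped = df.copy()
for action in looped.Action:
    if action == "Sell":
        looped["Value"] = looped.Prices * looped.Amount
    elif action == "Buy":
        looped["Value"] = -looped.Prices * looped.Amount

# Per-row alternative: choose +1 or -1 for each row, then multiply.
sign = np.where(df.Action == "Sell", 1, -1)
fixed = df.assign(Value=df.Prices * df.Amount * sign)

print(looped)
print(fixed)
```

In looped, the 'Buy' row ends up positive because the last pass was a 'Sell'. In fixed, only the 'Buy' row is negative.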