Problem description
Here's a simple example of the sort of thing I'm wrestling with:
In [1]: import pandas as pd
In [2]: import numpy as np
In [3]: test = pd.DataFrame(np.random.randn(4,4),columns=list('ABCD'))
In [4]: for i in range(4):
   ...:     test.iloc[i, i] = np.nan
In [5]: test
Out[5]:
          A         B         C         D
0       NaN  0.136841 -0.854138 -1.890888
1 -1.261724       NaN  0.875647  1.312823
2  1.130999 -0.208402       NaN  0.256644
3 -0.158458 -0.305250  0.902756       NaN
Now, if I use sum to sum the rows, all the NaN values are treated as zeros:
In [6]: test['Sum'] = test.loc[:,'A':'D'].sum(axis=1)
In [7]: test
Out[7]:
          A         B         C         D       Sum
0       NaN  0.136841 -0.854138 -1.890888 -2.608185
1 -1.261724       NaN  0.875647  1.312823  0.926745
2  1.130999 -0.208402       NaN  0.256644  1.179241
3 -0.158458 -0.305250  0.902756       NaN  0.439048
But in my case, I may need to do a bit of work on the values first; for example, scaling them:
In [8]: test['Sum2'] = test.A + test.B/2 - test.C/3 + test.D
In [9]: test
Out[9]:
          A         B         C         D       Sum  Sum2
0       NaN  0.136841 -0.854138 -1.890888 -2.608185   NaN
1 -1.261724       NaN  0.875647  1.312823  0.926745   NaN
2  1.130999 -0.208402       NaN  0.256644  1.179241   NaN
3 -0.158458 -0.305250  0.902756       NaN  0.439048   NaN
As you see, the NaN values carry across into the arithmetic to produce NaN output, which is what you'd expect.
Now, I don't want to replace all NaN values in my dataframe with zeros: it is helpful to me to distinguish between zero and NaN. I could replace NaN with something else: I'm dealing with large volumes of student grades, and I need to distinguish between a grade of zero and a NaN, which at the moment I'm using to indicate that the particular assessment task was not attempted. (It takes the place of what would be a blank cell in a traditional spreadsheet.) But whatever I replace the NaN values with, it needs to be something that can be treated as zero in the operations I may perform. What are my options here?
Recommended answer
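Use the fillna method. It returns a new Series with the NaN values replaced, so the original columns keep their NaN markers: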
test['Sum2'] = test.A.fillna(0) + test.B.fillna(0)/2 - test.C.fillna(0)/3 + test.D.fillna(0)
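Two variations on the same idea may read more cleanly when the formula grows. This is a sketch, not the only way to do it. It rebuilds a frame shaped like the one in the question, and the names filled, weights and Sum2_weighted are made up for illustration. The first option fills a temporary copy once instead of calling fillna on every term. The second scales each column by a weight and relies on sum skipping NaN by default, as the question already observed.

```python
import numpy as np
import pandas as pd

# Rebuild a frame shaped like the one in the question: NaN on the diagonal.
test = pd.DataFrame(np.random.randn(4, 4), columns=list("ABCD"))
for i in range(4):
    test.iloc[i, i] = np.nan

cols = ["A", "B", "C", "D"]

# Option 1: fill a temporary copy once, then write the formula normally.
# `test` itself is untouched, so NaN still means "not attempted".
filled = test[cols].fillna(0)  # hypothetical helper name
test["Sum2"] = filled.A + filled.B / 2 - filled.C / 3 + filled.D

# Option 2: multiply each column by its weight (aligned on column labels),
# then let sum() skip the NaN values, which it does by default.
weights = pd.Series({"A": 1.0, "B": 0.5, "C": -1.0 / 3.0, "D": 1.0})  # hypothetical
test["Sum2_weighted"] = test[cols].mul(weights).sum(axis=1)

print(test)
```

Both columns should agree up to floating-point rounding. One caveat: with either option a row in which every task is NaN comes out as 0. If that case should stay NaN, passing min_count=1 to sum in the second option keeps it.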