Treating NaN as zero in arithmetic operations?

Problem description

Here's a simple example of the sort of thing I'm wrestling with:

In [1]: import pandas as pd
In [2]: import numpy as np
In [3]: test = pd.DataFrame(np.random.randn(4,4),columns=list('ABCD'))
In [4]: for i in range(4):
  ....:    test.iloc[i,i] = np.nan

In [5]: test
Out[5]:
           A         B         C         D
0        NaN  0.136841 -0.854138 -1.890888
1  -1.261724       NaN  0.875647  1.312823
2   1.130999 -0.208402       NaN  0.256644
3  -0.158458 -0.305250  0.902756       NaN

Now, if I use sum to sum the rows, all the NaN values are treated as zeros:

In [6]: test['Sum'] = test.loc[:,'A':'D'].sum(axis=1)

In [7]: test
Out[7]:
          A         B         C         D       Sum
0       NaN  0.136841 -0.854138 -1.890888 -2.608185
1 -1.261724       NaN  0.875647  1.312823  0.926745
2  1.130999 -0.208402       NaN  0.256644  1.179241
3 -0.158458 -0.305250  0.902756       NaN  0.439048

But in my case, I may need to do a bit of work on the values first; for example scaling them:

In [8]: test['Sum2'] = test.A + test.B/2 - test.C/3 + test.D

In [9]: test
Out[9]:
          A         B         C         D       Sum  Sum2
0       NaN  0.136841 -0.854138 -1.890888 -2.608185   NaN
1 -1.261724       NaN  0.875647  1.312823  0.926745   NaN
2  1.130999 -0.208402       NaN  0.256644  1.179241   NaN
3 -0.158458 -0.305250  0.902756       NaN  0.439048   NaN

As you see, the NaN values carry across into the arithmetic to produce NaN output, which is what you'd expect.

Now, I don't want to replace all NaN values in my dataframe with zeros: it is helpful to me to distinguish between zero and NaN. I'm dealing with large volumes of student grades, and I need to distinguish between a grade of zero and a NaN, which at the moment I'm using to indicate that the particular assessment task was not attempted. (It takes the place of what would be a blank cell in a traditional spreadsheet.) I could replace NaN with something else, but whatever I replace it with needs to be treated as zero in the operations I may perform. What are my options here?

Recommended answer

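Use the fillna function. By default it returns a new Series, so zero is substituted only inside the calculation and the original columns keep their NaN values: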

test['Sum2'] = test.A.fillna(0) + test.B.fillna(0)/2 - test.C.fillna(0)/3 + test.D.fillna(0)
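
If the expression has many terms, calling fillna(0) on every column gets repetitive. The sketch below shows two possible variations on the same idea, not the answerer's own code: filling a temporary copy once, or scaling the columns and letting sum skip NaN as it did for the Sum column in the question. The sample values and the weights Series are made up for illustration.

```python
import numpy as np
import pandas as pd

# Small frame with NaN on the diagonal, mirroring the question's layout.
# The numbers are illustrative, not the question's random data.
test = pd.DataFrame(
    [[np.nan, 1.0, 3.0, 2.0],
     [2.0, np.nan, 6.0, 1.0],
     [1.0, 4.0, np.nan, 3.0],
     [5.0, 2.0, 9.0, np.nan]],
    columns=list("ABCD"),
)

# Option 1: fill a temporary copy once, then do the arithmetic on it.
# fillna returns a new object, so `test` itself still holds the NaN values.
filled = test[list("ABCD")].fillna(0)
sum2_filled = filled.A + filled.B / 2 - filled.C / 3 + filled.D

# Option 2: express the scaling as per-column weights (hypothetical helper
# Series matching A + B/2 - C/3 + D), multiply column-wise, then sum the rows.
# sum() skips NaN by default (skipna=True), so NaN contributes nothing.
weights = pd.Series({"A": 1.0, "B": 0.5, "C": -1.0 / 3.0, "D": 1.0})
sum2_weighted = test[list("ABCD")].mul(weights, axis=1).sum(axis=1)

test["Sum2"] = sum2_filled
```

Both options leave the NaN markers in columns A to D untouched, so a zero grade and an unattempted task remain distinguishable. With the second option, a row that is entirely NaN sums to 0; passing min_count=1 to sum would give NaN for such a row instead.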

07-31 02:55