I have this CSV file:

index,  empno,  ename,  job,    mgr,    hiredate,   sal,    comm,   deptno
0,  7839,   KING,   PRESIDENT,  0,  1981-11-17,     5000,   0,  10
1,  7698,   BLAKE,  MANAGER,    7839,   1981-05-01, 2850,   0,  30
2,  7782,   CLARK,  MANAGER,    7839,   1981-05-09, 2450,   0,  10
3,  7566,   JONES,  MANAGER,    7839,   1981-04-01, 2975,   0,  20
4,  7654,   MARTIN, SALESMAN,   7698,   1981-09-10, 1250,   1400,   30
5,  7499,   ALLEN,  SALESMAN,   7698,   1981-02-11, 1600,   300,    30
6,  7844,   TURNER, SALESMAN,   7698,   1981-08-21, 1500,   0,  30
7,  7900,   JAMES,  CLERK,      7698,   1981-12-11, 950,    0,  30
8,  7521,   WARD,   SALESMAN,   7698,   1981-02-23, 1250,   500,    30
9,  7902,   FORD,   ANALYST,    7566,   1981-12-11, 3000,   0,  20
10, 7369,   SMITH,  CLERK,      7902,   1980-12-09, 800,    0,  20
11, 7788,   SCOTT,  ANALYST,    7566,   1982-12-22, 3000,   0,  20
12, 7876,   ADAMS,  CLERK,      7788,   1983-01-15, 1100,   0,  20
13, 7934,   MILLER, CLERK,      7782,   1982-01-11, 1300,   0,  10

With the code below, I load all of emp.csv:
import csv
import sys
import pandas as pd
import dateutil.parser

# Load data from csv file
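# DataFrame.from_csv was removed in pandas 1.0; read_csv with index_col=0 is the closest replacement.
# skipinitialspace=True drops the blanks that follow each comma in the file.
emp = pd.read_csv(r"D:\R data\emp.csv", index_col=0, skipinitialspace=True)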
# Convert hiredate from string to datetime
emp['hiredate'] = emp['hiredate'].apply(dateutil.parser.parse, dayfirst=True)
jonessal = emp[['sal']][emp['ename']=='JONES']
empename = emp[['ename','sal']][emp['sal'] > jonessal]
print(empename)

This is the output of the code:
index
0       NaN  NaN
1       NaN  NaN
2       NaN  NaN
3       NaN  NaN
4       NaN  NaN
5       NaN  NaN
6       NaN  NaN
7       NaN  NaN
8       NaN  NaN
9       NaN  NaN
10      NaN  NaN
11      NaN  NaN
12      NaN  NaN
13      NaN  NaN

The output I want is:
index
0       KING  5000
9       FORD  3000
11     SCOTT  3000

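I expected the variable to hold 2975, but the result is NaN.
If I hard-code the salary in place of jonessal, it works fine, but when I use the variable, every row comes back as NaN NaN.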

Best answer

jonessal is a DataFrame, not a scalar.

emp[['ename','sal']][emp['sal'] > jonessal]

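Here the comparison emp['sal'] > jonessal is not made against a scalar. Because of broadcasting and label alignment, it returns an odd-looking DataFrame. Its index and shape do not match emp, so the final result consists entirely of NaN.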
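To make the type difference concrete, here is a minimal sketch. It assumes a trimmed, hand-typed copy of the question's data with only the ename and sal columns; the names is_jones, as_frame, as_series and as_scalar are made up for illustration.

```python
import pandas as pd

# Trimmed copy of the question's data: only the two columns that matter here.
emp = pd.DataFrame({
    "ename": ["KING", "BLAKE", "CLARK", "JONES", "MARTIN", "ALLEN", "TURNER"],
    "sal": [5000, 2850, 2450, 2975, 1250, 1600, 1500],
})

is_jones = emp["ename"] == "JONES"

# Double brackets select a list of columns, so the result stays a DataFrame.
as_frame = emp[["sal"]][is_jones]
# .loc with a single column label gives a Series instead.
as_series = emp.loc[is_jones, "sal"]
# Taking the first element of that Series gives a plain scalar.
as_scalar = as_series.iloc[0]

print(type(as_frame).__name__, as_frame.shape)    # DataFrame (1, 1)
print(type(as_series).__name__, as_series.shape)  # Series (1,)
print(as_scalar)                                  # 2975
```

Only the last form is a single number that can be compared against every row of emp['sal'].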
Your code also assumes there is only one employee named JONES. Under that same assumption, you can get a scalar with:
jonessal = emp.loc[emp['ename']=='JONES', 'sal'].values[0]

(.values returns an array, and [0] follows from the single-employee assumption.)
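If you would rather not rely on the single-employee assumption silently, one option is to check it before taking the first element. The sketch below is only an illustration: salary_of is a hypothetical helper, not a pandas function, and the data is a shortened, hand-typed sample.

```python
import pandas as pd

# Shortened, hand-typed sample of the question's data.
emp = pd.DataFrame({
    "ename": ["KING", "JONES", "FORD", "SCOTT", "SMITH"],
    "sal": [5000, 2975, 3000, 3000, 800],
})


def salary_of(frame, name):
    """Hypothetical helper: return the salary of the one employee called `name`."""
    matches = frame.loc[frame["ename"] == name, "sal"]
    # Fail loudly instead of quietly picking the first of zero or several matches.
    if len(matches) != 1:
        raise ValueError(f"expected exactly one {name!r}, found {len(matches)}")
    return matches.iloc[0]


jonessal = salary_of(emp, "JONES")
print(jonessal)  # 2975
```

With zero or several matches the helper raises a ValueError, so a wrong assumption surfaces immediately rather than as a confusing filter result later.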
Now the same expression returns the expected result:
emp[['ename','sal']][emp['sal'] > jonessal]
Out[81]:
    ename   sal
0    KING  5000
9    FORD  3000
11  SCOTT  3000
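The same filter can also be written as a single .loc call, which selects rows and columns in one step instead of chaining two [] lookups. This is a sketch on a hand-typed copy of the ename and sal columns, with the default 0..13 index standing in for the file's index column; higher_paid is an illustrative name.

```python
import pandas as pd

# Hand-typed copy of the ename and sal columns from the question's CSV.
emp = pd.DataFrame({
    "ename": ["KING", "BLAKE", "CLARK", "JONES", "MARTIN", "ALLEN", "TURNER",
              "JAMES", "WARD", "FORD", "SMITH", "SCOTT", "ADAMS", "MILLER"],
    "sal": [5000, 2850, 2450, 2975, 1250, 1600, 1500,
            950, 1250, 3000, 800, 3000, 1100, 1300],
})

# Scalar salary for JONES (assumes exactly one such employee).
jonessal = emp.loc[emp["ename"] == "JONES", "sal"].iloc[0]

# Row mask and column list together in one .loc call.
higher_paid = emp.loc[emp["sal"] > jonessal, ["ename", "sal"]]
print(higher_paid)
```

This should print the same three rows as above (KING, FORD and SCOTT).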

On the topic "python - why does boolean indexing return all NaN", a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/38291386/
