I have a DataFrame that looks like this:
Name | Gender | Age | Apple | Banana | Mango | Watermelon | Kiwi
----------------------------------------------------------------
Jack | Male | 20 | 2 | 3 | 10 | |
Jen | Female | 25 | 5 | | | 5 | 1
Jill | Female | 22 | 5 | 3 | | 5 |
John | Male | 21 | 6 | | | |
Joe | Male | 28 | 2 | 3 | | 5 |
Jim | Male | 26 | 2 | 3 | | |
I want to count the non-empty cells in every fruit column, grouped by Gender.
In other words, the desired output would be:
Fruits | Total | Male | Female
------------------------------------
Apple | 6 | 4 | 2
Banana | 4 | 3 | 1
Mango | 1 | 1 | 0
Watermelon | 3 | 1 | 2
Kiwi | 1 | 0 | 1
------------------------------------
Total | 15 | 9 | 6
Note:
>> print type(df.iloc[1,4])
<type 'str'>
So the blank cells are empty strings rather than NaN. Does that mean I can't fill them with the fillna() method?
Best answer
Use drop + replace + groupby + count + T + insert:
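import numpy as np

# Drop the columns that should not be counted
df1 = df.drop(['Name', 'Age'], axis=1)
# Treat empty strings and zeros as missing, count per gender, then transpose
df = df1.replace({'': np.nan, 0: np.nan}).groupby('Gender').count().T
# Add the row-wise total as the first column, then a column-wise total row
df.insert(0, 'Total', df.sum(axis=1))
df.loc['Total'] = df.sum()
print(df)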
Gender      Total  Female  Male
Apple           6       2     4
Banana          4       1     3
Mango           1       0     1
Watermelon      3       2     1
Kiwi            1       1     0
Total          15       6     9
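To try the steps above end to end, here is a minimal, self-contained sketch. The DataFrame is rebuilt by hand from the table in the question, with blanks entered as empty strings. The names `fruit_cols`, `cleaned`, `counts`, and `n_missing_before` are illustrative and not part of the original answer. Unlike the answer, the sketch applies replace only to the fruit columns and groups by the original Gender column. It also shows why fillna() alone does nothing here: no cell is NaN until the empty strings are replaced.

```python
import numpy as np
import pandas as pd

# Sample data from the question; blank cells are empty strings, not NaN.
df = pd.DataFrame({
    "Name": ["Jack", "Jen", "Jill", "John", "Joe", "Jim"],
    "Gender": ["Male", "Female", "Female", "Male", "Male", "Male"],
    "Age": [20, 25, 22, 21, 28, 26],
    "Apple": [2, 5, 5, 6, 2, 2],
    "Banana": [3, "", 3, "", 3, 3],
    "Mango": [10, "", "", "", "", ""],
    "Watermelon": ["", 5, 5, "", 5, ""],
    "Kiwi": ["", 1, "", "", "", ""],
})
fruit_cols = ["Apple", "Banana", "Mango", "Watermelon", "Kiwi"]

# Before any replacement there is nothing for fillna() to act on.
n_missing_before = int(df.isna().sum().sum())

# Turn empty strings and zeros into NaN so that count() skips them.
cleaned = df[fruit_cols].replace({"": np.nan, 0: np.nan})

# Count non-missing cells per gender, then put fruits on the rows.
counts = cleaned.groupby(df["Gender"]).count().T

# Row-wise total as the first column, then a column-wise total row.
counts.insert(0, "Total", counts.sum(axis=1))
counts.loc["Total"] = counts.sum()
print(counts)
```

Restricting replace to the fruit columns keeps the Gender labels untouched, which is a design choice of this sketch rather than a requirement.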
Alternatively, if you need to change the column order, add reindex (reindex_axis, used in older pandas versions, has since been removed):
df1 = df.drop(['Name', 'Age'], axis=1)
df = df1.replace({'': np.nan, 0: np.nan}).groupby('Gender').count().T
df['Total'] = df.sum(axis=1)
df.loc['Total'] = df.sum()
df = df.reindex(columns=['Total', 'Male', 'Female'])
print(df)
Gender      Total  Male  Female
Apple           6     4       2
Banana          4     3       1
Mango           1     1       0
Watermelon      3     1       2
Kiwi            1     0       1
Total          15     9       6
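As a variation that is not from the original answer, the same table can probably be built without replace by summing a boolean mask instead of counting non-NaN cells. The sketch below assumes the same hand-built sample data. The names `fruit_cols`, `filled`, and `counts` are hypothetical.

```python
import pandas as pd

# Sample data from the question; blank cells are empty strings, not NaN.
df = pd.DataFrame({
    "Name": ["Jack", "Jen", "Jill", "John", "Joe", "Jim"],
    "Gender": ["Male", "Female", "Female", "Male", "Male", "Male"],
    "Age": [20, 25, 22, 21, 28, 26],
    "Apple": [2, 5, 5, 6, 2, 2],
    "Banana": [3, "", 3, "", 3, 3],
    "Mango": [10, "", "", "", "", ""],
    "Watermelon": ["", 5, 5, "", 5, ""],
    "Kiwi": ["", 1, "", "", "", ""],
})
fruit_cols = ["Apple", "Banana", "Mango", "Watermelon", "Kiwi"]

# 1 where a cell holds a real, non-zero entry; 0 for '', 0, or NaN.
filled = (df[fruit_cols].notna() & ~df[fruit_cols].isin(["", 0])).astype(int)

# Summing the 0/1 mask per gender gives the same counts as count().
counts = filled.groupby(df["Gender"]).sum().T
counts["Total"] = counts.sum(axis=1)
counts.loc["Total"] = counts.sum()

# Put the columns in the order requested in the question.
counts = counts.reindex(columns=["Total", "Male", "Female"])
print(counts)
```

The mask leaves the original values untouched, which may be preferable when the empty strings need to be preserved for later steps.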
For "python - counting the non-null/non-zero row entries in each pandas column", a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/43919046/