I have a DataFrame that looks like this:

Name | Gender | Age | Apple | Banana | Mango | Watermelon | Kiwi
----------------------------------------------------------------
Jack | Male   | 20  | 2     | 3      | 10    |            |
Jen  | Female | 25  | 5     |        |       | 5          | 1
Jill | Female | 22  | 5     | 3      |       | 5          |
John | Male   | 21  | 6     |        |       |            |
Joe  | Male   | 28  | 2     | 3      |       | 5          |
Jim  | Male   | 26  | 2     | 3      |       |            |


I want to count the non-empty cells in every column, grouped by "Gender".

In other words, the desired output would be:

Fruits     | Total | Male | Female
----------------------------------
Apple      | 6     | 4    | 2
Banana     | 4     | 3    | 1
Mango      | 1     | 1    | 0
Watermelon | 3     | 1    | 2
Kiwi       | 1     | 0    | 1
----------------------------------
Total      | 15    | 9    | 6


Please note:

>> print type(df.iloc[1,4])
<type 'str'>


So the blank cells hold empty strings, which means I can't fill them with fillna(), right?

Best answer

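Use drop + replace + count + T + insert:

import numpy as np

# Drop the columns that should not be counted
df1 = df.drop(columns=['Name', 'Age'])
# Treat empty strings and zeros as missing, then count non-missing cells per gender
df = df1.replace({'': np.nan, 0: np.nan}).groupby('Gender').count().T
df.insert(0, 'Total', df.sum(axis=1))  # per-fruit total as the first column
df.loc['Total'] = df.sum()             # grand-total row
print(df)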
Gender      Total  Female  Male
Apple           6       2     4
Banana          4       1     3
Mango           1       0     1
Watermelon      3       2     1
Kiwi            1       1     0
Total          15       6     9
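
To check the numbers above end to end, here is a minimal self-contained sketch that rebuilds the sample table from the question. It assumes the fruit counts are stored as strings and the blanks as empty strings (consistent with the type check in the question); the names fruits and counts are only illustrative.

```python
import numpy as np
import pandas as pd

# Sample data rebuilt from the question. Assumption: fruit counts are
# strings and blank cells are empty strings.
df = pd.DataFrame({
    'Name':       ['Jack', 'Jen', 'Jill', 'John', 'Joe', 'Jim'],
    'Gender':     ['Male', 'Female', 'Female', 'Male', 'Male', 'Male'],
    'Age':        [20, 25, 22, 21, 28, 26],
    'Apple':      ['2', '5', '5', '6', '2', '2'],
    'Banana':     ['3', '', '3', '', '3', '3'],
    'Mango':      ['10', '', '', '', '', ''],
    'Watermelon': ['', '5', '5', '', '5', ''],
    'Kiwi':       ['', '1', '', '', '', ''],
})

# Keep only Gender and the fruit columns
fruits = df.drop(columns=['Name', 'Age'])

# Turn blanks into NaN so count() skips them, then count per gender
counts = fruits.replace('', np.nan).groupby('Gender').count().T

counts.insert(0, 'Total', counts.sum(axis=1))  # per-fruit total
counts.loc['Total'] = counts.sum()             # grand-total row
print(counts)
```

With this data the grand total should come out as 15 (9 male, 6 female), matching the output shown above.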


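Alternatively, if you need to change the column order, add reindex (reindex_axis was removed in pandas 1.0):

df1 = df.drop(columns=['Name', 'Age'])
df = df1.replace({'': np.nan, 0: np.nan}).groupby('Gender').count().T
df['Total'] = df.sum(axis=1)
df.loc['Total'] = df.sum()
# Put the columns in the requested order
df = df.reindex(columns=['Total', 'Male', 'Female'])
print(df)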
Gender      Total  Male  Female
Apple           6     4       2
Banana          4     3       1
Mango           1     1       0
Watermelon      3     1       2
Kiwi            1     0       1
Total          15     9       6
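
On the fillna() part of the question: fillna() only touches values that pandas treats as missing (NaN/None), and an empty string is an ordinary value. That is why the answer first converts '' to NaN with replace(). A small sketch on a hypothetical column s, not taken from the original data:

```python
import numpy as np
import pandas as pd

# Hypothetical column with two blank (empty-string) entries
s = pd.Series(['3', '', '3', ''])

# Empty strings are not missing values, so fillna() has nothing to fill
n_missing_before = s.isna().sum()
filled = s.fillna('0')

# After replacing '' with NaN, the blanks are recognized as missing
cleaned = s.replace('', np.nan)
n_missing_after = cleaned.isna().sum()

print(n_missing_before, filled.tolist())
print(n_missing_after, cleaned.count())
```

Once the blanks are NaN, count() ignores them, which is what makes the groupby(...).count() step in the answer work.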

This answer to "python - Count non-null/non-zero row entries in each pandas column" comes from a similar question on Stack Overflow: https://stackoverflow.com/questions/43919046/
