python - Diagonal values with pandas groupby

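I want to sum the diagonal values of the year columns plus the residue, grouped by object (the Company column). For example, object a would be 1 + 10 + 11 + 12 + 13. Is there any way to do this without splitting the table by object? Note that the number of rows per object may differ. I tried:

df.groupby('Company').apply(lambda x: x.reset_index().loc[0, 'Year_0'] + x.reset_index().loc[1, 'Year_1'] + x.reset_index().loc[2, 'Year_2'] + x.reset_index().loc[3, 'Year_3'])

but this requires the number of rows to be fixed in advance. Thanks!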

Year_0  Year_1  Year_2  Year_3  Residue  Company
1       0.0     0.0     0.0     10       a
1       10      0.0     0.0     10       a
1       10      11      0.0     10       a
1       10      11      12      13       a
2       0       0.0     0.0     12       b
2       11      0.0     0.0     12       b
2       11      12      0.0     12       b
2       11      12      13      12       b
-3      0       0.0     0.0     -1       c
-3      -1      0.0     0.0     -1       c
-3      -2      -3      0.0     -1       c
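
To reproduce the examples, the table above can be rebuilt as a DataFrame. This is only a minimal sketch: the values are copied from the table, and the variable name df is chosen to match the code used in the answer.

```python
import pandas as pd

# Sample data copied from the table in the question.
df = pd.DataFrame({
    'Year_0': [1, 1, 1, 1, 2, 2, 2, 2, -3, -3, -3],
    'Year_1': [0.0, 10, 10, 10, 0, 11, 11, 11, 0, -1, -2],
    'Year_2': [0.0, 0.0, 11, 11, 0.0, 0.0, 12, 12, 0.0, 0.0, -3],
    'Year_3': [0.0, 0.0, 0.0, 12, 0.0, 0.0, 0.0, 13, 0.0, 0.0, 0.0],
    'Residue': [10, 10, 10, 13, 12, 12, 12, 12, -1, -1, -1],
    'Company': list('aaaabbbbccc'),
})
print(df)
```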

Best answer

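I believe you need drop_duplicates to keep the last row of each company, then set_index to make Company the index, sum across each row, and finally reset_index to convert the Series to a DataFrame: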

df1 = (df.drop_duplicates('Company', keep='last')
         .set_index('Company')
         .sum(axis=1)
         .reset_index(name='new'))
print (df1)
  Company   new
0       a  47.0
1       b  50.0
2       c  -9.0

Or use GroupBy.last:

df1 = (df.groupby('Company', as_index=False).last()
       .set_index('Company')
       .sum(axis=1)
       .reset_index(name='new'))
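
The answer shows no output for this variant. The sketch below, run on a rebuilt copy of the sample data, suggests that both approaches give the same sums here. The names by_drop and by_last are illustrative and not part of the original answer. One caveat: GroupBy.last takes the last non-null value of each column, so the two approaches could differ on data that contains NaN.

```python
import pandas as pd

# Sample data copied from the table in the question.
df = pd.DataFrame({
    'Year_0': [1, 1, 1, 1, 2, 2, 2, 2, -3, -3, -3],
    'Year_1': [0.0, 10, 10, 10, 0, 11, 11, 11, 0, -1, -2],
    'Year_2': [0.0, 0.0, 11, 11, 0.0, 0.0, 12, 12, 0.0, 0.0, -3],
    'Year_3': [0.0, 0.0, 0.0, 12, 0.0, 0.0, 0.0, 13, 0.0, 0.0, 0.0],
    'Residue': [10, 10, 10, 13, 12, 12, 12, 12, -1, -1, -1],
    'Company': list('aaaabbbbccc'),
})

# Last row per company via drop_duplicates, summed across columns.
by_drop = (df.drop_duplicates('Company', keep='last')
             .set_index('Company')
             .sum(axis=1))

# Last (non-null) value per column via GroupBy.last, summed across columns.
by_last = df.groupby('Company').last().sum(axis=1)

print(by_drop)
print(by_last)
```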

If you want to use the diagonal values, use numpy.diagonal:

s = df.drop_duplicates('Company', keep='last').set_index('Company')['Residue']

df = (df.drop('Residue', axis=1)
      .set_index('Company')
      .groupby('Company')
      .apply(lambda x: x.values.diagonal().sum())
      .add(s)
      .reset_index(name='new'))
print (df)
  Company   new
0       a  47.0
1       b  50.0
2       c  -8.0

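The last value is -8 rather than -9 because company c has only three rows, so its diagonal is -3 + -1 + -3, and adding the residue -1 gives -8.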
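
This works for groups of different sizes because numpy's diagonal on a non-square array stops at the shorter dimension. A small sketch using only company c's year columns illustrates this. The array values are copied from the table, and the names c_years, diag, and total are illustrative.

```python
import numpy as np

# Year_0..Year_3 for company c: 3 rows, 4 columns.
c_years = np.array([
    [-3.0,  0.0,  0.0, 0.0],
    [-3.0, -1.0,  0.0, 0.0],
    [-3.0, -2.0, -3.0, 0.0],
])

# On a 3x4 array the diagonal has only three elements.
diag = c_years.diagonal()
print(diag)

# Add the residue of the last row of company c (-1).
total = diag.sum() + (-1)
print(total)
```

By the same rule, if a group had more rows than year columns, the extra rows would simply contribute nothing to the diagonal.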

Regarding python - Diagonal values with pandas groupby, a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/54195092/