import pandas as pd
olympics = pd.read_csv('olympics.csv')

    Edition  NOC   Medal
0      1896  AUT  Silver
1      1896  FRA    Gold
2      1896  GER    Gold
3      1900  HUN  Bronze
4      1900  GBR    Gold
5      1900  DEN  Bronze
6      1900  USA    Gold
7      1900  FRA  Bronze
8      1900  FRA  Silver
9      1900  USA    Gold
10     1900  FRA  Silver
11     1900  GBR    Gold
12     1900  SUI  Silver
13     1900  ZZX    Gold
14     1904  HUN    Gold
15     1904  USA  Bronze
16     1904  USA    Gold
17     1904  USA  Silver
18     1904  CAN    Gold
19     1904  USA  Silver


I can pivot the data frame to get some aggregated values:

pivot = olympics.pivot_table(index='Edition', columns='NOC', values='Medal', aggfunc='count')

NOC      AUT  CAN  DEN  FRA  GBR  GER  HUN  SUI  USA  ZZX
Edition
1896     1.0  NaN  NaN  1.0  NaN  1.0  NaN  NaN  NaN  NaN
1900     NaN  NaN  1.0  3.0  2.0  NaN  1.0  1.0  2.0  1.0
1904     NaN  1.0  NaN  NaN  NaN  NaN  1.0  NaN  4.0  NaN


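Instead of the total medal count, I would like each value to be a tuple (a triple) of (#Gold, #Silver, #Bronze), with (0, 0, 0) in place of NaN.

How can I do this concisely and elegantly?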

The solution does not have to use a pivot table, as long as the result holds tuples as its values.

Best answer

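Use value_counts to count every medal type per (Edition, NOC) group
Create a MultiIndex covering all combinations of edition, country, and medal
Call reindex with fill_value=0 so that missing combinations become 0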




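# Count each medal type per (Edition, NOC) pair.
counts = olympics.groupby(['Edition', 'NOC']).Medal.value_counts()

# Build every (Edition, NOC, Medal) combination so missing ones become 0.
mux = pd.MultiIndex.from_product(
    [c.values for c in counts.index.levels], names=counts.index.names)
counts = counts.reindex(mux, fill_value=0).unstack('Medal')
# Order the columns as (Gold, Silver, Bronze), as the question asks.
counts = counts[['Gold', 'Silver', 'Bronze']]

# Turn each row into a tuple, then spread NOC across the columns.
pd.Series([tuple(l) for l in counts.values.tolist()], counts.index).unstack()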
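
For comparison, here is a minimal self-contained sketch of the same idea that can be run without olympics.csv. It is not the answer's code: it swaps groupby/value_counts for pd.crosstab, and it builds a six-row subset of the question's sample data inline. The names medal_order, full_index and triples are illustrative assumptions.

```python
import pandas as pd

# Assumption: a small inline subset of the question's rows stands in for olympics.csv.
olympics = pd.DataFrame({
    "Edition": [1896, 1896, 1900, 1900, 1900, 1904],
    "NOC": ["AUT", "FRA", "FRA", "FRA", "USA", "USA"],
    "Medal": ["Silver", "Gold", "Bronze", "Silver", "Gold", "Gold"],
})

# Tuple order requested in the question.
medal_order = ["Gold", "Silver", "Bronze"]

# Count medals per (Edition, NOC) pair, one column per medal type.
counts = (
    pd.crosstab([olympics["Edition"], olympics["NOC"]], olympics["Medal"])
    .reindex(columns=medal_order, fill_value=0)
)

# Add the (Edition, NOC) pairs that never occur, filled with zeros.
full_index = pd.MultiIndex.from_product(
    counts.index.levels, names=counts.index.names)
counts = counts.reindex(full_index, fill_value=0)

# Collapse each row into a (gold, silver, bronze) tuple,
# then spread NOC across the columns.
triples = pd.Series(
    [tuple(row) for row in counts.to_numpy().tolist()], index=counts.index
).unstack("NOC")

print(triples)
```

With this subset, triples.loc[1900, "FRA"] holds one silver and one bronze, and any pair with no medals, such as (1904, "AUT"), holds (0, 0, 0) instead of NaN.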

