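My idea is to try to visualize election donation data from the FEC website. Basically, I want to create a stacked bar chart with the state on the X axis and the donation amount on the Y axis, where the "stacks" are the different candidates, showing how much money each candidate received from each state.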

Code:

import matplotlib.pyplot as plt
import pandas as pd
from pathlib import Path

pathName = r"R:\Downloads\indiv20\by_date"
dataDir = Path(pathName)
filename = "itcont_2020_20010425_20190425.txt"
fullName = dataDir / filename
data = pd.read_csv(fullName, low_memory=False, sep="|", usecols=[0, 9, 12, 14])

data.columns = ['Filer ID', 'State', 'Occupation', 'Donation Amount ($)']
data = data.dropna(subset=['Donation Amount ($)'])

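# Sum only the numeric donation column; summing the text columns is slow and may fail
donations_by_state = data.groupby('State')[['Donation Amount ($)']].sum()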

plt.bar(donations_by_state.index, donations_by_state['Donation Amount ($)'])
plt.ylabel('Donation Amount ($)')
plt.xlabel('State')
plt.title('Donations per State')

plt.show()


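This plots the total donations per state and works fine. However, when I use the groupby method to group all the data I want, I'm not sure how to plot a stacked bar chart from the result: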

donations_per_candidate_per_state = data['Donation Amount ($)'].groupby([data['State'], data['Filer ID']]).sum()

State  Filer ID
AA     C00005561      350
       C00010603      600
       C00042366      115
       C00309567     1675
       C00331694     2500
       C00365536      270
       C00401224     4495
       C00411330      100
       C00492991      300
       C00540500      300
       C00641381      250
       C00696948     2800
       C00697441      250
       C00699090       67
       C00703108     1400
AB     C00401224     1386
AE     C00000935      295
       C00003418      276
       C00010603     1750
       C00027466      320
       C00193433      105
       C00211037      251
       C00216614      226
       C00341396       20
       C00369033      150
       C00394957       50
       C00401224    26538
       C00438713       50
       C00457325      310
       C00492785      300
                    ...
ZZ     C00580100     1490
       C00603084       95
       C00607861      750
       C00608380      125
       C00618371     2199
       C00630665     1000
       C00632133      600
       C00632398      400
       C00639500      208
       C00639591     1450
       C00640623     6402
       C00653816     1000
       C00666149     1000
       C00666453     2800
       C00683102     1000
       C00689430     3524
       C00693234    13283
       C00693713     1000
       C00694018     2750
       C00694455    12761
       C00695510     1045
       C00696245      250
       C00696419     3000
       C00696526      500
       C00696948    31296
       C00697441    34396
       C00698050      350
       C00698258     2800
       C00699090     5757
       C00700732      475
Name: Donation Amount ($), Length: 32662, dtype: int64


The data seems to be laid out the way I need it; I'm just not sure how to plot it.

Best answer

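You can unstack the 'Filer ID' level of the grouped Series into columns and then plot the resulting DataFrame as a stacked bar chart: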

df = donations_per_candidate_per_state.unstack('Filer ID')
df.plot(kind='bar', stacked=True)
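
With roughly 32,000 (State, Filer ID) pairs, unstacking every filer gives a very wide frame and an unreadable legend. One possible refinement, sketched below, keeps only the largest filers and lumps the rest into an "Other" column. The sample data, the to_wide helper, and the top_n parameter are all hypothetical and are not part of the original answer; fill_value=0 is used so that states with no donations to a filer show 0 instead of NaN.

```python
import pandas as pd

# Hypothetical sample standing in for the FEC file (all values are made up).
data = pd.DataFrame({
    "State": ["CA", "CA", "CA", "NY", "NY", "TX"],
    "Filer ID": ["C001", "C002", "C003", "C001", "C003", "C002"],
    "Donation Amount ($)": [100, 250, 50, 300, 75, 120],
})

# Same grouping as in the question: a Series indexed by (State, Filer ID).
grouped = data.groupby(["State", "Filer ID"])["Donation Amount ($)"].sum()


def to_wide(grouped, top_n=2):
    """Pivot to one row per state and one column per filer, keeping the top_n filers.

    Hypothetical helper: every remaining filer is summed into an 'Other' column.
    """
    wide = grouped.unstack("Filer ID", fill_value=0)
    top = wide.sum().nlargest(top_n).index      # filers with the largest totals
    result = wide[top].copy()
    result["Other"] = wide.drop(columns=top).sum(axis=1)
    return result


wide = to_wide(grouped, top_n=2)

# Each column becomes one segment of the stack for every state.
ax = wide.plot(kind="bar", stacked=True)
ax.set_xlabel("State")
ax.set_ylabel("Donation Amount ($)")
ax.set_title("Donations per State by Filer ID")
```

Call plt.show() (after import matplotlib.pyplot as plt) to display the chart. On the real data, a larger top_n such as 10 is probably a more sensible starting point.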

This question about a stacked bar chart of election donation data with groupby in Python comes from a similar question on Stack Overflow: https://stackoverflow.com/questions/59240217/
