python - Setting a row index on multi-index columns and querying a pandas DataFrame

Starting from a pandas DataFrame with a multi-level column header structure like the one built below, is there a way to transform the Area Names and Area Codes headings so that they span every level (i.e. a single Area Names label and a single Area Codes label spanning the multiple column header rows)? If so, how could I then run a query on that column to return only the rows that correspond to a particular value (e.g. an area code of E06000047), or the Low and Very High values for ENGLAND in 2012/13?

I wonder whether it would be easier to define a row index based on the area code, the area name, or a two-column row index ['*Area Code*', '*Area Names*']. If so, how can I do that from the current table? set_index does not seem to like working with the current structure.

Code to create the frame above:

    import pandas as pd

    df = pd.DataFrame({
        ('2011/12*', 'High', '7-8'): {3: 49.83, 5: 50.01, 7: 48.09, 8: 43.58, 9: 44.19},
        ('2011/12*', 'Low', '0-4'): {3: 6.51, 5: 6.53, 7: 6.49, 8: 6.41, 9: 6.12},
        ('2011/12*', 'Medium', '5-6'): {3: 17.44, 5: 17.59, 7: 18.11, 8: 19.23, 9: 20.01},
        ('2011/12*', 'Very High', '9-10'): {3: 26.22, 5: 25.87, 7: 27.32, 8: 30.78, 9: 29.68},
        ('2012/13*', 'High', '7-8'): {3: 51.16, 5: 51.35, 7: 48.47, 8: 44.67, 9: 49.39},
        ('2012/13*', 'Low', '0-4'): {3: 5.71, 5: 5.74, 7: 6.73, 8: 8.42, 9: 6.51},
        ('2012/13*', 'Medium', '5-6'): {3: 17.1, 5: 17.29, 7: 18.46, 8: 20.23, 9: 15.81},
        ('2012/13*', 'Very High', '9-10'): {3: 26.03, 5: 25.62, 7: 26.34, 8: 26.68, 9: 28.3},
        ('Area Codes', 'Area Codes', 'Area Codes'): {3: 'K02000001', 5: 'E92000001', 7: 'E12000001', 8: 'E06000047', 9: 'E06000005'},
        ('Area Names', 'Area Names', 'Area Names'): {3: 'UNITED KINGDOM', 5: 'ENGLAND', 7: 'NORTH EAST', 8: 'County Durham', 9: 'Darlington'},
    })

Best answer

I think you need set_index with a list of tuples, one tuple per column that should become a level of the row MultiIndex:

    df.set_index([('Area Codes','Area Codes','Area Codes'), ('Area Names','Area Names','Area Names')], inplace=True)
    df.index.names = ['Area Codes','Area Names']
    print (df)

                              2011/12*                        2012/13*        \
                                  High   Low Medium Very High     High   Low
                                   7-8   0-4    5-6      9-10      7-8   0-4
    Area Codes Area Names
    K02000001  UNITED KINGDOM     49.83  6.51  17.44     26.22    51.16  5.71
    E92000001  ENGLAND            50.01  6.53  17.59     25.87    51.35  5.74
    E12000001  NORTH EAST         48.09  6.49  18.11     27.32    48.47  6.73
    E06000047  County Durham      43.58  6.41  19.23     30.78    44.67  8.42
    E06000005  Darlington         44.19  6.12  20.01     29.68    49.39  6.51

                              Medium Very High
                                 5-6      9-10
    Area Codes Area Names
    K02000001  UNITED KINGDOM   17.10     26.03
    E92000001  ENGLAND          17.29     25.62
    E12000001  NORTH EAST       18.46     26.34
    E06000047  County Durham    20.23     26.68
    E06000005  Darlington       15.81     28.30

Then sort_index is needed, because slicing the unsorted index raises:

    KeyError: 'MultiIndex Slicing requires the index to be fully lexsorted tuple len (2), lexsort depth (0)'

    df.sort_index(inplace=True)

Finally, select with slicers:

    idx = pd.IndexSlice
    print (df.loc[idx['E06000047',:], :])

                             2011/12*                        2012/13*        \
                                 High   Low Medium Very High     High   Low
                                  7-8   0-4    5-6      9-10      7-8   0-4
    Area Codes Area Names
    E06000047  County Durham     43.58  6.41  19.23     30.78    44.67  8.42

                             Medium Very High
                                5-6      9-10
    Area Codes Area Names
    E06000047  County Durham   20.23     26.68

    print (df.loc[idx[:,'ENGLAND'], idx['2012/13*',['Low','Very High']]])

                           2012/13*
                                Low Very High
                                0-4      9-10
    Area Codes Area Names
    E92000001  ENGLAND         5.74     25.62

Regarding "python - Setting a row index on multi-index columns and querying a pandas DataFrame", a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/39745627/
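The steps in the answer (set_index with tuples, rename the index levels, sort, then slice) can also be written without inplace=True. The following is only a minimal sketch under stated assumptions: the two-row frame is a cut-down, hypothetical subset of the question's data, and the names code_col, name_col, indexed, durham and england are introduced here for illustration and do not appear in the original post.

```python
import pandas as pd

# Hypothetical cut-down frame: same three-level column layout as the
# question, but only two rows and four value columns.
df = pd.DataFrame({
    ('2011/12*', 'Low', '0-4'): {5: 6.53, 8: 6.41},
    ('2011/12*', 'Very High', '9-10'): {5: 25.87, 8: 30.78},
    ('2012/13*', 'Low', '0-4'): {5: 5.74, 8: 8.42},
    ('2012/13*', 'Very High', '9-10'): {5: 25.62, 8: 26.68},
    ('Area Codes', 'Area Codes', 'Area Codes'): {5: 'E92000001', 8: 'E06000047'},
    ('Area Names', 'Area Names', 'Area Names'): {5: 'ENGLAND', 8: 'County Durham'},
})

# Each column label is a full tuple, so the keys passed to set_index
# must be full tuples too.
code_col = ('Area Codes', 'Area Codes', 'Area Codes')
name_col = ('Area Names', 'Area Names', 'Area Names')

indexed = df.set_index([code_col, name_col])
indexed.index.names = ['Area Codes', 'Area Names']

# Sort the row index before slicing with slice objects.
indexed = indexed.sort_index()

idx = pd.IndexSlice

# All columns for one area code.
durham = indexed.loc[idx['E06000047', :], :]

# Low and Very High for ENGLAND in 2012/13 only.
england = indexed.loc[idx[:, 'ENGLAND'], idx['2012/13*', ['Low', 'Very High']]]

print(durham)
print(england)
```

Assigning the result of set_index and sort_index to a new name leaves the original df untouched, which can make it easier to retry a step when experimenting.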
