Problem description
I'm selecting several columns of a DataFrame using a list of column names. This works fine if all elements of the list are in the DataFrame, but if some elements are missing, it raises the error "not in index".
Is there a way to select all columns that are included in that list, even if not all elements of the list are present in the DataFrame? Here is some sample data that generates the above error:
import pandas as pd

df = pd.DataFrame([[0, 1, 2]], columns=list('ABC'))
lst = list('ARB')
data = df[lst]  # raises KeyError: not in index
Recommended answer
I think you need Index.intersection:
df = pd.DataFrame({'A': [1, 2, 3],
                   'B': [4, 5, 6],
                   'C': [7, 8, 9],
                   'D': [1, 3, 5],
                   'E': [5, 3, 6],
                   'F': [7, 4, 3]})
print(df)
   A  B  C  D  E  F
0  1  4  7  1  5  7
1  2  5  8  3  3  4
2  3  6  9  5  6  3
lst = ['A', 'R', 'B']
print(df.columns.intersection(lst))
Index(['A', 'B'], dtype='object')
data = df[df.columns.intersection(lst)]
print(data)
   A  B
0  1  4
1  2  5
2  3  6
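If the selected columns should follow the order of the list rather than whatever order the intersection returns, a plain list comprehension is one option. This is only a minimal sketch: the smaller frame and the name `present` are made up for illustration and are not part of the original answer.

```python
import pandas as pd

# Hypothetical small frame for illustration.
df = pd.DataFrame({'A': [1, 2, 3], 'B': [4, 5, 6], 'C': [7, 8, 9]})
lst = ['B', 'R', 'A']  # 'R' is not a column of df

# Keep only the requested names that exist in df, in the order given by lst.
present = [col for col in lst if col in df.columns]
data = df[present]
print(list(data.columns))
```

Here the missing label 'R' is silently skipped, and the result keeps 'B' before 'A' as in the list.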
Another solution with numpy.intersect1d:
import numpy as np

data = df[np.intersect1d(df.columns, lst)]
print(data)
   A  B
0  1  4
1  2  5
2  3  6
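One thing to keep in mind with numpy.intersect1d is that it returns the sorted, unique common values, so the resulting column order is alphabetical rather than the order of the DataFrame or of the list. A small sketch, using a made-up frame whose columns are deliberately not in sorted order:

```python
import numpy as np
import pandas as pd

# Hypothetical frame with columns in non-alphabetical order.
df = pd.DataFrame([[0, 1, 2]], columns=list('CBA'))
lst = ['C', 'R', 'A']  # 'R' is not a column of df

# intersect1d returns the sorted, unique values found in both inputs.
common = np.intersect1d(df.columns, lst)
data = df[common]
print(list(data.columns))
```

The selected columns come back as 'A' then 'C', even though both the frame and the list have 'C' first.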