Problem description
I'm selecting several columns of a DataFrame by a list of column names. This works fine if all elements of the list are in the DataFrame. But if some elements of the list are not in the DataFrame, it raises the error "not in index".
Is there a way to select all columns that are included in that list, even if not all elements of the list are included in the DataFrame? Here is some sample data that generates the above error:
import pandas as pd

df = pd.DataFrame([[0, 1, 2]], columns=list('ABC'))
lst = list('ARB')
data = df[lst]  # error: not in index ('R' is not a column of df)
Recommended answer
I think you need Index.intersection:
df = pd.DataFrame({'A': [1, 2, 3],
                   'B': [4, 5, 6],
                   'C': [7, 8, 9],
                   'D': [1, 3, 5],
                   'E': [5, 3, 6],
                   'F': [7, 4, 3]})
print(df)
   A  B  C  D  E  F
0  1  4  7  1  5  7
1  2  5  8  3  3  4
2  3  6  9  5  6  3
lst = ['A', 'R', 'B']
print(df.columns.intersection(lst))
Index(['A', 'B'], dtype='object')
data = df[df.columns.intersection(lst)]
print(data)
   A  B
0  1  4
1  2  5
2  3  6
Another solution uses numpy.intersect1d, which returns the common names as a sorted array:
import numpy as np

data = df[np.intersect1d(df.columns, lst)]
print(data)
   A  B
0  1  4
1  2  5
2  3  6
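If the order of the requested names matters, or you want to know which names were skipped, a plain list comprehension is another option. The following is only a minimal sketch and not part of the original answer. The names `present` and `missing` are illustrative, and the list deliberately asks for 'B' before 'A' to show that the order of `lst` is kept.

```python
import pandas as pd

df = pd.DataFrame({'A': [1, 2, 3], 'B': [4, 5, 6], 'C': [7, 8, 9]})
lst = ['B', 'R', 'A']  # 'R' is not a column of df

# Keep only the requested names that exist, in the order they appear in lst.
present = [col for col in lst if col in df.columns]
# Names that were requested but are absent from df.
missing = [col for col in lst if col not in df.columns]

data = df[present]
print(data)
print("skipped:", missing)
```

Here `data` has the columns B and A, in that order, and `missing` contains only 'R'.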