Problem description
I am asking this question for my own learning. As far as I know, the following are the different ways to remove columns from a pandas DataFrame.
Option 1:
import pandas as pd
df = pd.DataFrame({'a': [1, 2, 3, 4, 5], 'b': [6, 7, 8, 9, 10], 'c': [11, 12, 13, 14, 15]})
del df['a']
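Option 2:
df = pd.DataFrame({'a': [1, 2, 3, 4, 5], 'b': [6, 7, 8, 9, 10], 'c': [11, 12, 13, 14, 15]})
# Pass axis by keyword; recent pandas versions no longer accept it positionally.
df = df.drop('a', axis=1)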
Option 3:
df = pd.DataFrame({'a': [1, 2, 3, 4, 5], 'b': [6, 7, 8, 9, 10], 'c': [11, 12, 13, 14, 15]})
df = df[['b', 'c']]
- Which of these is the best approach?
- Are there any other ways to achieve the same result?
Recommended answer
According to the docs:
So I think we should stick with df.drop. Why? I think the advantages are:
- It gives us more control over the removal:
# This returns a NEW DataFrame object and leaves the original `df` untouched.
df.drop('a', axis=1)
# This modifies `df` in place and returns None.
df.drop('a', axis=1, inplace=True)
- It can handle more complicated cases through its arguments. For example, with level we can drop labels from a MultiIndex, and with errors='ignore' we can skip labels that do not exist instead of raising a KeyError.
- It's a more unified and object-oriented way.
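The points above about df.drop can be illustrated with a minimal sketch. The sample frames and the names dropped, ignored, multi, without_a and result are made up for illustration; only DataFrame.drop and its axis, inplace, errors and level arguments come from pandas itself.

```python
import pandas as pd

df = pd.DataFrame({'a': [1, 2, 3], 'b': [4, 5, 6], 'c': [7, 8, 9]})

# Without inplace, drop returns a new DataFrame and leaves df unchanged.
dropped = df.drop('a', axis=1)

# errors='ignore' skips labels that do not exist instead of raising KeyError.
ignored = df.drop(['a', 'missing'], axis=1, errors='ignore')

# With MultiIndex columns, level says which level the label refers to.
multi = pd.DataFrame(
    [[1, 2, 3], [4, 5, 6]],
    columns=pd.MultiIndex.from_tuples([('x', 'a'), ('x', 'b'), ('y', 'a')]),
)
without_a = multi.drop('a', axis=1, level=1)

# inplace=True modifies df itself and returns None.
result = df.drop('a', axis=1, inplace=True)
```

Because the in-place call returns None, writing df = df.drop('a', axis=1, inplace=True) would replace df with None, so it is probably safest to use either reassignment or inplace=True, not both.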
And just as @jezrael noted in his answer:
Option 1: Using the del keyword is a limited approach.
Option 3: df=df[['b','c']] isn't really a deletion at all. It first selects data by indexing with the [] syntax, then unbinds the name df from the original DataFrame and binds it to the new one (i.e. df[['b','c']]).
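The difference between options 1 and 3 can be seen by keeping a second name bound to the original object. This is a small sketch; the names original and cols_after_selection are introduced here purely for illustration.

```python
import pandas as pd

df = pd.DataFrame({'a': [1, 2], 'b': [3, 4], 'c': [5, 6]})
original = df  # a second name bound to the same DataFrame object

# Option 3: the selection builds a new DataFrame and df is rebound to it.
df = df[['b', 'c']]

# The original object was not modified and still has column 'a'.
cols_after_selection = list(original.columns)

# Option 1: del removes the column from the object itself.
del original['a']
```

After the selection, df and original refer to two different objects, which is why option 3 is a rebinding of the name rather than a deletion on the existing frame.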