I have a fruit dataset with the columns name, colour, weight, size, and seeds.

         Fruit dataset

         Name     Colour    Weight  Size   Seeds   Unnamed

         Apple    Apple     Red     10.0   Big     Yes

         Apple    Apple     Red     5.0    Small   Yes

         Pear     Pear      Green   11.0   Big     Yes

         Banana   Banana    Yellow  4.0    Small   Yes

         Orange   Orange    Orange  5.0    Small   Yes

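The problem is that the Colour column is a duplicate of the Name column, so every value is shifted one column to the right. This creates a useless column (Unnamed) that holds the values belonging to the Seeds column. Is there a simple way to remove the duplicated values in Colour and shift the remaining column values, starting from Weight, one column to the left? I hope I haven't confused anyone here.

Desired result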
         Fruit dataset

         Name     Colour  Weight Size    Seeds   Unnamed(will be dropped)

         Apple    Red     10.0   Big     Yes

         Apple    Red     5.0    Small   Yes

         Pear     Green   11.0   Big     Yes

         Banana   Yellow  4.0    Small   Yes

         Orange   Orange  5.0    Small   Yes

Best answer

You can do it like this:

In [23]: df
Out[23]:
     Name  Colour  Weight  Size  Seeds Unnamed
0   Apple   Apple     Red  10.0    Big     Yes
1   Apple   Apple     Red   5.0  Small     Yes
2    Pear    Pear   Green  11.0    Big     Yes
3  Banana  Banana  Yellow   4.0  Small     Yes
4  Orange  Orange  Orange   5.0  Small     Yes

In [24]: cols = df.columns[:-1]

In [25]: cols
Out[25]: Index(['Name', 'Colour', 'Weight', 'Size', 'Seeds'], dtype='object')

In [26]: df = df.drop('Colour', axis=1)

In [27]: df.columns = cols

In [28]: df
Out[28]:
     Name  Colour  Weight   Size Seeds
0   Apple     Red    10.0    Big   Yes
1   Apple     Red     5.0  Small   Yes
2    Pear   Green    11.0    Big   Yes
3  Banana  Yellow     4.0  Small   Yes
4  Orange  Orange     5.0  Small   Yes
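
For reference, the same steps can be written as a standalone script. This is only a sketch: the helper name `realign` and its `duplicate` parameter are made up for illustration, and it assumes, as in the question, that the surplus column is the last one and that the duplicated column is known by name.

```python
import pandas as pd

# Rebuild the misaligned frame from the question.
df = pd.DataFrame(
    [
        ["Apple", "Apple", "Red", 10.0, "Big", "Yes"],
        ["Apple", "Apple", "Red", 5.0, "Small", "Yes"],
        ["Pear", "Pear", "Green", 11.0, "Big", "Yes"],
        ["Banana", "Banana", "Yellow", 4.0, "Small", "Yes"],
        ["Orange", "Orange", "Orange", 5.0, "Small", "Yes"],
    ],
    columns=["Name", "Colour", "Weight", "Size", "Seeds", "Unnamed"],
)


def realign(frame, duplicate="Colour"):
    """Hypothetical helper: drop the duplicated column, then reuse the
    original labels (minus the last one) for the remaining columns."""
    # Assumption: the surplus label is the last one ("Unnamed").
    new_names = frame.columns[:-1]
    # drop() returns a new frame, so the input is left untouched.
    fixed = frame.drop(columns=duplicate)
    # The remaining values keep their positions and only the labels move.
    fixed.columns = new_names
    return fixed


fixed = realign(df)
print(fixed)
```

Because the values themselves are never moved, each column keeps its dtype: the numbers that were labelled Size are still floats once they are relabelled Weight.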
