python - Dropping duplicates in Pandas based on several attributes

I want to exclude rows that share the same title and year.

     title      votes  ranking  year
0    Wonderland  19      7.9    1931
1    Wonderland  120     7.1    1997
2    Wonderland  3524    7.2    1999
3    Wonderland  18169   6.6    2003
4    Wonderland  17      8.7    2010
5    Wonderland  6       8.5    2012
6    Wonderland  8       7.4    2012

In this example, I would drop only row 5 or row 6.

Best answer

You can use drop_duplicates() with the subset= parameter. If your DataFrame is named df, you can do the following:

In [13]: df.drop_duplicates(subset=['title', 'year'])

which returns:

Out[13]:
        title  votes  ranking  year
0  Wonderland     19      7.9  1931
1  Wonderland    120      7.1  1997
2  Wonderland   3524      7.2  1999
3  Wonderland  18169      6.6  2003
4  Wonderland     17      8.7  2010
5  Wonderland      6      8.5  2012

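Note that drop_duplicates() keeps the first occurrence by default, so the row at index 6 is dropped and its votes and ranking values are lost.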
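If it matters which of the duplicate rows survives, the keep= parameter of drop_duplicates() and a prior sort give some control over that. The following is a minimal sketch, not part of the original answer: the DataFrame is rebuilt from the data in the question, and the variable names first, last and most_voted are made up for illustration. Keeping the row with the most votes is only one possible rule, assumed here as an example.

```python
import pandas as pd

# Data copied from the question.
df = pd.DataFrame({
    "title": ["Wonderland"] * 7,
    "votes": [19, 120, 3524, 18169, 17, 6, 8],
    "ranking": [7.9, 7.1, 7.2, 6.6, 8.7, 8.5, 7.4],
    "year": [1931, 1997, 1999, 2003, 2010, 2012, 2012],
})

# Default behaviour (keep="first"): index 5 survives, index 6 is dropped.
first = df.drop_duplicates(subset=["title", "year"])

# keep="last": index 6 survives, index 5 is dropped.
last = df.drop_duplicates(subset=["title", "year"], keep="last")

# Assumed rule for illustration: keep the row with the most votes per
# (title, year) pair. Sort by votes descending so that row comes first,
# drop the duplicates, then restore the original row order.
most_voted = (
    df.sort_values("votes", ascending=False)
      .drop_duplicates(subset=["title", "year"])
      .sort_index()
)

print(most_voted)
```

None of these calls modifies df itself; each returns a new DataFrame, so the result has to be assigned if you want to keep it.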

Regarding "python - Dropping duplicates in Pandas based on several attributes", a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/32342692/