


Fruits         Person        Eat

Banana         Peter         Yes
Banana         Ashley        Yes
Strawberry     Peter         No
Strawberry     Ashley        Yes
Cherry         Peter         Yes
Orange         Peter         No
Orange         Ashley        No
Grape          Ashley        Yes
Pear           Ashley        Yes
Pear           Peter         Yes


There are duplicate fruits in my data frame. I need to delete the duplicates based on the following logic. If there is a duplicate fruit and Peter and Ashley both eat it, then Peter's row is kept and Ashley's row is deleted. If there is a duplicate fruit and Peter doesn't eat it and Ashley eats it, then Peter's row is deleted and Ashley's row remains. If there is a duplicate fruit and Peter doesn't eat it and Ashley doesn't eat it, then both rows are deleted.
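The three rules can be condensed into a small decision function, sketched below in plain Python. The name `row_to_keep` is hypothetical. The rules do not say what happens when Peter eats a fruit and Ashley does not; the sketch assumes Peter's row is kept in that case.

```python
from typing import Optional


def row_to_keep(peter_eats: bool, ashley_eats: bool) -> Optional[str]:
    """Return whose row survives for a fruit that is listed for both people."""
    if peter_eats:
        # Rule 1 (both eat) keeps Peter. Peter-only is an assumption,
        # since the rules above do not cover it.
        return "Peter"
    if ashley_eats:
        # Rule 2: Peter does not eat it, Ashley does.
        return "Ashley"
    # Rule 3: neither eats it, so both rows are deleted.
    return None


print(row_to_keep(True, True))    # Banana, Pear
print(row_to_keep(False, True))   # Strawberry
print(row_to_keep(False, False))  # Orange
```

Fruits that appear only once (Cherry, Grape) are not duplicates, so the function does not apply to them.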


With this logic, the data frame should look like this:

Fruits         Person        Eat

Banana         Peter         Yes
Strawberry     Ashley        Yes
Cherry         Peter         Yes
Grape          Ashley        Yes
Pear           Peter         Yes


I'm not sure how to iterate through a pandas data frame with these conditions to delete duplicates. Generally, for the first condition I would do something like this:

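data = [
    {"fruit": "Apple", "person": "Ashley", "eats": True},
    {"fruit": "Apple", "person": "Peter", "eats": True},
]
eats = dict()       # person -> {fruit: row number}, only for fruits they eat
to_delete = set()   # row numbers to drop afterwards

for i, row in enumerate(data):
    fruit = row["fruit"]
    person = row["person"]
    does_eat = row["eats"]

    # make sure the person has an entry
    if person not in eats:
        eats[person] = dict()

    # if person does eat, record row number for later deletion if needed
    if does_eat:
        eats[person][fruit] = i

    # dedup (first condition only): both eat the fruit, so Ashley's row goes
    if person == "Peter" and fruit in eats["Peter"] and fruit in eats.get("Ashley", {}):
        to_delete.add(eats["Ashley"][fruit])
    elif person == "Ashley" and does_eat and fruit in eats.get("Peter", {}):
        to_delete.add(i)

data = [row for i, row in enumerate(data) if i not in to_delete]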


Any help/tips on how to do this with my data frame would be very appreciated.



df1 = (df[df.Eat.eq('Yes')].sort_values('Person')
                           .drop_duplicates(subset='Fruits', keep='last'))

       Fruits  Person  Eat
3  Strawberry  Ashley  Yes
7       Grape  Ashley  Yes
0      Banana   Peter  Yes
4      Cherry   Peter  Yes
9        Pear   Peter  Yes
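
To reproduce the result above end to end, here is a self-contained sketch that rebuilds the sample table from the question and applies the same filter, sort, and `drop_duplicates` chain. It works because only rows with `Eat == 'Yes'` can ever survive, and sorting by `Person` puts Ashley before Peter alphabetically, so `keep='last'` prefers Peter's row whenever both eat the fruit. The trailing `sort_index()` is an addition, not part of the snippet above; it only restores the original row order.

```python
import pandas as pd

# Sample data copied from the table in the question.
df = pd.DataFrame({
    "Fruits": ["Banana", "Banana", "Strawberry", "Strawberry", "Cherry",
               "Orange", "Orange", "Grape", "Pear", "Pear"],
    "Person": ["Peter", "Ashley", "Peter", "Ashley", "Peter",
               "Peter", "Ashley", "Ashley", "Ashley", "Peter"],
    "Eat": ["Yes", "Yes", "No", "Yes", "Yes",
            "No", "No", "Yes", "Yes", "Yes"],
})

df1 = (
    df[df["Eat"].eq("Yes")]           # rows with Eat == 'No' never survive
    .sort_values("Person")            # Ashley sorts before Peter
    .drop_duplicates(subset="Fruits", keep="last")  # so Peter wins ties
    .sort_index()                     # back to the original row order
)
print(df1)
```

Note that this relies on the two names sorting in the right order. With other names, an explicit priority column (or an ordered categorical) would be a safer sort key.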

