python - How to iterate over a Pandas DataFrame and modify values conditionally?

I have this pandas dataframe:

import pandas as pd

df = pd.DataFrame(
    {
    "col1": [1,1,2,3,3,3,4,5,5,5,5]
    }
)
df


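I want to add another column that shows "last" whenever the value in col1 is not equal to the value of col1 in the next row. It should look like this: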

So far, I can create a column that contains True if the value in col1 is not equal to the value of col1 in the next row, and False otherwise:

df["last_row"] = df["col1"].shift(-1)
df['last'] = df["col1"] != df["last_row"]
df = df.drop(["last_row"], axis=1)
df

Now something like

df["last_row"] = df["col1"].shift(-1)
df['last'] = "last" if df["col1"] != df["last_row"]
df = df.drop(["last_row"], axis=1)
df

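would be nice, but this is obviously incorrect syntax. How can I achieve this?
Ultimately, I also want to add numbers that indicate how many times a value has appeared before, while the last value is always marked with "last". It should look like this: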

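I am not sure whether this is just another step in what I have so far or whether it requires a new approach. I read that if I want to iterate over an array while modifying values, I should use apply(). However, I don't know how to include conditions in it. Can you help me?
Thanks!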

Best answer

Assuming the index is increasing, (1) take the cumcount of each group, then (2) take the max index within each group and set the string there

group = df.groupby('col1')

df['last'] = group.cumcount()
df.loc[group['last'].idxmax(), 'last'] = 'last'
#or df.loc[group.apply(lambda x: x.index.max()), 'last'] = 'last'


    col1  last
0      1     0
1      1  last
2      2  last
3      3     0
4      3     1
5      3  last
6      4  last
7      5     0
8      5     1
9      5     2
10     5  last
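
For reference, here is a minimal self-contained sketch that covers both goals from the question without iterating over rows. It assumes equal values in col1 sit next to each other, as in the sample data. The column name count_or_last is made up for illustration. np.where handles the "last if condition" step that the question tried to write inline, and Series.mask overwrites the running count on the final row of each run. Casting the counts to object first is one way to let a single column hold both integers and a string.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({"col1": [1, 1, 2, 3, 3, 3, 4, 5, 5, 5, 5]})

# True where the next row holds a different value.
# The final row is compared against NaN, so it is also True.
is_last = df["col1"] != df["col1"].shift(-1)

# First goal: "last" on the final row of each run, empty string elsewhere.
df["last"] = np.where(is_last, "last", "")

# Second goal: running count within each group, final row relabelled.
# count_or_last is a hypothetical column name used only for this sketch.
counts = df.groupby("col1").cumcount().astype(object)
df["count_or_last"] = counts.mask(is_last, "last")

print(df)
```

If the same value can reappear later in a separate run, groupby("col1") would keep counting across runs, so the grouping key would need to change for that case.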

Regarding python - How to iterate over a Pandas DataFrame and modify values conditionally?, we found a similar question on Stack Overflow: https://stackoverflow.com/questions/55870877/