I have the following pandas DataFrame:
import pandas as pd
data = {"first_name": ["Alexander", "Alan", "Heather", "Marion", "Amy", "John"],
"last_name": ["Miller", "Jacobson", ".", "Milner", "Cooze", "Smith"],
"age": [42, 52, 36, 24, 73, 19],
"marriage_status" : [0, 0, 1, 1, 0, 1]}
df = pd.DataFrame(data)
df
age first_name last_name marriage_status
0 42 Alexander Miller 0
1 52 Alan Jacobson 0
2 36 Heather . 1
3 24 Marion Milner 1
4 73 Amy Cooze 0
5 19 John Smith 1
....
The column marriage_status holds binary data, 0 and 1. Before each 1, I would also like to set the preceding row to 1. In this example, the DataFrame would become:
age first_name last_name marriage_status
0 42 Alexander Miller 0
1 52 Alan Jacobson 1 # this changed to 1
2 36 Heather . 1
3 24 Marion Milner 1
4 73 Amy Cooze 1 # this changed to 1
5 19 John Smith 1
....
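In other words, this column contains "groups" of consecutive 1s, and I want the row just before each group to be 1 instead of 0. How can I do this?
My idea was to write a for loop, but that is not a pandas-based solution. One could also try enumerate(), but I would then need to set the preceding value to 1, and I am not sure how that would work.
Best answer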
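We can use the or operator | to combine each row with the row below it. When a row holds a 0 and the next row holds a 1, 0 | 1 evaluates to 1, so the row before each 1 becomes 1. A row stays 0 only when both it and the next row are 0.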
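df.marriage_status = (
    # shift(-1) aligns each row with the value of the next row;
    # fill_value=0 pads the last row, which a plain shift would leave as NaN
    df.marriage_status | df.marriage_status.shift(-1, fill_value=0)
).astype(int)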
df
age first_name last_name marriage_status
0 42 Alexander Miller 0
1 52 Alan Jacobson 1
2 36 Heather . 1
3 24 Marion Milner 1
4 73 Amy Cooze 1
5 19 John Smith 1
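To see why the shifted OR gives the requested result, here is a minimal self-contained sketch that compares it against the plain loop mentioned in the question. The function names mark_row_before_ones and mark_row_before_ones_loop are made up for this illustration, and the sketch assumes pandas 0.24 or later so that shift accepts fill_value.

```python
import pandas as pd


def mark_row_before_ones(flags: pd.Series) -> pd.Series:
    """Return a copy in which each 0 directly followed by a 1 becomes 1.

    Hypothetical helper name; assumes `flags` holds integer 0/1 values.
    """
    # shift(-1) moves every value up one row, so row i sees the value of row i + 1.
    # fill_value=0 pads the last row and keeps the integer dtype.
    next_row = flags.shift(-1, fill_value=0)
    # Bitwise OR on 0/1 integers: 1 if this row or the next row is 1.
    return flags | next_row


def mark_row_before_ones_loop(flags: pd.Series) -> pd.Series:
    """Plain-loop reference, used only to cross-check the vectorized version."""
    values = flags.tolist()
    result = list(values)
    for i in range(len(values) - 1):
        if values[i + 1] == 1:
            result[i] = 1
    return pd.Series(result, index=flags.index)


status = pd.Series([0, 0, 1, 1, 0, 1], name="marriage_status")
vectorized = mark_row_before_ones(status)
looped = mark_row_before_ones_loop(status)

print(vectorized.tolist())  # [0, 1, 1, 1, 1, 1]
print(looped.tolist() == vectorized.tolist())  # True
```

Both versions read from the original values rather than the updated ones, so only the single row directly before each 1 changes, and a 1 does not propagate further up the column.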