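A quick illustration of what I'm trying to do: given a set of payroll data with the columns regular, over_time, double_time and lunch_break, I want to subtract the lunch_break column from the other time columns, working through them in a specified order until the lunch break time is used up. For example, the lunch_break minutes should come out of regular first, then over_time, then double_time. So, given the following dataset: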
import pandas as pd
payroll = [
{'regular': 120, 'over_time': 60, 'double_time': 0, 'lunch_break': 30},
{'regular': 15, 'over_time': 60, 'double_time': 30, 'lunch_break': 45},
{'regular': 15, 'over_time': 15, 'double_time': 120, 'lunch_break': 45},
{'regular': 0, 'over_time': 120, 'double_time': 120, 'lunch_break': 30}
]
payroll_df = pd.DataFrame(payroll)
I need the following result:
result = [
{'regular': 90, 'over_time': 60, 'double_time': 0}, # 30 from reg
{'regular': 0, 'over_time': 30, 'double_time': 30}, # 15 from reg, 30 from ovr
{'regular': 0, 'over_time': 0, 'double_time': 105}, # 15 from reg, 15 from ovr, 15 from dbl
{'regular': 0, 'over_time': 90, 'double_time': 120}, # 0 from reg, 30 from ovr
]
result_df = pd.DataFrame(result)
Is there a good way to do this with pandas?
Best answer
A vectorized version:
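df = payroll_df.copy()
# Deduct the whole lunch break from regular; a negative value is the part still owed.
df['regular'] = df.regular - df['lunch_break']
# Carry any shortfall in regular over to over_time.
df.loc[df.regular < 0, 'over_time'] += df[df.regular < 0].regular
# Carry any shortfall in over_time over to double_time.
df.loc[df.over_time < 0, 'double_time'] += df[df.over_time < 0].over_time
# Clamp the leftover negative values to zero.
df[df < 0] = 0
print(df.drop(columns='lunch_break'))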
   regular  over_time  double_time
0       90         60            0
1        0         30           30
2        0          0          105
3        0         90          120
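If the priority list grows beyond three columns, chaining one `.loc` update per column gets repetitive. The following is a minimal sketch of an alternative, not part of the original answer: it uses a cumulative sum across the columns to work out how much lunch is still owed when each column is reached. The names `priority`, `hours`, `before`, `remaining` and `deduction` are illustrative assumptions.

```python
import pandas as pd

payroll_df = pd.DataFrame([
    {'regular': 120, 'over_time': 60, 'double_time': 0, 'lunch_break': 30},
    {'regular': 15, 'over_time': 60, 'double_time': 30, 'lunch_break': 45},
    {'regular': 15, 'over_time': 15, 'double_time': 120, 'lunch_break': 45},
    {'regular': 0, 'over_time': 120, 'double_time': 120, 'lunch_break': 30},
])

# Order in which lunch minutes are deducted (hypothetical helper name).
priority = ['regular', 'over_time', 'double_time']
hours = payroll_df[priority]

# Minutes held by the higher-priority columns to the left of each column.
before = hours.cumsum(axis=1) - hours

# Lunch still owed on reaching each column: lunch_break - before, floored at 0.
remaining = before.rsub(payroll_df['lunch_break'], axis=0).clip(lower=0)

# A column can give up at most what it holds: element-wise min(remaining, hours).
deduction = remaining.where(remaining < hours, hours)

result = hours - deduction
print(result)
```

On the sample data this gives the same values as the desired `result_df` in the question. As with the answer above, any lunch time exceeding the row's total is simply dropped.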
Regarding "python - Pandas - subtract columns by priority", a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/59780940/