How to identify a string change in a row compared to the previous row in a DataFrame?

Problem description

I have a pandas DataFrame:

import pandas as pd
inp = [{'Name': 'John', 'Year':2018, 'Address':'Beverly hills'}, {'Name': 'John', 'Year':2018, 'Address':'Beverly hills'}, {'Name': 'John', 'Year':2019, 'Address':'Beverly hills'}, {'Name': 'John', 'Year':2019, 'Address':'Orange county'}, {'Name': 'John', 'Year':2019, 'Address':'New York'}, {'Name': 'Steve', 'Year':2018, 'Address':'Canada'}, {'Name': 'Steve', 'Year':2019, 'Address':'Canada'}, {'Name': 'Steve', 'Year':2019, 'Address':'Canada'}, {'Name': 'Steve', 'Year':2020, 'Address':'California'}, {'Name': 'Steve', 'Year':2020, 'Address':'Canada'}]
df = pd.DataFrame(inp)
print(df)

If a row's string value changes compared to the previous row, I want to flag it in a separate column, "Cng-Address"; if the row's numeric value changes, I want to flag it in a "Cng-Year" column. If there is no change, the flag should be zero.

The index is "Name", meaning that the above calculations should be done over all rows associated with each person's name. When "Name" changes (e.g. John to Steve), the calculations for "Cng-Address" and "Cng-Year" should reset. The "Year" column is sorted in ascending order.

As a final report I want to get:

John changed year 1 time and changed location 2 times

Steve changed year 2 times and changed location 2 times

Total address changes in 2019: 2

Current output:

+-------+------+---------------+
| Name  | Year | Address       |
+-------+------+---------------+
| John  | 2018 | Beverly hills |
+-------+------+---------------+
| John  | 2018 | Beverly hills |
+-------+------+---------------+
| John  | 2019 | Beverly hills |
+-------+------+---------------+
| John  | 2019 | Orange county |
+-------+------+---------------+
| John  | 2019 | New York      |
+-------+------+---------------+
| Steve | 2018 | Canada        |
+-------+------+---------------+
| Steve | 2019 | Canada        |
+-------+------+---------------+
| Steve | 2019 | Canada        |
+-------+------+---------------+
| Steve | 2020 | California    |
+-------+------+---------------+
| Steve | 2020 | Canada        |
+-------+------+---------------+

Desired output:

+-------+------+---------------+----------+-------------+
| Name  | Year | Address       | Cng-Year | Cng-Address |
+-------+------+---------------+----------+-------------+
| John  | 2018 | Beverly hills | 0        | 0           |
+-------+------+---------------+----------+-------------+
| John  | 2018 | Beverly hills | 0        | 0           |
+-------+------+---------------+----------+-------------+
| John  | 2019 | Beverly hills | 1        | 0           |
+-------+------+---------------+----------+-------------+
| John  | 2019 | Orange county | 0        | 1           |
+-------+------+---------------+----------+-------------+
| John  | 2019 | New York      | 0        | 1           |
+-------+------+---------------+----------+-------------+
| Steve | 2018 | Canada        | 0        | 0           |
+-------+------+---------------+----------+-------------+
| Steve | 2019 | Canada        | 1        | 0           |
+-------+------+---------------+----------+-------------+
| Steve | 2019 | Canada        | 0        | 0           |
+-------+------+---------------+----------+-------------+
| Steve | 2020 | California    | 1        | 1           |
+-------+------+---------------+----------+-------------+
| Steve | 2020 | Canada        | 0        | 1           |
+-------+------+---------------+----------+-------------+

Recommended answer
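
One possible approach, offered as a minimal sketch rather than a definitive answer: shift the "Year" and "Address" columns by one row within each "Name" group and compare each row with its shifted counterpart. This assumes the rows are already ordered by "Name" and "Year" as in the question, so that the existing row order is the order of comparison. The helper names `prev`, `has_prev`, `per_name`, and `per_year` are introduced here for illustration only.

```python
import pandas as pd

inp = [{'Name': 'John', 'Year': 2018, 'Address': 'Beverly hills'},
       {'Name': 'John', 'Year': 2018, 'Address': 'Beverly hills'},
       {'Name': 'John', 'Year': 2019, 'Address': 'Beverly hills'},
       {'Name': 'John', 'Year': 2019, 'Address': 'Orange county'},
       {'Name': 'John', 'Year': 2019, 'Address': 'New York'},
       {'Name': 'Steve', 'Year': 2018, 'Address': 'Canada'},
       {'Name': 'Steve', 'Year': 2019, 'Address': 'Canada'},
       {'Name': 'Steve', 'Year': 2019, 'Address': 'Canada'},
       {'Name': 'Steve', 'Year': 2020, 'Address': 'California'},
       {'Name': 'Steve', 'Year': 2020, 'Address': 'Canada'}]
df = pd.DataFrame(inp)

# Previous row's values within each Name group.
# The first row of every group has no predecessor and gets a missing value.
prev = df.groupby('Name')[['Year', 'Address']].shift()

# True for every row except the first row of its Name group.
# Assumption: the first row of a group never counts as a change.
has_prev = df.groupby('Name').cumcount().gt(0)

# A change is a value that differs from the previous row of the same group.
df['Cng-Year'] = (df['Year'].ne(prev['Year']) & has_prev).astype(int)
df['Cng-Address'] = (df['Address'].ne(prev['Address']) & has_prev).astype(int)
print(df)

# Report: number of changes per person, and address changes per year.
per_name = df.groupby('Name')[['Cng-Year', 'Cng-Address']].sum()
per_year = df.groupby('Year')['Cng-Address'].sum()
print(per_name)
print(per_year)
```

Because the shift happens inside the `groupby`, the comparison resets automatically when "Name" changes, so Steve's first row is never compared with John's last row. If the data were not already sorted, sorting with `df.sort_values(['Name', 'Year'])` before this step would be one way to satisfy the ordering assumption.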