How to merge two consecutive rows and form a new column


Problem description


I have a DataFrame (collected from accounting software) that looks like this:


    Serial || Date || Particulars || Price
    --------------------------------------
      1    || 0308 || Andrew      || 100
      2    || NaN  || Gloves      || NaN
      3    || 0408 || Johnson     || 50
      4    || NaN  || Wicket      || NaN

I want to merge each pair of consecutive rows and create a new column, 'Product', holding the second row's 'Particulars' value. The expected output should look like this:

    Serial || Date || Particulars || Price || Product
    -------------------------------------------------
      1    || 0308 || Andrew      || 100   || Gloves
      3    || 0408 || Johnson     || 50    || Wicket

How do I achieve this with pandas?

Solution

These answers assume the dataframe always consists of pairs of rows that follow the pattern shown by the OP: the first row of each pair holds a person, and the second row holds a product, with NaN in its Date and Price columns.

Use shift then dropna

    df.assign(Product=df.Particulars.shift(-1)).dropna()

       Serial   Date Particulars  Price Product
    0       1  308.0      Andrew  100.0  Gloves
    2       3  408.0     Johnson   50.0  Wicket
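For anyone who wants to run this end to end, here is a minimal sketch. The sample frame is reconstructed from the question, and Date is assumed to be numeric (which is why the output above shows 308.0 rather than 0308).

```python
import numpy as np
import pandas as pd

# Sample data reconstructed from the question (Date assumed numeric).
df = pd.DataFrame({
    "Serial": [1, 2, 3, 4],
    "Date": [308, np.nan, 408, np.nan],
    "Particulars": ["Andrew", "Gloves", "Johnson", "Wicket"],
    "Price": [100, np.nan, 50, np.nan],
})

# Copy each row's successor into a new Product column, then drop the
# product rows, which are the only rows that still contain NaN.
result = df.assign(Product=df.Particulars.shift(-1)).dropna()
print(result)
```

Note that assign returns a new frame, so df itself is left unchanged.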


join

Exactly the same idea, using join instead of assign

    df.join(df.Particulars.shift(-1).rename('Product')).dropna()
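As a quick check that the join version and the assign version really agree, the sketch below builds both and compares them. The sample frame is reconstructed from the question, with Date assumed to be numeric.

```python
import numpy as np
import pandas as pd

# Sample data reconstructed from the question (Date assumed numeric).
df = pd.DataFrame({
    "Serial": [1, 2, 3, 4],
    "Date": [308, np.nan, 408, np.nan],
    "Particulars": ["Andrew", "Gloves", "Johnson", "Wicket"],
    "Price": [100, np.nan, 50, np.nan],
})

# The shifted Series must be renamed, because join uses the Series name
# as the new column name.
shifted = df.Particulars.shift(-1).rename("Product")

via_join = df.join(shifted).dropna()
via_assign = df.assign(Product=df.Particulars.shift(-1)).dropna()

print(via_join.equals(via_assign))
```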


Details

Per Request

  • df.Particulars.shift(-1) moves every value in the Particulars column up one row, so each row receives the value from the row below it and the last row becomes NaN

    0     Gloves
    1    Johnson
    2     Wicket
    3        NaN
    Name: Particulars, dtype: object
    

  • When I assign this to the existing dataframe with df.assign(Product=df.Particulars.shift(-1)), it adds a new column named 'Product' whose values are the shifted Particulars.

       Serial   Date Particulars  Price  Product
    0       1  308.0      Andrew  100.0   Gloves
    1       2    NaN      Gloves    NaN  Johnson
    2       3  408.0     Johnson   50.0   Wicket
    3       4    NaN      Wicket    NaN      NaN
    

  • All that's left is to drop the rows with NaN values, and we have the result presented above.
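One caveat about the last step: a bare dropna() removes any row containing a NaN in any column, so a person row with a missing Date would be dropped too. A possible variation, not part of the original answer, is to restrict the check to the Price column with the subset parameter. The missing date for Johnson below is a hypothetical value added purely for illustration.

```python
import numpy as np
import pandas as pd

# Hypothetical variant of the question's data: Johnson's Date is missing.
df = pd.DataFrame({
    "Serial": [1, 2, 3, 4],
    "Date": [308, np.nan, np.nan, np.nan],
    "Particulars": ["Andrew", "Gloves", "Johnson", "Wicket"],
    "Price": [100, np.nan, 50, np.nan],
})

shifted = df.assign(Product=df.Particulars.shift(-1))

# A bare dropna() also discards Johnson, because his Date is NaN.
plain = shifted.dropna()

# Checking only Price keeps every person row and still removes product rows.
by_price = shifted.dropna(subset=["Price"])

print(plain)
print(by_price)
```

This relies on the same assumption as the rest of the answer, namely that Price is filled in on person rows and empty on product rows.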


Inspired by @QuangHoang's answer

I don't need to depend on dropna if I slice every other row

    df.assign(Product=df.Particulars.shift(-1))[::2]

Or even more terse

    df[::2].assign(Product=[*df.Particulars[1::2]])
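The unpacking into a list in the terse version matters. Passing the sliced Series directly would make pandas align it on index labels, and since the person rows (0, 2) and product rows (1, 3) share no labels, every Product value would come out as NaN. The sketch below shows both behaviours on a frame reconstructed from the question, with Date assumed to be numeric.

```python
import numpy as np
import pandas as pd

# Sample data reconstructed from the question (Date assumed numeric).
df = pd.DataFrame({
    "Serial": [1, 2, 3, 4],
    "Date": [308, np.nan, 408, np.nan],
    "Particulars": ["Andrew", "Gloves", "Johnson", "Wicket"],
    "Price": [100, np.nan, 50, np.nan],
})

people = df[::2]                  # rows 0 and 2
products = df.Particulars[1::2]   # rows 1 and 3, still labelled 1 and 3

# Assigning a Series aligns on index labels; 0/2 never match 1/3.
aligned = people.assign(Product=products)

# Assigning a plain list is positional, so the values land where intended.
positional = people.assign(Product=[*products])

print(aligned)
print(positional)
```

Using products.values or products.to_numpy() instead of the list would sidestep alignment in the same way.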


One way to do it

This was the first way I thought of and it's gross

    i = np.flatnonzero(df.Price.notna())
    j = i + 1

    df.iloc[i].assign(Product=df.iloc[j].Particulars.values)

       Serial   Date Particulars  Price Product
    0       1  308.0      Andrew  100.0  Gloves
    2       3  408.0     Johnson   50.0  Wicket
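A self-contained version of this positional approach might look like the sketch below. The sample frame is reconstructed from the question with Date assumed to be numeric, and to_numpy() is added only to make the conversion to a NumPy array explicit.

```python
import numpy as np
import pandas as pd

# Sample data reconstructed from the question (Date assumed numeric).
df = pd.DataFrame({
    "Serial": [1, 2, 3, 4],
    "Date": [308, np.nan, 408, np.nan],
    "Particulars": ["Andrew", "Gloves", "Johnson", "Wicket"],
    "Price": [100, np.nan, 50, np.nan],
})

# Integer positions of the person rows, i.e. rows where Price is present.
i = np.flatnonzero(df.Price.notna().to_numpy())

# Each product row is assumed to sit directly below its person row.
j = i + 1

# .values strips the index so the products are assigned by position.
result = df.iloc[i].assign(Product=df.iloc[j].Particulars.values)
print(result)
```

Unlike the slicing variants, this does not require the person rows to sit at even positions, but it would raise an IndexError if the final row had a non-null Price, since j would then point past the end of the frame.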


08-21 11:53