How to combine like column names into separate rows in pandas

Problem description

If I read in the following data, pandas appends .1 or .2 to the names of repeated columns. Here is the data:

import io
import pandas as pd

# Sample CSV whose header repeats the same column names three times
dfff = io.StringIO("""address,phone,name,website,type,address,phone,name,website,type,address,phone,name,type
123 APPLE STREET,555-5555,APPLE STORE,APPLE.COM,BUSINESS,456 peach ave,777-7777,PEACH STORE,PEACH.COM,BUSINESS,789 banana rd,999-9999,banana store,BUSINESS""")
newdf2 = pd.read_csv(dfff)

Here is the output. pandas renames the repeated columns by appending .1 or .2:

newdf2
#            address     phone         name    website      type      address.1   phone.1       name.1  website.1    type.1      address.2   phone.2        name.2    type.2
#0  123 APPLE STREET  555-5555  APPLE STORE  APPLE.COM  BUSINESS  456 peach ave  777-7777  PEACH STORE  PEACH.COM  BUSINESS  789 banana rd  999-9999  banana store  BUSINESS
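The renaming is easier to see on a tiny input. A minimal sketch, assuming a made-up header with columns a and b (these names are illustrative and not part of the data above):

```python
import io
import pandas as pd

# Hypothetical header in which "a" appears three times and "b" twice
tiny = pd.read_csv(io.StringIO("a,b,a,b,a\n1,2,3,4,5\n"))

# The first occurrence keeps its name; later ones get .1, .2, ...
print(list(tiny.columns))
# ['a', 'b', 'a.1', 'b.1', 'a.2']
```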

How do I combine the like address columns into separate rows to get this output? Since there is no website.2, that value would be NaN, 0, or blank:

#            address     phone         name    website      type
#0  123 APPLE STREET  555-5555  APPLE STORE  APPLE.COM  BUSINESS
#1     456 peach ave  777-7777  PEACH STORE  PEACH.COM  BUSINESS
#2     789 banana rd  999-9999  banana store       NaN  BUSINESS

I don't really know where to start. I tried stacking the data, which works as expected, but unstacking just brings back the original data:

newdf2.stack().to_frame()
#                            0
#0 address    123 APPLE STREET
#  phone              555-5555
#  name            APPLE STORE
#  website           APPLE.COM
#  type               BUSINESS
#  address.1     456 peach ave
#  phone.1            777-7777
#  name.1          PEACH STORE
#  website.1         PEACH.COM
#  type.1             BUSINESS
#  address.2     789 banana rd
#  phone.2            999-9999
#  name.2         banana store
#  type.2             BUSINESS

I'm thinking there must be a way to stack, remove the .1/.2 suffixes from the column names, and unstack into the format I want. Or maybe there is another way?
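The stack, strip, and unstack idea can be made to work, roughly as follows. This is only a sketch of one possible way, not taken from the thread: the helper split_key and the level names row, group, and field are hypothetical, and it assumes every suffix after the dot is an integer.

```python
import io
import pandas as pd

csv_text = (
    "address,phone,name,website,type,address,phone,name,website,type,address,phone,name,type\n"
    "123 APPLE STREET,555-5555,APPLE STORE,APPLE.COM,BUSINESS,456 peach ave,777-7777,"
    "PEACH STORE,PEACH.COM,BUSINESS,789 banana rd,999-9999,banana store,BUSINESS\n"
)
newdf2 = pd.read_csv(io.StringIO(csv_text))


def split_key(row, col):
    # Hypothetical helper: "address.1" -> (row, 1, "address"),
    # and an unsuffixed "address" -> (row, 0, "address")
    field, _, suffix = col.partition(".")
    return row, int(suffix or 0), field


stacked = newdf2.stack()  # Series indexed by (row, column name)
stacked.index = pd.MultiIndex.from_tuples(
    [split_key(row, col) for row, col in stacked.index],
    names=["row", "group", "field"],
)

# Unstacking on the cleaned field name gives one row per (row, group) pair;
# the missing website for group 2 is filled with NaN
fields = ["address", "phone", "name", "website", "type"]
result = stacked.unstack("field").reindex(columns=fields).reset_index(drop=True)
print(result)
```

The key step is replacing the stacked index before unstacking: once the suffix lives in its own index level, unstacking no longer reproduces the original wide layout.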

Recommended answer

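You can use pd.wide_to_long. It expects every column to be named as a stub plus a suffix, so first append .0 to the columns that have no suffix, and add an id column that identifies each original row.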

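# Work on a copy of the DataFrame read in the question
df = newdf2.copy()

# Give the unsuffixed columns a .0 suffix so every column is <stub>.<number>
df.columns = [f'{x}.0' if '.' not in x else x for x in df.columns]
# wide_to_long needs a column that identifies each original row
df['id'] = df.index

df = pd.wide_to_long(df, stubnames=['address', 'phone', 'name', 'website', 'type'], i='id', j='row', sep='.')

df.reset_index(drop=True)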

Out[1]:
            address     phone          name    website      type
0  123 APPLE STREET  555-5555   APPLE STORE  APPLE.COM  BUSINESS
1     456 peach ave  777-7777   PEACH STORE  PEACH.COM  BUSINESS
2     789 banana rd  999-9999  banana store        NaN  BUSINESS
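If the stub names are not known in advance, they can be derived from the column names instead of being hard-coded. The function below is a hypothetical wrapper around the same steps (like_columns_to_rows is not a pandas function), and it assumes that no original column is named id and that the separator appears only before a numeric suffix.

```python
import io
import pandas as pd


def like_columns_to_rows(frame, sep="."):
    """Hypothetical helper: turn repeated columns (x, x.1, x.2) into rows."""
    out = frame.copy()
    # Give unsuffixed columns a "0" suffix so every name is <stub><sep><number>
    out.columns = [c if sep in c else f"{c}{sep}0" for c in out.columns]
    # Unique stub names in order of first appearance
    stubs = list(dict.fromkeys(c.split(sep)[0] for c in out.columns))
    out["id"] = out.index  # identifies each original row
    long = pd.wide_to_long(out, stubnames=stubs, i="id", j="row", sep=sep)
    return long.sort_index().reset_index(drop=True)[stubs]


csv_text = (
    "address,phone,name,website,type,address,phone,name,website,type,address,phone,name,type\n"
    "123 APPLE STREET,555-5555,APPLE STORE,APPLE.COM,BUSINESS,456 peach ave,777-7777,"
    "PEACH STORE,PEACH.COM,BUSINESS,789 banana rd,999-9999,banana store,BUSINESS\n"
)
combined = like_columns_to_rows(pd.read_csv(io.StringIO(csv_text)))
print(combined)
```

Because the function copies its input, the original DataFrame keeps its wide layout and the call can be repeated safely.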

09-02 23:20