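This is a follow-up to this SO question: Concatenate several rows into one row by column value, and split resulting dataframe into several dataframes based on number of concatinated rows
That example shows how to merge rows when there is one column to merge on and one additional column.
I am now looking for a solution that works with many columns, while still merging rows based on a single column.
The layout I want is: first list all columns of one type, then list the columns of the other type in the same order as the first.
Here is a minimal example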
import pandas as pd

data = [['tom', 'ca', 2], ['ni2ck', 'ma', 2], ['j3uli', 'ny', 4],
        ['nic4k', 'ma', 4], ['jul5i', 'ny', 4], ['nic6k', 'ma', 7],
        ['ju7li', 'ny', 7], ['nic8k', 'ma', 7], ['ju9li', 'ny', 7],
        ['nic1k', 'ma', 8], ['car', 'ny', 8]]
df = pd.DataFrame(data, columns=['Name', 'Location', 'Age'])
df
The result is
     Name Location  Age
0     tom       ca    2
1   ni2ck       ma    2
2   j3uli       ny    4
3   nic4k       ma    4
4   jul5i       ny    4
5   nic6k       ma    7
6   ju7li       ny    7
7   nic8k       ma    7
8   ju9li       ny    7
9   nic1k       ma    8
10    car       ny    8
This would be the desired result: one dataframe per number of merged rows
    Name   Name Location Location  Age
0    tom  ni2ck       ca       ma    2
1  nic1k    car       ma       ny    8

    Name   Name   Name Location Location Location  Age
0  j3uli  nic4k  jul5i       ny       ma       ny    4

    Name   Name   Name   Name Location Location Location Location  Age
0  nic6k  ju7li  nic8k  ju9li       ma       ny       ma       ny    7
It is important that each location appears in the same position as its corresponding name.
Best answer
Building on @Wen's solution: use pivot_table instead of pivot.
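# Position of each row within its Age group (0, 1, 2, ...)
df['New'] = df.groupby('Age').cumcount()
# One row per Age; one column per (field, position) pair, Name before Location
s = df.pivot_table(index='Age', columns='New',
                   values=['Name', 'Location'],
                   aggfunc='first').reindex(['Name', 'Location'], axis=1, level=0)
# Flatten the MultiIndex columns: ('Name', 0) -> 'Name0'
s.columns = s.columns.map('{0[0]}{0[1]}'.format)
# Group rows by their number of missing cells, then drop the empty columns
# (axis is passed by keyword; recent pandas no longer accepts dropna(1))
l = [y.dropna(axis=1).reset_index() for _, y in s.groupby(s.isnull().sum(axis=1))]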
In [499]: l[0]
Out[499]:
   Age  Name0  Name1  Name2  Name3  Location0  Location1  Location2  Location3
0    7  nic6k  ju7li  nic8k  ju9li         ma         ny         ma         ny
In [500]: l[1]
Out[500]:
   Age  Name0  Name1  Name2  Location0  Location1  Location2
0    4  j3uli  nic4k  jul5i         ny         ma         ny
In [501]: l[2]
Out[501]:
   Age  Name0  Name1  Location0  Location1
0    2    tom  ni2ck         ca         ma
1    8  nic1k    car         ma         ny
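The steps above can be wrapped into one self-contained function. This is only a sketch under a few assumptions: the helper name split_by_group_size, the temporary column _pos, and the choice to return a dict keyed by the number of merged rows are all hypothetical and not part of the original answer. It also uses dropna(axis=1, how='all') instead of the default, which assumes a column should be dropped only when it is empty for every row of that group size.

```python
import pandas as pd

data = [['tom', 'ca', 2], ['ni2ck', 'ma', 2], ['j3uli', 'ny', 4],
        ['nic4k', 'ma', 4], ['jul5i', 'ny', 4], ['nic6k', 'ma', 7],
        ['ju7li', 'ny', 7], ['nic8k', 'ma', 7], ['ju9li', 'ny', 7],
        ['nic1k', 'ma', 8], ['car', 'ny', 8]]
df = pd.DataFrame(data, columns=['Name', 'Location', 'Age'])


def split_by_group_size(df, key, value_cols):
    """Hypothetical helper: widen df per `key`, then split by rows per key."""
    # Position of each row inside its group: 0, 1, 2, ...
    work = df.assign(_pos=df.groupby(key).cumcount())
    # One row per key, one column per (field, position) pair
    wide = work.pivot_table(index=key, columns='_pos',
                            values=value_cols, aggfunc='first')
    # pivot_table sorts the fields alphabetically; restore the requested order
    wide = wide.reindex(value_cols, axis=1, level=0)
    # Flatten ('Name', 0) -> 'Name0'
    wide.columns = [f'{col}{pos}' for col, pos in wide.columns]
    # Number of original rows behind each key value
    sizes = df.groupby(key).size()
    # One frame per group size, without the columns that size never fills
    return {n: part.dropna(axis=1, how='all').reset_index()
            for n, part in wide.groupby(sizes)}


result = split_by_group_size(df, 'Age', ['Name', 'Location'])
for n, frame in result.items():
    print(n)
    print(frame)
```

Keying the result by group size means result[2] holds the ages with two rows each, regardless of the order in which the groups come out of groupby.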
If you want to keep the MultiIndex columns, skip the map step on the columns:
df['New'] = df.groupby('Age').cumcount()
s = df.pivot_table(index='Age', columns='New',
                   values=['Name', 'Location'],
                   aggfunc='first').reindex(['Name', 'Location'], axis=1, level=0)
l = [y.dropna(axis=1).reset_index() for _, y in s.groupby(s.isnull().sum(axis=1))]
In [544]: l[0]
Out[544]:
     Age   Name                       Location
New            0      1      2      3         0   1   2   3
0      7  nic6k  ju7li  nic8k  ju9li        ma  ny  ma  ny
In [545]: l[1]
Out[545]:
     Age   Name                Location
New            0      1      2         0   1   2
0      4  j3uli  nic4k  jul5i        ny  ma  ny
In [546]: l[2]
Out[546]:
     Age   Name         Location
New            0      1         0   1
0      2    tom  ni2ck        ca  ma
1      8  nic1k    car        ma  ny
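With the MultiIndex columns kept, all Name columns or all Location columns can be selected at once through the top level. The sketch below uses that to check the requirement from the question that each location stays in the same position as its name: it rebuilds (Age, Name, Location) triples from the wide frames and compares them with the original rows. The variable names rebuilt and original are illustrative only.

```python
import pandas as pd

data = [['tom', 'ca', 2], ['ni2ck', 'ma', 2], ['j3uli', 'ny', 4],
        ['nic4k', 'ma', 4], ['jul5i', 'ny', 4], ['nic6k', 'ma', 7],
        ['ju7li', 'ny', 7], ['nic8k', 'ma', 7], ['ju9li', 'ny', 7],
        ['nic1k', 'ma', 8], ['car', 'ny', 8]]
df = pd.DataFrame(data, columns=['Name', 'Location', 'Age'])

# Same steps as in the answer, keeping the MultiIndex columns
df['New'] = df.groupby('Age').cumcount()
s = df.pivot_table(index='Age', columns='New',
                   values=['Name', 'Location'],
                   aggfunc='first').reindex(['Name', 'Location'], axis=1, level=0)
l = [y.dropna(axis=1).reset_index() for _, y in s.groupby(s.isnull().sum(axis=1))]

# Rebuild (Age, Name, Location) triples by pairing position i of Name
# with position i of Location in every wide row
rebuilt = []
for frame in l:
    ages = frame[('Age', '')].tolist()       # reset_index stores Age under ('Age', '')
    names = frame['Name'].to_numpy()         # all Name columns of this frame
    locations = frame['Location'].to_numpy() # all Location columns, same order
    for age, name_row, loc_row in zip(ages, names, locations):
        rebuilt.extend((age, name, loc) for name, loc in zip(name_row, loc_row))

original = list(zip(df['Age'], df['Name'], df['Location']))
print(sorted(rebuilt) == sorted(original))
```

If the pairing were broken, for example by reordering only one of the two column blocks, the rebuilt triples would no longer match the original rows.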