Preserving column order: Python Pandas and column concat

Question

My google-fu doesn't seem to be doing me justice with what seems like it should be a trivial procedure.

In Pandas for Python I have two datasets that I want to merge. This works fine using .concat. The issue is that .concat reorders my columns. From a data retrieval point of view, this is trivial. From an "I just want to open the file and quickly see the most important column" point of view, it is annoying.

File1.csv
Name    Username    Alias1
Tom     Tomfoolery   TJZ
Meryl   MsMeryl      Mer
Timmy   Midsize      Yoda

File2.csv
Name    Username   Alias1    Alias2
Bob     Firedbob   Fire      Gingy
Tom     Tomfoolery  TJZ      Awww

Result.csv
    Alias1 Alias2   Name    Username
0   TJZ    NaN      Tom     Tomfoolery
1   Mer    NaN      Meryl   MsMeryl
2   Yoda   NaN      Timmy   Midsize
0   Fire   Gingy    Bob     Firedbob
1   TJZ    Awww     Tom     Tomfoolery

The result is fine, but the data file I'm working with has 1,000 columns, and the 2-3 most important ones are now in the middle. Is there a way, in this toy example, that I could have forced "Username" to be the first column and "Name" to be the second, while keeping each column's values intact all the way down?

As a side note, when I save to file it also saves the numbering on the side (0 1 2 0 1). If there's a way to prevent that too, that'd be cool. If not, it's not a big deal, since it's a quick fix to remove.

Thanks!

Recommended answer

Assuming the concatenated DataFrame is df, you can reorder the columns as follows:

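# Columns that should appear first, in this order
important = ['Username', 'Name']
# Important columns first, then the remaining columns in their current order
reordered = important + [c for c in df.columns if c not in important]
df = df[reordered]
print(df)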

Output:

     Username   Name Alias1 Alias2
0  Tomfoolery    Tom    TJZ    NaN
1     MsMeryl  Meryl    Mer    NaN
2     Midsize  Timmy   Yoda    NaN
0    Firedbob    Bob   Fire  Gingy
1  Tomfoolery    Tom    TJZ   Awww
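
For anyone who wants to try this end to end, here is a minimal, self-contained sketch of the same idea. The two frames are built inline to mirror File1.csv and File2.csv from the question (assuming the alias columns are named Alias1 and Alias2 in both files), and move_to_front is a hypothetical helper name used only for illustration, not part of pandas.

```python
import pandas as pd

# Toy data mirroring File1.csv and File2.csv from the question (built inline
# here instead of being read from disk).
df1 = pd.DataFrame({
    "Name": ["Tom", "Meryl", "Timmy"],
    "Username": ["Tomfoolery", "MsMeryl", "Midsize"],
    "Alias1": ["TJZ", "Mer", "Yoda"],
})
df2 = pd.DataFrame({
    "Name": ["Bob", "Tom"],
    "Username": ["Firedbob", "Tomfoolery"],
    "Alias1": ["Fire", "TJZ"],
    "Alias2": ["Gingy", "Awww"],
})

# Stack the rows; a column missing from one frame is filled with NaN there.
df = pd.concat([df1, df2])


def move_to_front(frame, important):
    """Hypothetical helper: return `frame` with the `important` columns
    first and all other columns after them in their existing order."""
    rest = [c for c in frame.columns if c not in important]
    return frame[important + rest]


df = move_to_front(df, ["Username", "Name"])
print(df.columns.tolist())  # ['Username', 'Name', 'Alias1', 'Alias2']
```

Because the remaining columns keep whatever relative order they already had, the same helper should work unchanged on a frame with 1,000 columns.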

The list of numbers [0, 1, 2, 0, 1] is the index of the DataFrame. To prevent it from being written to the output file, you can use the index=False option in to_csv():

df.to_csv('Result.csv', index=False, sep=' ')
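
As a small sketch of the index handling, the example below writes to an in-memory buffer rather than a real file so it can be run anywhere. The two tiny frames are made up for illustration. It also passes ignore_index=True to pd.concat, an optional extra not used in the answer above, which renumbers the rows 0..n-1 instead of repeating each input's own index.

```python
import io

import pandas as pd

# Two tiny illustrative frames (assumed data, not the question's files).
first = pd.DataFrame({"Username": ["Tomfoolery"], "Name": ["Tom"]})
second = pd.DataFrame({"Username": ["Firedbob"], "Name": ["Bob"]})

# ignore_index=True gives a fresh 0..n-1 index instead of 0, 0.
df = pd.concat([first, second], ignore_index=True)

# An in-memory buffer stands in for a path such as 'Result.csv'.
buffer = io.StringIO()
df.to_csv(buffer, index=False)  # index=False leaves the row numbers out
csv_lines = buffer.getvalue().splitlines()
print(csv_lines)  # ['Username,Name', 'Tomfoolery,Tom', 'Firedbob,Bob']
```

This sketch keeps the default comma separator. With sep=' ' as in the line above, any value that itself contains a space gets quoted in the output, so a comma or tab separator may be easier to read back in.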


08-01 20:39