Problem description
I'm looking for a way to append a list of column names to existing column names in a DataFrame in pandas and then reorder them by col_start + col_add. The DataFrame already contains the columns from col_start.
Something like:
import pandas as pd
df = pd.read_csv("file.csv")
col_start = ["col_a", "col_b", "col_c"]
col_add = ["Col_d", "Col_e", "Col_f"]
df = pd.concat([df,pd.DataFrame(columns = list(col_add))]) #Add columns
df = df[[col_start.extend(col_add)]] #Rearrange columns
Also, is there a way to capitalize the first letter for each item in col_start, analogous to title() or capitalize()?
Your code is nearly there; a couple of things:
df = pd.concat([df,pd.DataFrame(columns = list(col_add))])
can be simplified to just this, as col_add is already a list:
df = pd.concat([df,pd.DataFrame(columns = col_add)])
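Also, list.extend() modifies the list in place and returns None, so it cannot be used as the column selector. You can just add the two lists together instead, so: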
df = df[[col_start.extend(col_add)]]
becomes
df = df[col_start+col_add]
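To make the difference concrete, here is a minimal sketch. The small in-memory frame is a hypothetical stand-in for the CSV data, and reindex is used here as an alternative to the concat approach above (it is not what the original code does): it adds the missing columns and orders them in one call.

```python
import pandas as pd

# Hypothetical stand-in for the CSV contents: only the col_start columns exist.
df = pd.DataFrame({"col_a": [1, 2], "col_b": [3, 4], "col_c": [5, 6]})

col_start = ["col_a", "col_b", "col_c"]
col_add = ["Col_d", "Col_e", "Col_f"]

# list.extend mutates the list in place and returns None,
# so its return value is useless as a column selector.
scratch = list(col_start)
extend_result = scratch.extend(col_add)

# Adding the lists builds a new list and leaves both inputs untouched.
ordered = col_start + col_add

# reindex creates the missing columns (filled with NaN) and applies the order.
df = df.reindex(columns=ordered)
print(df.columns.tolist())
```

The existing rows are kept, and the three new columns contain only NaN until they are filled.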
And to capitalise the first letter of each item in your list, just do:
In [184]:
col_start = ["col_a", "col_b", "col_c"]
col_start = [x.title() for x in col_start]
col_start
Out[184]:
['Col_A', 'Col_B', 'Col_C']
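Since the question mentions both title() and capitalize(), a short comparison may help. Note that title() treats the underscore as a word boundary, so the letter after it is upper-cased as well, whereas capitalize() only upper-cases the very first character:

```python
col_start = ["col_a", "col_b", "col_c"]

# title() upper-cases the first letter of every "word"; "_" separates words.
titled = [name.title() for name in col_start]

# capitalize() upper-cases only the first character and lower-cases the rest.
capitalized = [name.capitalize() for name in col_start]

print(titled)       # ['Col_A', 'Col_B', 'Col_C']
print(capitalized)  # ['Col_a', 'Col_b', 'Col_c']
```

If the target names should look like Col_d, Col_e, Col_f from col_add, capitalize() is probably the closer match.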
EDIT
To avoid the KeyError on the capitalised column names, you need to capitalise after calling concat; the columns have a vectorised str.title method:
In [187]:
df = pd.DataFrame(columns = col_start + col_add)
df
Out[187]:
Empty DataFrame
Columns: [col_a, col_b, col_c, Col_d, Col_e, Col_f]
Index: []
In [188]:
df.columns = df.columns.str.title()
df.columns
Out[188]:
Index(['Col_A', 'Col_B', 'Col_C', 'Col_D', 'Col_E', 'Col_F'], dtype='object')
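Putting the pieces together, the ordering issue can be sketched as follows. The one-row frame is a hypothetical substitute for the CSV data, and the try/except is only there to show where the KeyError would come from:

```python
import pandas as pd

# Hypothetical frame standing in for the CSV contents.
df = pd.DataFrame({"col_a": [1], "col_b": [2], "col_c": [3]})
col_start = ["col_a", "col_b", "col_c"]
col_add = ["Col_d", "Col_e", "Col_f"]

# Add the new, empty columns first, using the original names.
df = pd.concat([df, pd.DataFrame(columns=col_add)])

# Title-casing the list does not rename the frame's columns,
# so selecting with the new names at this point raises KeyError.
titled = [name.title() for name in col_start + col_add]
try:
    df[titled]
    lookup_failed = False
except KeyError:
    lookup_failed = True

# Renaming the columns themselves makes the title-cased names valid.
df.columns = df.columns.str.title()
df = df[titled]
print(df.columns.tolist())
```

Depending on the pandas version, concatenating with an empty frame may emit a warning about dtypes; the resulting column set is what matters here.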