Problem description
My dataframe is built from many individual Excel files. Each file is named after a date and holds the fruit prices for that day, so the spreadsheets look something like this:
15012016:
Fruit Price
Orange 1
Apple 2
Pear 3
16012016:
Fruit Price
Orange 4
Apple 5
Pear 6
17012016:
Fruit Price
Orange 7
Apple 8
Pear 9
To put all that information together, I run the following code, which reads every file into a dictionary of dataframes (all fruit price files are stored in 'C:\Fruit_Prices_by_Day'):
import os
import pandas as pd

folder = r'C:\Fruit_Prices_by_Day'

# find all the file names, without the .xlsx extension
file_list = [os.path.splitext(x)[0] for x in os.listdir(folder) if x.endswith('.xlsx')]

# read each spreadsheet into a dict keyed by its date
d = {}
for date in file_list:
    df1 = pd.read_excel(os.path.join(folder, date + '.xlsx'), index_col='Fruit')
    d[date] = df1
This is the part where I'm stuck. How do I turn this dict into a single dataframe whose column names are the dict keys, i.e. the dates, so that I get the price of each fruit per day all in the same dataframe, like this:
        15012016  16012016  17012016
Orange         1         4         7
Apple          2         5         8
Pear           3         6         9
Recommended answer
You can first call set_index on every dataframe in a dict comprehension, then use concat, and finally drop the last level of the resulting MultiIndex in the columns:
print(d)
{'17012016': Fruit Price
0 Orange 7
1 Apple 8
2 Pear 9, '16012016': Fruit Price
0 Orange 4
1 Apple 5
2 Pear 6, '15012016': Fruit Price
0 Orange 1
1 Apple 2
2 Pear 3}
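# Make 'Fruit' the index of every dataframe (skip this step if the files
# were already read with index_col='Fruit')
d = {k: v.set_index('Fruit') for k, v in d.items()}
# Concatenate side by side; the dict keys become the top column level
df = pd.concat(d, axis=1)
# Drop the inner 'Price' level so only the dates remain as column names
df.columns = df.columns.droplevel(-1)
print(df)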
15012016 16012016 17012016
Fruit
Orange 1 4 7
Apple 2 5 8
Pear 3 6 9
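If each spreadsheet really has only the one Price column, a possible variant is to pass a dict of Series to concat instead of a dict of dataframes. The dict keys then become the column names directly, so no MultiIndex is created. The sketch below is self-contained: the in-memory dict `d` is a made-up stand-in for the one built from the Excel files, and it assumes 'Fruit' is already the index.

```python
import pandas as pd

# Hypothetical stand-in for the dict built from the Excel files:
# each value has 'Fruit' as its index and a single 'Price' column.
fruit_index = pd.Index(['Orange', 'Apple', 'Pear'], name='Fruit')
d = {
    '15012016': pd.DataFrame({'Price': [1, 2, 3]}, index=fruit_index),
    '16012016': pd.DataFrame({'Price': [4, 5, 6]}, index=fruit_index),
    '17012016': pd.DataFrame({'Price': [7, 8, 9]}, index=fruit_index),
}

# Selecting 'Price' turns each value into a Series, so concat uses the
# dict keys as plain column names and there is no level to drop afterwards.
df = pd.concat({date: frame['Price'] for date, frame in d.items()}, axis=1)
print(df)
```

With more than one column per file this shortcut no longer applies, and the concat plus droplevel approach above (or keeping the two-level columns) is the safer choice.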