Import multiple Excel files into Python pandas and concatenate them into one DataFrame


Problem description

I would like to read several Excel files from a directory into pandas and concatenate them into one big DataFrame. I have not been able to figure it out, though. I need some help with the for loop and with building the concatenated DataFrame. Here is what I have so far:

import sys
import csv
import glob
import pandas as pd

# get data file names
path =r'C:DRODCL_rawdata_filesexcelfiles'
filenames = glob.glob(path + "/*.xlsx")

dfs = []

for df in dfs:
    xl_file = pd.ExcelFile(filenames)
    df=xl_file.parse('Sheet1')
    dfs.concat(df, ignore_index=True)

Recommended answer

As mentioned in the comments, one error you are making is that you are looping over an empty list: `for df in dfs` iterates over `dfs`, which was just set to `[]`, so the loop body never runs.
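To make that concrete, here is a rough sketch of how the loop in the question could be restructured: iterate over the file names, collect one DataFrame per file, and concatenate once at the end. The function name `read_excel_files` and its `reader` parameter are my own illustration and not part of pandas; `reader` only exists so the reading step can be swapped out. Reading `.xlsx` files with `pd.read_excel` generally needs an engine such as openpyxl installed.

```python
import glob
import os

import pandas as pd


def read_excel_files(directory, pattern="*.xlsx", sheet_name="Sheet1",
                     reader=pd.read_excel):
    """Read every matching workbook in `directory` into one DataFrame.

    `reader` is a hypothetical hook (defaulting to pd.read_excel) that is
    called as reader(path, sheet_name=...).
    """
    filenames = sorted(glob.glob(os.path.join(directory, pattern)))
    if not filenames:
        raise FileNotFoundError(f"no files matching {pattern!r} in {directory!r}")
    # Loop over the file names, not over the (still empty) list of results.
    frames = [reader(name, sheet_name=sheet_name) for name in filenames]
    # Concatenate once at the end; ignore_index=True renumbers the rows.
    return pd.concat(frames, ignore_index=True)
```

With the question's directory this would be called as `read_excel_files(path)`. Note that `pd.concat` is a pandas function that takes a list of frames; a plain Python list has no `concat` method, which is the other reason the original loop could not work.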

Here is how I would do it, using an example of 5 identical Excel files that are appended one after another.

(1) Imports:

import os
import pandas as pd

(2) List files:

path = os.getcwd()
files = os.listdir(path)
files

Output:

['.DS_Store',
 '.ipynb_checkpoints',
 '.localized',
 'Screen Shot 2013-12-28 at 7.15.45 PM.png',
 'test1 2.xls',
 'test1 3.xls',
 'test1 4.xls',
 'test1 5.xls',
 'test1.xls',
 'Untitled0.ipynb',
 'Werewolf Modelling',
 '~$Random Numbers.xlsx']

(3) Pick out the 'xls' files:

files_xls = [f for f in files if f[-3:] == 'xls']
files_xls

Output:

['test1 2.xls', 'test1 3.xls', 'test1 4.xls', 'test1 5.xls', 'test1.xls']
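The slice test in step (3) only keeps names ending in exactly 'xls'. If the directory also holds `.xlsx` workbooks, a slightly broader filter may be useful. The following is a sketch under my own assumptions, not something from the answer: the helper name `list_excel_files` is made up, and it treats names starting with `~$` (like '~$Random Numbers.xlsx' in the listing above) as Office lock files to skip.

```python
from pathlib import Path


def list_excel_files(directory="."):
    """Return sorted .xls/.xlsx paths in `directory` (hypothetical helper)."""
    excel_suffixes = {".xls", ".xlsx"}
    return sorted(
        p for p in Path(directory).iterdir()
        if p.is_file()
        and p.suffix.lower() in excel_suffixes
        # Assumption: '~$...' files are temporary lock files, not real data.
        and not p.name.startswith("~$")
    )
```

Because it returns full `Path` objects, the result can be passed straight to `pd.read_excel` even when the directory is not the current working directory.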

(4) Initialize an empty DataFrame:

df = pd.DataFrame()

(5) Loop over the list of files and append each one to the empty DataFrame:

for f in files_xls:
    data = pd.read_excel(f, sheet_name='Sheet1')
    # DataFrame.append was removed in pandas 2.0; pd.concat is the replacement
    df = pd.concat([df, data])

(6) Enjoy your new DataFrame. :-)

df

Output:

  Result  Sample
0      a       1
1      b       2
2      c       3
3      d       4
4      e       5
5      f       6
6      g       7
7      h       8
8      i       9
9      j      10
0      a       1
1      b       2
2      c       3
3      d       4
4      e       5
5      f       6
6      g       7
7      h       8
8      i       9
9      j      10
0      a       1
1      b       2
2      c       3
3      d       4
4      e       5
5      f       6
6      g       7
7      h       8
8      i       9
9      j      10
0      a       1
1      b       2
2      c       3
3      d       4
4      e       5
5      f       6
6      g       7
7      h       8
8      i       9
9      j      10
0      a       1
1      b       2
2      c       3
3      d       4
4      e       5
5      f       6
6      g       7
7      h       8
8      i       9
9      j      10
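Notice that the index in this output restarts at 0 for every file. If that is not what you want, two common options are sketched below using small stand-in frames in place of the real `pd.read_excel` results. The frame contents and the level names `source_file` and `row` are made up for illustration.

```python
import pandas as pd

# Stand-in frames keyed by file name; real ones would come from pd.read_excel.
frames = {
    "test1.xls": pd.DataFrame({"Result": ["a", "b"], "Sample": [1, 2]}),
    "test1 2.xls": pd.DataFrame({"Result": ["a", "b"], "Sample": [1, 2]}),
}

# Option 1: a single continuous index across all files.
flat = pd.concat(list(frames.values()), ignore_index=True)

# Option 2: keep each row's source file as an outer index level.
labeled = pd.concat(frames, names=["source_file", "row"])
```

The second form keeps track of where each row came from, so `labeled.loc["test1.xls"]` returns just the rows read from that file.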
