Adding the file name as a column when merging multiple CSV files into pandas (Python)

Question

I have multiple CSV files in the same folder, all with the same data columns:

20100104 080100;5369;5378.5;5365;5378;2368
20100104 080200;5378;5385;5377;5384.5;652
20100104 080300;5384.5;5391.5;5383;5390;457
20100104 080400;5390.5;5391;5387;5389.5;392

I want to merge the CSV files into a single pandas DataFrame and add a column holding the file name to each row, so I can track later where each row came from. There seem to be similar threads, but I haven't been able to adapt any of their solutions. This is what I have so far. Merging the data into one DataFrame works, but I'm stuck on adding the file name column:

import os
import glob
import pandas as pd


path = r'/filepath/'
all_files = glob.glob(os.path.join(path, "*.csv"))
names = [os.path.basename(x) for x in all_files]

list_ = []
for file_ in all_files:
    list_.append(pd.read_csv(file_,sep=';', parse_dates=[0], infer_datetime_format=True,header=None ))

df = pd.concat(list_)

Recommended answer

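Instead of collecting the frames in a list, add the file name column to each file's DataFrame as it is read, then combine the frames with DataFrame's append. Note that DataFrame.append exists only in pandas versions before 2.0.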

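df = pd.DataFrame()
for file_ in all_files:
    file_df = pd.read_csv(file_, sep=';', parse_dates=[0], infer_datetime_format=True, header=None)
    # Tag every row with its source file (full path; os.path.basename(file_) gives just the name)
    file_df['file_name'] = file_
    # DataFrame.append requires pandas < 2.0
    df = df.append(file_df)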
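
On pandas 2.0 and later, where DataFrame.append is no longer available, the same idea can be sketched with a list and a single pd.concat call. This is a minimal sketch and not part of the original answer. The function name read_csvs_with_source and the column names in COLUMNS are hypothetical, chosen only for illustration (the source files have no header row), and the timestamp format is assumed from the sample rows in the question.

```python
import glob
import os

import pandas as pd

# Hypothetical column names: the sample files have no header, so these are assumptions.
COLUMNS = ["timestamp", "open", "high", "low", "close", "volume"]


def read_csvs_with_source(folder, pattern="*.csv"):
    """Read every matching CSV in `folder` and tag each row with its file name."""
    frames = []
    # sorted() makes the row order deterministic; glob alone gives no ordering guarantee.
    for file_path in sorted(glob.glob(os.path.join(folder, pattern))):
        frame = pd.read_csv(
            file_path,
            sep=";",
            header=None,
            names=COLUMNS,
            dtype={"timestamp": str},  # keep the raw text so it can be parsed explicitly
        )
        # Assumed layout "YYYYMMDD HHMMSS", as in the sample rows of the question.
        frame["timestamp"] = pd.to_datetime(frame["timestamp"], format="%Y%m%d %H%M%S")
        # Store only the base name; use file_path instead to keep the full path.
        frame["file_name"] = os.path.basename(file_path)
        frames.append(frame)

    if not frames:
        # pd.concat raises ValueError on an empty list, so return an empty frame instead.
        return pd.DataFrame(columns=COLUMNS + ["file_name"])
    return pd.concat(frames, ignore_index=True)
```

Calling read_csvs_with_source(r'/filepath/') would then return one DataFrame with a file_name column. Concatenating once at the end also avoids re-copying the accumulated data on every loop iteration, which is what appending inside the loop does.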

