Problem description
I have a list of file-name prefixes. I want to scan a directory, read every file whose name starts with one of the elements of the list, and store the results in DataFrames.
For example:
list1 = ['abc', 'bcd', 'def']
Directory:
abc1.txt
abc2.txt
abc3.txt
bcd1.txt
bcd2.txt
bcd3.txt
The output should put all files starting with 'abc' in one pandas DataFrame, all files starting with 'bcd' in another DataFrame, and so on.
My code:
dfs = []
for exp in expnames:
    for files in filenames:
        if files.startswith(exp):
            dfs.append(pd.read_csv(file_path+files,sep=',',header=None))
big_frame = pd.concat(dfs, ignore_index=True)
Recommended answer
This will create a dictionary of DataFrames in which each DataFrame consists of all files whose names match the first three letters of one of our "expressions" (i.e. abc, def, etc.). The keys in the dictionary are the same three letters:
import pandas as pd

# Some dummy data
filenames = ['abcdefghijkl.txt', 'abcdef.txt', 'defghijk.txt']

# List of combinations of certain letters
exps = ['abc', 'def', 'ghi', 'jkl']

dataframes = {}
for filename in filenames:
    _df = pd.read_csv(filename)
    # Look up the first three letters in exps (raises ValueError if absent)
    key = exps[exps.index(filename[:3])]
    try:
        # Append to the DataFrame already stored under this key
        dataframes[key] = pd.concat([dataframes[key], _df], ignore_index=True)
    except KeyError:
        # First file seen for this key
        dataframes[key] = _df
print(dataframes['abc'])
    a   b   c
0   7   8   9
1  10  11  12
2   1   2   3
3   4   5   6
print(dataframes['def'])
    a   b   c
0   7   8   9
1  10  11  12
The contents of the files above are:
abcdefghijkl.txt
a,b,c
7,8,9
10,11,12
abcdef.txt
a,b,c
1,2,3
4,5,6
defghijk.txt
a,b,c
7,8,9
10,11,12
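
The answer above keys on exactly the first three letters of each file name. As a rough sketch of the same idea for prefixes of any length, the variant below matches with str.startswith, as the question's own code does, and reads the names from a directory. The helper name load_by_prefix and the temporary demo files are assumptions made for illustration; they are not part of the original question or answer.

```python
import tempfile
from pathlib import Path

import pandas as pd


def load_by_prefix(directory, prefixes, **read_csv_kwargs):
    """Return {prefix: DataFrame} built from files whose names start with prefix.

    Hypothetical helper, not from the original answer.
    """
    frames = {prefix: [] for prefix in prefixes}
    # sorted() keeps the row order deterministic across platforms
    for path in sorted(Path(directory).iterdir()):
        if not path.is_file():
            continue
        for prefix in prefixes:
            if path.name.startswith(prefix):
                frames[prefix].append(pd.read_csv(path, **read_csv_kwargs))
                break  # assign each file to the first matching prefix only
    # Prefixes with no matching files are left out of the result
    return {
        prefix: pd.concat(parts, ignore_index=True)
        for prefix, parts in frames.items()
        if parts
    }


# Demo on throwaway files named like those in the question
with tempfile.TemporaryDirectory() as tmp:
    for name, row in [("abc1.txt", "1,2,3"), ("abc2.txt", "4,5,6"), ("bcd1.txt", "7,8,9")]:
        Path(tmp, name).write_text(row + "\n")
    result = load_by_prefix(tmp, ["abc", "bcd", "def"], header=None)

print(sorted(result))       # ['abc', 'bcd']
print(result["abc"].shape)  # (2, 3)
```

Collecting the pieces in a list and calling pd.concat once per prefix avoids re-copying the accumulated DataFrame for every file. Note that if one prefix is itself the start of another (for example 'ab' and 'abc'), each file goes to whichever matching prefix comes first in the list.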