Recursive sub folder search and return files in a list (Python)


Problem description

I am working on a script that recursively goes through the subfolders of a main folder and builds a list of files of a certain type. I am having an issue with the script. It is currently set up as follows:

for root, subFolder, files in os.walk(PATH):
    for item in files:
        if item.endswith(".txt") :
            fileNamePath = str(os.path.join(root,subFolder,item))

The problem is that the subFolder variable holds a list of subfolders rather than the folder in which the item file is located. I was thinking of running a for loop over the subfolders first and joining the first part of the path, but I figured I'd double-check whether anyone has any suggestions before doing that. Thanks for your help!

Recommended answer

You should be using the dirpath, which you call root. The dirnames are supplied so that you can prune the list if there are folders you don't want os.walk to recurse into.

import os
result = [os.path.join(dp, f) for dp, dn, filenames in os.walk(PATH) for f in filenames if os.path.splitext(f)[1] == '.txt']
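The pruning mentioned above can be sketched as an explicit loop. This is only an illustration: the function name find_files and the skip_dirs parameter are hypothetical and do not come from the original answer.

```python
import os


def find_files(top, extension=".txt", skip_dirs=()):
    """Return paths under `top` whose names end with `extension`.

    `find_files` and `skip_dirs` are hypothetical names for this sketch.
    """
    matches = []
    for dirpath, dirnames, filenames in os.walk(top):
        # Prune in place: with the default top-down walk, os.walk only
        # descends into the directories still present in `dirnames`.
        dirnames[:] = [d for d in dirnames if d not in skip_dirs]
        for name in filenames:
            if name.endswith(extension):
                # dirpath is the folder that actually contains `name`.
                matches.append(os.path.join(dirpath, name))
    return matches
```

Note that the slice assignment (dirnames[:] = ...) matters here: rebinding dirnames to a new list would leave the list that os.walk is using unchanged.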

After the latest downvote, it occurred to me that glob is a better tool for selecting by extension.

import os
from glob import glob
result = [y for x in os.walk(PATH) for y in glob(os.path.join(x[0], '*.txt'))]

There is also a generator version:

import os
from glob import glob
from itertools import chain
result = chain.from_iterable(glob(os.path.join(x[0], '*.txt')) for x in os.walk('.'))
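Because the generator version is lazy, it can be consumed piece by piece. The following is a minimal sketch of that idea; the helper name iter_txt_files is hypothetical, and islice is used only to show that the whole tree need not be scanned.

```python
import os
from glob import glob
from itertools import chain, islice


def iter_txt_files(top):
    """Lazily yield .txt paths under `top` (hypothetical helper name)."""
    return chain.from_iterable(
        glob(os.path.join(dirpath, "*.txt"))
        for dirpath, _dirnames, _filenames in os.walk(top)
    )


# Take at most the first five matches; the walk stops once they are found.
first_five = list(islice(iter_txt_files("."), 5))
```

One difference from the endswith/splitext approach: glob's * does not match file names that begin with a dot, so a hidden file such as .notes.txt would be skipped here.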

Edit 2, for Python 3.4+:

from pathlib import Path
result = list(Path(".").rglob("*.[tT][xX][tT]"))
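Note that rglob yields Path objects rather than strings, and that a pattern like "*.txt" can also match directories. If plain string paths to regular files are wanted, one possible variant is sketched below; the helper name find_by_suffix is hypothetical, and comparing the lowercased suffix is an alternative to the [tT][xX][tT] pattern rather than the original answer's approach.

```python
from pathlib import Path


def find_by_suffix(top, suffix=".txt"):
    """Return files under `top` whose suffix matches, ignoring case.

    `find_by_suffix` is a hypothetical helper name for this sketch.
    """
    return [
        str(p)  # convert Path objects to plain strings
        for p in Path(top).rglob("*")
        if p.is_file() and p.suffix.lower() == suffix.lower()
    ]
```

Calling find_by_suffix(".") would then play the same role as the rglob one-liner above, with directories filtered out.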


08-28 08:18