Problem description
I have an archive that I do not want to extract, but I need to check whether each of its entries is a file or a directory.
os.path.isdir and os.path.isfile do not work because the entries live inside an archive rather than on the filesystem. The archive can be any one of tar, bz2, zip, or tar.gz (so I cannot rely on a single format-specific library). The code should also work on any platform, such as Linux or Windows. How can I do this?
Recommended answer
You've stated that you need to support "tar, bz2, zip or tar.gz". Python's tarfile module automatically handles gzip- and bzip2-compressed tar files, so there are really only two types of archive that you need to support: tar and zip. (bz2 by itself is not an archive format; it is just compression.)
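As a minimal sketch of that point: with its default mode, tarfile.open() detects the compression itself, so one call covers plain, gzip-compressed, and bzip2-compressed tar files. The helper name list_tar_names is hypothetical and used only for illustration.

```python
import tarfile


def list_tar_names(path):
    """Return the member names of a tar archive, whatever its compression."""
    # The default mode 'r' behaves like 'r:*': tarfile probes for
    # gzip or bzip2 compression on its own, so no mode switch is needed.
    with tarfile.open(path) as archive:
        return archive.getnames()
```

The same function can be called on x.tar, x.tar.gz, or x.tar.bz2 without changes.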
You can determine whether a given file is a tar file with tarfile.is_tarfile(). This also works on tar files compressed with gzip or bzip2. Within a tar file, you can determine whether a member is a directory using TarInfo.isdir(), or a regular file using TarInfo.isfile().
Similarly, you can determine whether a file is a zip file using zipfile.is_zipfile(). The zipfile module originally had no method to distinguish directories from normal files (ZipInfo.is_dir() was added in Python 3.6), but entries whose names end with / are directories.
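On Python 3.6 and later, the trailing-slash test can be left to ZipInfo.is_dir(). The following is a small sketch under that version assumption; the function name zip_entry_types is hypothetical.

```python
import zipfile


def zip_entry_types(path):
    """Map each entry name in a zip archive to 'directory' or 'file'."""
    with zipfile.ZipFile(path) as archive:
        return {
            # is_dir() (Python 3.6+) reports entries whose name ends with '/'.
            info.filename: 'directory' if info.is_dir() else 'file'
            for info in archive.infolist()
        }
```

On older interpreters, name.endswith('/') remains the fallback.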
So, given a file name, you can do this:
    import zipfile
    import tarfile

    filename = 'test.tgz'

    if tarfile.is_tarfile(filename):
        # Covers plain tar as well as gzip- and bzip2-compressed tar files.
        with tarfile.open(filename) as f:
            for info in f:
                if info.isdir():
                    file_type = 'directory'
                elif info.isfile():
                    file_type = 'file'
                else:
                    # Symlinks, devices, FIFOs and so on end up here.
                    file_type = 'unknown'
                print('{} is a {}'.format(info.name, file_type))
    elif zipfile.is_zipfile(filename):
        with zipfile.ZipFile(filename) as f:
            for name in f.namelist():
                # Directory entries in a zip archive have names ending in '/'.
                print('{} is a {}'.format(name, 'directory' if name.endswith('/') else 'file'))
    else:
        print('{} is not an accepted archive file'.format(filename))
When run on a tar file with this structure:
(py2)[mhawke@localhost tmp]$ tar tvfz /tmp/test.tgz
drwxrwxr-x mhawke/mhawke 0 2016-02-29 12:38 x/
lrwxrwxrwx mhawke/mhawke 0 2016-02-29 12:38 x/4 -> 3
drwxrwxr-x mhawke/mhawke 0 2016-02-28 21:14 x/3/
drwxrwxr-x mhawke/mhawke 0 2016-02-28 21:14 x/3/4/
-rw-rw-r-- mhawke/mhawke 0 2016-02-28 21:14 x/3/4/zzz
drwxrwxr-x mhawke/mhawke 0 2016-02-28 21:13 x/2/
-rw-rw-r-- mhawke/mhawke 0 2016-02-28 21:13 x/2/aa
drwxrwxr-x mhawke/mhawke 0 2016-02-28 21:13 x/1/
-rw-rw-r-- mhawke/mhawke 0 2016-02-28 21:13 x/1/abc
-rw-rw-r-- mhawke/mhawke 0 2016-02-28 21:13 x/1/ab
-rw-rw-r-- mhawke/mhawke 0 2016-02-28 21:13 x/1/a
The output is:
x is a directory
x/4 is a unknown
x/3 is a directory
x/3/4 is a directory
x/3/4/zzz is a file
x/2 is a directory
x/2/aa is a file
x/1 is a directory
x/1/abc is a file
x/1/ab is a file
x/1/a is a file
Notice that x/4 is "unknown" because it is a symbolic link, for which neither isdir() nor isfile() returns True.
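If symbolic links should be reported rather than lumped into "unknown", TarInfo also provides issym() and islnk() (the latter for hard links). A possible sketch follows; the helper names tar_member_type and tar_entry_types are hypothetical.

```python
import tarfile


def tar_member_type(info):
    """Classify one TarInfo object with a short label."""
    if info.isdir():
        return 'directory'
    if info.isfile():
        return 'file'
    if info.issym():
        return 'symlink'
    if info.islnk():
        return 'hard link'
    # Devices, FIFOs and other special members.
    return 'unknown'


def tar_entry_types(path):
    """Return (name, type) pairs for every member, without extracting."""
    with tarfile.open(path) as archive:
        return [(info.name, tar_member_type(info)) for info in archive]
```

Because only the archive headers are inspected, this works the same way on Linux and Windows, even where the platform could not create the symlink on disk.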
There is no easy way, with zipfile, to distinguish a symlink (or other file types) from a directory or normal file. The information is there in the ZipInfo.external_attr attribute, but it's messy to get it back out:
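    import stat

    # f is an open zipfile.ZipFile; entry 1 is assumed here to be the symlink.
    linked_file = f.filelist[1]
    # The high 16 bits of external_attr hold the Unix mode bits.
    is_symlink = stat.S_ISLNK(linked_file.external_attr >> 16)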
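The same check can be folded into a small classifier for every entry in a zip archive. This is only a sketch: it assumes the archive was written by a tool that stores Unix mode bits in external_attr (archives created on Windows often do not), and the names zip_member_type and zip_entry_types_with_links are hypothetical.

```python
import stat
import zipfile


def zip_member_type(info):
    """Classify one ZipInfo object as 'symlink', 'directory' or 'file'."""
    # Upper 16 bits: Unix st_mode, if the creating tool recorded it.
    mode = info.external_attr >> 16
    if stat.S_ISLNK(mode):
        return 'symlink'
    # Fall back to the trailing slash when no Unix mode is stored.
    if stat.S_ISDIR(mode) or info.filename.endswith('/'):
        return 'directory'
    return 'file'


def zip_entry_types_with_links(path):
    """Return (name, type) pairs for every entry, without extracting."""
    with zipfile.ZipFile(path) as archive:
        return [(info.filename, zip_member_type(info))
                for info in archive.infolist()]
```

When the mode bits are absent, a symlink entry is indistinguishable from a regular file and is reported as 'file'.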