Problem description
I am trying to find the extension of a file, given its name as a string. I know I can use the function os.path.splitext, but it does not work as I expect when the file extension is .tar.gz or .tar.bz2: it gives the extensions as gz and bz2 instead of tar.gz and tar.bz2 respectively.
So I decided to find the extension myself using pattern matching.
import re
print(re.compile(r'^.*[.](?P<ext>tar\.gz|tar\.bz2|\w+)$').match('a.tar.gz').group('ext'))
# prints gz, but I want 'tar.gz'
print(re.compile(r'^.*[.](?P<ext>tar\.gz|tar\.bz2|\w+)$').match('a.tar.bz2').group('ext'))
# prints bz2, but I want 'tar.bz2'
I am using the named group (?P<ext>...) in my pattern because I also want to capture the extension.
Please help.
Recommended answer
>>> import re
>>> print(re.compile(r'^.*[.](?P<ext>tar\.gz|tar\.bz2|\w+)$').match('a.tar.gz').group('ext'))
gz
>>> print(re.compile(r'^.*?[.](?P<ext>tar\.gz|tar\.bz2|\w+)$').match('a.tar.gz').group('ext'))
tar.gz
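The ? after .* makes the quantifier non-greedy, so it matches as few characters as possible. The greedy .* consumes "a.tar" (everything up to the last dot), leaving only "gz" for the group. With .*?, the prefix stops at the first dot from which the rest of the pattern can match, so the group captures "tar.gz".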
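To see the non-greedy pattern on more than one file name, here is a minimal sketch that wraps it in a small helper. The names EXT_RE and get_extension are hypothetical and not part of the original answer, and returning None when there is no dot is an assumption about the desired behavior.

```python
import re

# Same pattern as in the answer, with the non-greedy prefix `.*?`.
# EXT_RE and get_extension are illustrative names, not from the original post.
EXT_RE = re.compile(r'^.*?[.](?P<ext>tar\.gz|tar\.bz2|\w+)$')


def get_extension(filename):
    """Return the extension without the leading dot, or None if nothing matches."""
    match = EXT_RE.match(filename)
    return match.group('ext') if match else None


for name in ['a.tar.gz', 'a.tar.bz2', 'notes.txt', 'my.file.txt', 'README']:
    print(name, '->', get_extension(name))
```

Because the prefix is lazy, the regex engine tries the earliest dot first and only moves to a later dot when the rest of the pattern cannot match, which is why 'my.file.txt' still yields 'txt' while 'a.tar.gz' yields 'tar.gz'.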