Question
Is there a way to find files with non-ASCII characters in their names? I could of course use a pipe and filter the file list with perl, but for efficiency I'd like to do it all in find. I tried the following:
find . -type f -name '*[^[:ascii:]]*'
It doesn't work at all.
Edit:
I'm now trying:
find . -type f -regex '.*[^[:ascii:]].*'
It is an Emacs regexp, and Emacs regexps have an [:ascii:] class, but the expression I'm trying to use doesn't work.
Edit 2:
LC_COLLATE=C find . -type f -regex '.*[^!-~].*'
This matches files with non-ASCII characters (complete voodoo...), but it also matches files with a space in the name.
Answer
This seems to work for me in both the default and posix-extended modes:
LC_COLLATE=C find . -regex '.*[^ -~].*'
There could be locale-related issues, though. I don't have a large corpus of non-ASCII filenames to test it on, but it catches the ones I have.
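As a possible variation on the command above, here is a minimal sketch that uses -name instead of -regex, so only the file's own name is tested rather than the whole path (with -regex, a non-ASCII directory name would make every file beneath it match). The function name find_non_ascii is hypothetical, and the sketch assumes a find whose -name test honours bracket ranges by byte value under LC_ALL=C, as GNU find appears to. The range starts at the space character, so names containing spaces are not reported, but names containing control characters are.

```shell
# Hypothetical helper: list regular files whose name contains at least one
# byte outside the printable ASCII range (space, 0x20, through tilde, 0x7E).
find_non_ascii() {
    # LC_ALL=C is assumed to make the range ' -~' compare raw byte values,
    # so every byte of a multibyte UTF-8 character falls outside it.
    # '!' negates the bracket expression in shell-style patterns.
    LC_ALL=C find "${1:-.}" -type f -name '*[! -~]*'
}
```

Calling find_non_ascii with no argument searches the current directory; passing a directory, for example find_non_ascii /some/dir, searches there instead.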