Question
For debugging purposes, I need to recursively search a directory for all files that start with a UTF-8 byte order mark (BOM). My current solution is a simple shell script:
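find . -type f |
while IFS= read -r file
do
    # Compare the first three bytes with the UTF-8 BOM (EF BB BF); $'...' quoting requires bash.
    if [ "$(head -c 3 -- "$file")" = $'\xef\xbb\xbf' ]
    then
        echo "found BOM in: $file"
    fi
done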
Or, if you prefer short, unreadable one-liners:
find . -type f | while IFS= read -r file; do [ "$(head -c3 -- "$file")" = $'\xef\xbb\xbf' ] && echo "found BOM in: $file"; done
It doesn't work with filenames that contain a line break, but such files are not expected anyway.
Is there a shorter or more elegant solution?
Are there any interesting text editors or macros for text editors?
Recommended answer
What about this one simple command, which not only finds but also clears the nasty BOM? :)
find . -type f -exec sed '1s/^\xEF\xBB\xBF//' -i {} \;
I love "find" :)
Warning: the above will also modify binary files that begin with those three bytes.
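For a more cautious cleanup, here is a minimal sketch that rewrites only the files whose first three bytes are EF BB BF, rather than passing every file through sed. It assumes bash plus the common `od`, `tail`, and `mktemp` utilities. The function names `starts_with_bom` and `strip_bom_in_tree` are hypothetical, introduced here for illustration. It still cannot tell a text file from a binary file that happens to begin with those bytes.

```shell
#!/usr/bin/env bash
# Sketch: strip a leading UTF-8 BOM only from files that actually start with one.
# starts_with_bom and strip_bom_in_tree are hypothetical helper names.

# Succeeds if the first three bytes of file $1 are EF BB BF.
starts_with_bom() {
  [ "$(od -An -tx1 -N3 < "$1" | tr -d ' \n')" = "efbbbf" ]
}

# Walk the tree under $1 (default: current directory) and strip leading BOMs.
strip_bom_in_tree() {
  local dir=${1:-.} file scratch
  find "$dir" -type f -print0 |
    while IFS= read -r -d '' file; do
      starts_with_bom "$file" || continue
      scratch=$(mktemp) || continue
      # tail -c +4 copies everything from the fourth byte onward.
      # Writing back with cat keeps the file's inode and permissions.
      if tail -c +4 < "$file" > "$scratch" && cat "$scratch" > "$file"; then
        printf 'stripped BOM from: %s\n' "$file"
      fi
      rm -f "$scratch"
    done
}
```

Calling `strip_bom_in_tree some/dir` leaves files without a BOM untouched, so their modification times do not change.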
If you just want to list the files containing a BOM, use this one (it matches the three bytes anywhere in a file, not only at the start):
grep -rl $'\xEF\xBB\xBF' .
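Because grep reports a file when the three bytes occur anywhere in it, and the loop in the question breaks on filenames containing newlines, a stricter listing can be sketched as follows. This assumes bash (for `read -d ''`) and a `find` that supports `-print0`. The function name `list_bom_files` is hypothetical. It inspects only the first three bytes of each file and passes filenames NUL-delimited.

```shell
#!/usr/bin/env bash
# Sketch: print every regular file under a directory that starts with a UTF-8 BOM.
# list_bom_files is a hypothetical helper name.
list_bom_files() {
  local dir=${1:-.} file
  # -print0 and read -d '' pass filenames NUL-delimited, so names
  # containing spaces or newlines are handled correctly.
  find "$dir" -type f -print0 |
    while IFS= read -r -d '' file; do
      # Dump the first three bytes as hex and compare with EF BB BF.
      if [ "$(od -An -tx1 -N3 < "$file" | tr -d ' \n')" = "efbbbf" ]; then
        printf '%s\n' "$file"
      fi
    done
}
```

Comparing a hex dump instead of the raw bytes avoids bash warnings about NUL bytes in command substitutions when a binary file is inspected.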