Question
How could I trim all trailing spaces from a text file using the Windows command prompt?
Recommended answer
The DosTips RTRIM function that Ben Hocking cites can be used to create a script that can right trim each line in a text file. However, the function is relatively slow.
DosTips user (and moderator) aGerman developed a very efficient right trim algorithm. He implemented the algorithm as a batch "macro" - an interesting concept of storing complex mini scripts in environment variables that can be executed from memory. Macros with arguments are a major discussion topic in their own right and are not relevant to this question.
I have extracted aGerman's algorithm and put it in the following batch script. The script expects the name of a text file as the only parameter and proceeds to right trim the spaces off each line in the file.
@echo off
setlocal enableDelayedExpansion

rem Build a string of 4096 spaces by doubling a single space 12 times
set "spcs= "
for /l %%n in (1 1 12) do set "spcs=!spcs!!spcs!"

rem Prefix every line with its line number so that blank lines are preserved
findstr /n "^" "%~1" >"%~1.tmp"

setlocal disableDelayedExpansion
(
  for /f "usebackq delims=" %%L in ("%~1.tmp") do (
    set "ln=%%L"
    setlocal enableDelayedExpansion
    rem Strip the line number prefix, then trim in chunks of 4096, 2048, ... 1
    set "ln=!ln:*:=!"
    set /a "n=4096"
    for /l %%i in (1 1 13) do (
      if defined ln for %%n in (!n!) do (
        if "!ln:~-%%n!"=="!spcs:~-%%n!" set "ln=!ln:~0,-%%n!"
        set /a "n/=2"
      )
    )
    echo(!ln!
    endlocal
  )
) >"%~1"
del "%~1.tmp" 2>nul
Assuming the script is called rtrimFile.bat, then it can be called from the command line as follows:
rtrimFile "fileName.txt"
A note about performance
The original DosTips rtrim function performs a linear search and defaults to trimming a maximum of 32 spaces. It has to iterate once per space.
aGerman's algorithm uses a binary search and it is able to trim the maximum string size allowed by batch (up to ~8k spaces) in 13 iterations.
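The halving idea can be sketched outside of batch. The following is only an illustration in plain JavaScript, not aGerman's code: the function name rtrimHalving and the maxChunk parameter are made up for this example, and 4096 is chosen to mirror the starting chunk size in the script above.

```javascript
// Illustrative sketch of the halving strategy; names here are hypothetical.
// Starting at maxChunk (4096 by default), try to remove a chunk of exactly
// n trailing spaces, then halve n. 4096 down to 1 is 13 steps.
function rtrimHalving(line, maxChunk) {
  var n = maxChunk || 4096;
  var steps = 0;
  while (n >= 1) {
    steps++;
    // Last n characters; if the line is shorter than n, this is the whole line.
    var tail = line.slice(-n);
    // Remove the chunk only when it consists of exactly n spaces.
    if (tail.length === n && /^ +$/.test(tail)) {
      line = line.slice(0, line.length - n);
    }
    n = Math.floor(n / 2);
  }
  return { text: line, steps: steps };
}
```

With the default chunk size, the loop always runs 13 times regardless of how many spaces are present, and it can remove up to 4096 + 2048 + ... + 1 = 8191 trailing spaces. A linear approach needs one iteration per space instead.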
Unfortunately, batch is very SLOW when it comes to processing text. Even with the efficient rtrim function, it takes ~70 seconds to trim a 1MB file on my machine. The problem is, just reading and writing the file without any modification takes significant time. This answer uses a FOR loop to read the file, coupled with FINDSTR to prefix each line with the line number so that blank lines are preserved. It toggles delayed expansion to prevent ! characters from being corrupted, and uses a search and replace operation to remove the line number prefix from each line. All that before it even begins to do the rtrim.
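The line number trick works because FOR /F skips empty lines, so every line is made non-empty first and the prefix is stripped afterwards. A rough JavaScript model of those two steps follows; numberLines and stripPrefix are hypothetical names used only for this sketch, which mimics the effect of FINDSTR /N and the search and replace, not their implementation.

```javascript
// Mimics the numbering step: every line gets "N:" in front, so none is empty.
function numberLines(lines) {
  var out = [];
  for (var i = 0; i < lines.length; i++) {
    out.push((i + 1) + ":" + lines[i]);
  }
  return out;
}

// Mimics the strip step: drop everything up to and including the FIRST colon,
// so colons inside the original line content are left alone.
function stripPrefix(numbered) {
  return numbered.slice(numbered.indexOf(":") + 1);
}
```

For example, the lines "a", "" and "b:c" become "1:a", "2:" and "3:b:c", and stripping the prefix restores the originals, including the blank line.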
Performance could be nearly doubled by using an alternate file read mechanism that uses set /p. However, the set /p method is limited to ~1k bytes per line, and it strips trailing control characters from each line.
If you need to regularly trim large files, then even a doubling of performance is probably not adequate. Time to download (if possible) any one of many utilities that could process the file in the blink of an eye.
If you can't use non-native software, then you can try VBScript or JScript executed via the CSCRIPT batch command. Either one would be MUCH faster.
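As a rough idea of what the script side could look like, here is a minimal sketch of the trimming logic in plain JavaScript. The function name rtrimText is made up for this example, and only the string handling is shown: under CSCRIPT you would still have to read and write the file yourself (for example with Scripting.FileSystemObject), which is left out here.

```javascript
// Hypothetical helper: removes trailing spaces from every line of a text blob.
// Line endings (CRLF or LF) and blank lines are left untouched.
function rtrimText(text) {
  // " +" is a run of spaces; the lookahead requires that the run is followed
  // by a line break (\r\n or \n) or by the end of the text.
  return text.replace(/ +(?=\r?\n|$)/g, "");
}
```

Like the batch script, this removes spaces only; tabs and other whitespace are kept.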
Update - fast solution using JREPL.BAT
JREPL.BAT is a regular expression find/replace utility that can very efficiently solve the problem. It is pure script (hybrid batch/JScript) that runs natively on any Windows machine from XP onward. No 3rd party exe files are needed.
With JREPL.BAT somewhere within your PATH, you can strip trailing spaces from file "test.txt" with this simple command:
jrepl " +$" "" /f test.txt /o -
If you put the command within a batch script, then you must precede the command with CALL:
call jrepl " +$" "" /f test.txt /o -