Problem description
Let's say I have a file like this (shown here as a hexdump, not the actual content):
0000000  \r  \n  \r  \n   T   h   i   s       i   s       a       f   i
0000010   l   e  \r  \n                              \r  \n   H   e   r
0000020   e   '   s       s   o   m   e       t   e   x   t  \r  \n
000002f
If I run the following:
#!/usr/bin/perl
use strict;
use warnings;
use File::Slurp;
$_ = read_file("file.txt");
s/^\s*$//mg;
print;
The output produced is:
0000000  \n   T   h   i   s       i   s       a       f   i   l   e  \r
0000010  \n  \n   H   e   r   e   '   s       s   o   m   e       t   e
0000020   x   t  \r  \n
Apparently, the blank lines aren't stripped.
Can anyone point out what I'm doing wrong?
Recommended answer
In regexes, the $ assertion can be a bit confusing. According to the docs, it "Match[es] the end of the line (or before newline at the end)". So it behaves roughly like
(?=\n\z)|\z
With the /m modifier, this changes to
(?=\n)|\z
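To make the two equivalences concrete, here is a small sketch that lists the offsets at which $ matches, with and without /m, and compares them with the lookahead forms given above. The helper name match_offsets and the sample string are made up for illustration.

```perl
use strict;
use warnings;

# Hypothetical helper: collect the offsets at which a zero-width
# pattern matches inside the given text.
sub match_offsets {
    my ($re, $text) = @_;
    my @offsets;
    while ($text =~ /$re/g) {
        push @offsets, pos($text);
    }
    return @offsets;
}

my $text = "a\nb\n";    # made-up sample: two short lines

print "plain \$:   ", join(",", match_offsets(qr/$/,          $text)), "\n";   # 3,4
print "(?=\\n\\z)|\\z: ", join(",", match_offsets(qr/(?=\n\z)|\z/, $text)), "\n";   # 3,4
print "\$ with /m: ", join(",", match_offsets(qr/$/m,         $text)), "\n";   # 1,3,4
print "(?=\\n)|\\z:  ", join(",", match_offsets(qr/(?=\n)|\z/,  $text)), "\n";   # 1,3,4
```

In every case the match is zero-width and sits in front of the \n, never on it.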
This means that the \n is not included in the matched substring, so your substitution removes the whitespace in front of it but leaves the newline itself in place. You want:
s/^\s*\n//mg;
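As a rough check, the following sketch runs both substitutions on a simplified version of the file from the question (the extra spaces on the blank line are left out) and prints the results with \r and \n made visible. The helper names are hypothetical.

```perl
use strict;
use warnings;

# Hypothetical helpers wrapping the two substitutions under discussion.
sub strip_with_dollar {
    my ($text) = @_;
    $text =~ s/^\s*$//mg;     # $ is zero-width: the last \n of each run survives
    return $text;
}

sub strip_with_newline {
    my ($text) = @_;
    $text =~ s/^\s*\n//mg;    # the \n is part of the match and is removed too
    return $text;
}

# Simplified stand-in for the file in the question.
my $sample = "\r\n\r\nThis is a file\r\n\r\nHere's some text\r\n";

for my $result (strip_with_dollar($sample), strip_with_newline($sample)) {
    (my $visible = $result) =~ s/\r/\\r/g;   # show carriage returns as \r
    $visible =~ s/\n/\\n/g;                  # show newlines as \n
    print "$visible\n";
}
# prints:
# \nThis is a file\r\n\nHere's some text\r\n
# This is a file\r\nHere's some text\r\n
```

The first line of output has the same shape as the hexdump in the question: a lone \n is left behind wherever a blank line used to be.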
Some other points in your code should still be addressed. Mainly, it makes little sense to read in the whole file at once and run a regex over it. Rather, I'd do:
use strict; use warnings; use autodie;
open my $fh, "<", "file.txt";
while (<$fh>) {
    print if /\S/; # print if this line contains at least one non-space character
                   # this elegantly skips whitespace-only lines.
}
This assumes that each line ending consists entirely of whitespace characters and ends with \n, which holds for both \r\n and \n line endings. Otherwise, assign custom line endings like this:
local $/ = local $\ = "\r\n"; # input and output line endings
while (<$fh>) {
    chomp;         # remove line endings
    print if /\S/; # print adds the line ending again.
}
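For experimenting without a file on disk, the same idea can be sketched with in-memory filehandles. The function name strip_blank_crlf_lines is hypothetical, and the sketch assumes that no CRLF translation layer is active on the handles, which should be checked on Windows.

```perl
use strict;
use warnings;

# Hypothetical helper: drop whitespace-only lines from CRLF-terminated text.
sub strip_blank_crlf_lines {
    my ($input) = @_;
    my $output = '';
    open my $in,  '<', \$input  or die "cannot read from string: $!";
    open my $out, '>', \$output or die "cannot write to string: $!";

    local $/ = "\r\n";    # readline splits on CRLF, and chomp removes it
    local $\ = "\r\n";    # every print call appends CRLF
    while (my $line = <$in>) {
        chomp $line;
        print {$out} $line if $line =~ /\S/;
    }
    close $out;
    return $output;
}

# $\ is back to its default here, so nothing extra is appended.
print strip_blank_crlf_lines("\r\n   \r\nThis is a file\r\n\r\nHere's some text\r\n");
```

Because both variables are localized inside the function, the rest of the program keeps the default line-ending behaviour.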