regex - Parsing a CSV file with embedded commas in Perl

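I am parsing a CSV file whose values contain embedded commas, so a plain split() on commas cannot separate the fields correctly.

One thing I should note is that the values with embedded commas are always surrounded by parentheses, double quotes, or both.

For example:

(Date, Notional),
"Date, Notional",
"(Date, Notional)"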
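To make the limitation concrete, here is a minimal sketch of a naive split applied to the three sample values. The variable names are illustrative only, and the separator pattern (a comma followed by optional whitespace) is an assumption about the input.

```perl
use strict;
use warnings;

# Sample line built from the three example values above.
my $line = '(Date, Notional), "Date, Notional", "(Date, Notional)"';

# A naive split treats every comma as a separator, including the embedded ones.
my @pieces = split /,\s*/, $line;

print scalar(@pieces), " pieces\n";    # 6 pieces instead of the 3 intended fields
print "<$_>\n" for @pieces;
```

Every field is cut in half at its embedded comma, and the delimiters stay attached to the fragments.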

Also, for reasons I would rather not go into right now, I am trying to do this without using any modules...

Can anyone help me with this?

Best answer

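This should do what you need. It works in much the same way as the code in Text::CSV_PP, but it does not allow escaped characters within a field, since you say you have none.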

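use strict;
use warnings;
use 5.010;    # needed for say and for the (?|...) branch reset group

# Each alternative captures the field body without its delimiters.
my $re = qr/(?| "\( ( [^()"]* ) \)" |  \( ( [^()]* ) \) |  " ( [^"]* ) " |  ( [^,]* ) ) , \s* /x;

my $line = '(Date, Notional 1), "Date, Notional 2", "(Date, Notional 3)"';

# Append a comma so the final field is terminated like all the others.
my @fields = "$line," =~ /$re/g;

say "<$_>" for @fields;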

Output

<Date, Notional 1>
<Date, Notional 2>
<Date, Notional 3>
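For readers who want to see what each alternative does, the same pattern can be spread over several lines and annotated. This is only a sketch: `parse_line` and `$field_re` are hypothetical names that do not appear in the answer above, and the bare field `plain` is added to the sample line to exercise the fourth alternative.

```perl
use strict;
use warnings;
use 5.010;    # the branch reset group (?|...) needs Perl 5.10 or later

# Same pattern as above, one alternative per line.
my $field_re = qr/
    (?|                         # branch reset: every alternative captures into group 1
        "\( ( [^()"]* ) \)"     # "(...)"  quoted and parenthesised
      |  \( ( [^()]*  ) \)      # (...)    parenthesised only
      |  "  ( [^"]*   ) "       # "..."    quoted only
      |     ( [^,]*   )         # bare field with no embedded comma
    )
    , \s*                       # separator comma plus optional whitespace
/x;

# Hypothetical helper: returns the list of fields found in one line.
sub parse_line {
    my ($line) = @_;
    # Appending a comma lets the last field match the same pattern as the others.
    return "$line," =~ /$field_re/g;
}

my @fields = parse_line('(Date, Notional 1), plain, "(Date, Notional 3)"');
say "<$_>" for @fields;
```

The order of the alternatives matters: the most specific delimiters are tried first, so the bare-field branch only applies when none of the delimited forms match.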

Update

Here is a version for older Perls (before version 5.10), which lack the regex branch reset construct. It produces the same output as above.

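use strict;
use warnings;

# Without branch reset each alternative has its own capture group.
my $re = qr/(?: "\( ( [^()"]* ) \)" |  \( ( [^()]* ) \) |  " ( [^"]* ) " |  ( [^,]* ) ) , \s* /x;

my $line = '(Date, Notional 1), "Date, Notional 2", "(Date, Notional 3)"';

# Discard the undefined captures from the alternatives that did not match.
my @fields = grep defined, "$line," =~ /$re/g;

# print rather than say, which is unavailable before Perl 5.10.
print "<$_>\n" for @fields;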
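The `grep defined` step is needed because, without branch reset, every match returns one value per capture group, and only the group in the alternative that matched is defined. A small sketch that counts the raw and filtered values for the same sample line (the `@raw` variable is introduced here purely for illustration):

```perl
use strict;
use warnings;

my $re = qr/(?: "\( ( [^()"]* ) \)" | \( ( [^()]* ) \) | " ( [^"]* ) " | ( [^,]* ) ) , \s* /x;
my $line = '(Date, Notional 1), "Date, Notional 2", "(Date, Notional 3)"';

# Three matches, four capture groups each: twelve values, most of them undef.
my @raw    = "$line," =~ /$re/g;
my @fields = grep defined, @raw;

print scalar(@raw), " raw values, ", scalar(@fields), " defined\n";    # 12 raw values, 3 defined
```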