Problem description
I've got a line from a CSV file as a string, with " as the field encloser and , as the field separator. Sometimes there are " characters in the data that break the field enclosers. I'm looking for a regex to remove these stray " characters.
My string looks like this:
my $csv = qq~"123456","024003","Stuff","","28" stuff with more stuff","2"," 1.99 ","",""~;
I've looked at this, but I don't understand how to tell it to remove only the quotes that are:
1. not at the beginning of the string
2. not at the end of the string
3. not preceded by a ,
4. not followed by a ,
I managed to tell it to handle conditions 3 and 4 at the same time with this line of code:
$csv =~ s/(?<!,)"(?!,)//g;
However, I cannot fit ^ and $ in there, since neither the lookahead nor the lookbehind accepts an alternation such as (?<!(^|,)).
Is there a way to achieve this with a regex alone, without splitting the string up and removing the quote from each element?
Recommended answer
This should work:
$csv =~ s/(?<=[^,])"(?=[^,])//g
Conditions 1 and 2 imply that there must be at least one character before and after the quote, hence the positive lookarounds. Conditions 3 and 4 imply that these characters can be anything but a comma.
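To see the substitution in action, here is a minimal sketch that runs it on the sample string from the question. The second pattern is not part of the answer: it is an assumed alternative that stacks separate fixed-width negative lookarounds instead of putting an alternation inside a single lookbehind. The variable names $positive and $negative are made up for illustration.

```perl
use strict;
use warnings;

# Sample line from the question; the stray quote follows 28 in the fifth field.
my $csv = qq~"123456","024003","Stuff","","28" stuff with more stuff","2"," 1.99 ","",""~;

# The answer's pattern: a quote that has a non-comma character on both sides.
(my $positive = $csv) =~ s/(?<=[^,])"(?=[^,])//g;

# Assumed alternative: one negative lookaround per condition (1 to 4),
# each of fixed width, so no alternation is needed inside a lookbehind.
(my $negative = $csv) =~ s/(?<!^)(?<!,)"(?!,)(?!$)//g;

print "$positive\n";
# prints: "123456","024003","Stuff","","28 stuff with more stuff","2"," 1.99 ","",""
print $positive eq $negative ? "same result\n" : "different results\n";
# prints: same result
```

Note that [^,] also matches a newline, so if the line still carries its trailing newline, the answer's pattern would strip the closing quote of the last field. Calling chomp on the line first avoids that.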