Perl regex: how to remove quotes inside quoted fields from a CSV line

Problem description

I've got a line from a CSV file, held as a string, with " as the field encloser and , as the field separator. Sometimes there are " characters in the data that break the field enclosers. I'm looking for a regex to remove these ".

My string looks like this:

my $csv = qq~"123456","024003","Stuff","","28" stuff with more stuff","2"," 1.99 ","",""~;

I've looked at this but I don't understand how to tell it to only remove quotes that are

  1. not at the beginning of the string
  2. not at the end of the string
  3. not preceded by a ,
  4. not followed by a ,

I managed to get it to handle conditions 3 and 4 at the same time with this line of code:

$csv =~ s/(?<!,)"(?!,)//g;

However, I cannot fit the ^ and $ in there since the lookahead and lookbehind both do not like being written as (?<!(^|,)).

Is there a way to achieve this only with a regex besides splitting the string up and removing the quote from each element?

Recommended answer

This should work:

$csv =~ s/(?<=[^,])"(?=[^,])//g;

Conditions 1 and 2 imply that there must be at least one character before and after the quote, hence the positive lookarounds. Conditions 3 and 4 imply that these characters can be anything but a comma.
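
To make the difference concrete, here is a minimal sketch that runs the attempt from the question and the regex from the answer against the sample line. The variable names ($line, $attempt, $fixed, $stacked) are made up for illustration. The third substitution is not part of the original answer: it is an assumed alternative that stacks zero-width assertions instead of nesting ^ and $ inside a single lookaround, and it should give the same result on this sample.

```perl
use strict;
use warnings;

# Sample line from the question: the fifth field has a stray quote after "28".
my $line = qq~"123456","024003","Stuff","","28" stuff with more stuff","2"," 1.99 ","",""~;

# Attempt from the question: negative lookarounds only (conditions 3 and 4).
# At the start and end of the string there is no comma next to the quote,
# so the outermost quotes are removed as well.
(my $attempt = $line) =~ s/(?<!,)"(?!,)//g;

# Regex from the answer: the positive lookarounds require a real,
# non-comma character on both sides, which also rules out the string boundaries.
(my $fixed = $line) =~ s/(?<=[^,])"(?=[^,])//g;

# Assumed alternative (not from the answer): one assertion per condition,
# stacked next to each other rather than alternated inside a lookbehind.
(my $stacked = $line) =~ s/(?<!^)(?<!,)"(?!,)(?!$)//g;

print "attempt: $attempt\n";
print "fixed:   $fixed\n";
print "stacked: $stacked\n";
```

With the answer's regex only the quote after 28 is removed, while the empty fields ("") and the outer quotes stay intact. Note that no regex of this kind can repair a stray quote that sits directly next to a comma inside the data, because it is indistinguishable from a real field boundary.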
