Problem description
I have a very large CSV file that contains double quoted fields (since
they contain commas). Unfortunately, some of these fields also contain
other double quotes and I made the painful mistake of forgetting to
escape or double the quotes inside the field:
123,"Here is some, text "and some quoted text" where the quotes should
have been doubled",321
Has anyone dealt with this problem before? Any ideas of an algorithm I
can use for a Python script to create a new, repaired CSV file?
TIA,
Ryan
Recommended answer
rec = '''123,"Here is some, text "and some quoted text" where the quotes
should have been doubled",321'''

import csv

# Mark the field-delimiting quotes (those next to a comma) as """, park them
# as ''', double every remaining (embedded) quote, then restore the delimiters.
next(csv.reader([rec.replace(',"', ',"""')
                    .replace('",', '""",')
                    .replace('"""', "'''")
                    .replace('"', '""')
                    .replace("'''", '"')]))

['123',
 'Here is some, text "and some quoted text" where the quotes\nshould have been doubled',
 '321']

:))

Emile
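The replace chain above could also be applied to a whole file one line at a time. What follows is only a minimal sketch under stated assumptions, not Emile's code: it assumes every record sits on a single physical line, and that a quote delimits a field only when it touches a comma or the start or end of the line. The names `repair_line` and `repair_file` are hypothetical. Unlike the replace chain, the regular expression also leaves alone the opening quote of a first field and the closing quote of a last field.

```python
import csv
import re

# A quote is treated as embedded (and therefore doubled) unless it sits at the
# start of the line, at the end of the line, or directly next to a comma.
_EMBEDDED_QUOTE = re.compile(r'(?<=[^,])"(?=[^,])')


def repair_line(line):
    """Double every quote that is not adjacent to a comma or a line boundary.

    Assumption: `line` has already had its trailing newline removed.
    """
    return _EMBEDDED_QUOTE.sub('""', line)


def repair_file(src_path, dst_path):
    """Write a repaired copy of src_path to dst_path, one record per line."""
    with open(src_path, newline="") as src, \
            open(dst_path, "w", newline="") as dst:
        for line in src:
            # Strip the line ending first so a closing quote at the end of
            # the record is not mistaken for an embedded quote.
            dst.write(repair_line(line.rstrip("\r\n")) + "\n")


if __name__ == "__main__":
    broken = '123,"Here is some, text "and some quoted text" here",321'
    fixed = repair_line(broken)
    print(fixed)
    print(next(csv.reader([fixed])))
```

An embedded quote that happens to sit directly next to a comma inside a field is still ambiguous, so this heuristic shares that limitation with the replace chain.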
Thanks Emile! Works almost perfectly, but is there some way I can
adapt this to quote fields that contain a comma in them?
TIA,
Ryan
You originally said "I have a very large CSV file that contains double
quoted fields (since they contain commas)". Are you now saying that
if a field contained a comma, you didn't wrap the field in quotes? Or
is this a separate question unrelated to your original problem?
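If the follow-up is about writing the repaired rows back out, one possible reading is that Python's csv module can do the quoting itself: in its default `csv.QUOTE_MINIMAL` mode, `csv.writer` quotes any field that contains the delimiter, the quote character, or a line break, and doubles embedded quotes. A minimal sketch, assuming the rows have already been parsed into lists of strings; `rows_to_csv_text` is a hypothetical helper name, not something from the thread.

```python
import csv
import io


def rows_to_csv_text(rows):
    """Serialize rows to CSV text, letting csv.writer decide what to quote."""
    buf = io.StringIO()
    # QUOTE_MINIMAL is the default; it is spelled out here for clarity.
    writer = csv.writer(buf, quoting=csv.QUOTE_MINIMAL, lineterminator="\n")
    writer.writerows(rows)
    return buf.getvalue()


if __name__ == "__main__":
    row = ["123", 'Here is some, text "and some quoted text"', "321"]
    print(rows_to_csv_text([row]), end="")
```

Fields without commas, quotes, or line breaks are written unquoted, so only the fields that need protection are wrapped.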