This article covers how to write UTF-8 encoded CSV files with Python's DictWriter.
Problem description
- I have a list of dictionaries containing unicode strings.
- csv.DictWriter can write a list of dictionaries into a CSV file.
- I want the CSV file to be encoded in UTF-8.
- The csv module cannot handle converting unicode strings into UTF-8.
- The csv module documentation has an example for converting everything to UTF-8:
def utf_8_encoder(unicode_csv_data):
    for line in unicode_csv_data:
        yield line.encode('utf-8')
It also has a class UnicodeWriter.
But... how do I make DictWriter work with these? Wouldn't they have to inject themselves in the middle of it, to catch the disassembled dictionaries and encode them before it writes them to the file? I don't get it.
Solution
If using Python 2.7 (the first 2.x release with dict comprehensions), use a dict comprehension to encode each value to UTF-8 before passing the dictionary to DictWriter:
# coding: utf-8
import csv
D = {'name':u'马克','pinyin':u'mǎkè'}
f = open('out.csv','wb')
f.write(u'\ufeff'.encode('utf8')) # BOM (optional...Excel needs it to open UTF-8 file properly)
w = csv.DictWriter(f,sorted(D.keys()))
w.writeheader()
w.writerow({k:v.encode('utf8') for k,v in D.items()})
f.close()
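The snippet above is Python 2 code: it opens the file in binary mode and encodes every value by hand. As a rough sketch, assuming Python 3 (where the csv module works on text and the file object does the encoding), the same file could be produced as follows. The helper name `write_utf8_csv` and the temporary output path are illustrative assumptions, not part of the original answer.

```python
import csv
import os
import tempfile

def write_utf8_csv(path, rows, fieldnames):
    """Write a list of dicts to a UTF-8 CSV file (Python 3 sketch)."""
    # 'utf-8-sig' writes the BOM that Excel looks for; use 'utf-8' to omit it.
    # newline='' lets the csv module control the line endings itself.
    with open(path, 'w', encoding='utf-8-sig', newline='') as f:
        w = csv.DictWriter(f, fieldnames)
        w.writeheader()
        w.writerows(rows)

D = {'name': '马克', 'pinyin': 'mǎkè'}
path = os.path.join(tempfile.gettempdir(), 'out.csv')  # any writable path works
write_utf8_csv(path, [D], sorted(D.keys()))
```

No per-value `encode` call is needed here, because in Python 3 the values are already text and encoding them to bytes would make the csv module write their `b'...'` representation instead.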
You can use this idea to update UnicodeWriter to DictUnicodeWriter:
# coding: utf-8
import csv
import cStringIO
import codecs
class DictUnicodeWriter(object):
    def __init__(self, f, fieldnames, dialect=csv.excel, encoding="utf-8", **kwds):
        # Redirect output to a queue
        self.queue = cStringIO.StringIO()
        self.writer = csv.DictWriter(self.queue, fieldnames, dialect=dialect, **kwds)
        self.stream = f
        self.encoder = codecs.getincrementalencoder(encoding)()
    def writerow(self, D):
        self.writer.writerow({k:v.encode("utf-8") for k,v in D.items()})
        # Fetch UTF-8 output from the queue ...
        data = self.queue.getvalue()
        data = data.decode("utf-8")
        # ... and reencode it into the target encoding
        data = self.encoder.encode(data)
        # write to the target stream
        self.stream.write(data)
        # empty queue
        self.queue.truncate(0)
    def writerows(self, rows):
        for D in rows:
            self.writerow(D)
    def writeheader(self):
        self.writer.writeheader()
D1 = {'name':u'马克','pinyin':u'Mǎkè'}
D2 = {'name':u'美国','pinyin':u'Měiguó'}
f = open('out.csv','wb')
f.write(u'\ufeff'.encode('utf8')) # BOM (optional...Excel needs it to open UTF-8 file properly)
w = DictUnicodeWriter(f,sorted(D1.keys()))
w.writeheader()
w.writerows([D1,D2])
f.close()
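The class above exists mainly so that the output can be re-encoded into a target encoding other than UTF-8. As a hedged sketch, assuming Python 3, `io.TextIOWrapper` can take over the role of the queue plus incremental encoder: the csv module writes text, and the wrapper encodes it for a binary stream. The function name `write_dicts` and the choice of `utf-16-le` as the target encoding are assumptions made for illustration.

```python
import csv
import io

def write_dicts(stream, rows, fieldnames, encoding='utf-8'):
    """Write dict rows as CSV to a binary stream in the given encoding."""
    # The wrapper encodes the text produced by csv into the target encoding.
    text = io.TextIOWrapper(stream, encoding=encoding, newline='')
    w = csv.DictWriter(text, fieldnames)
    w.writeheader()
    w.writerows(rows)
    text.flush()
    text.detach()  # leave the underlying binary stream open for the caller

D1 = {'name': '马克', 'pinyin': 'Mǎkè'}
D2 = {'name': '美国', 'pinyin': 'Měiguó'}
buf = io.BytesIO()  # stands in for a file opened in binary mode
write_dicts(buf, [D1, D2], sorted(D1.keys()), encoding='utf-16-le')
```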