Dumping to JSON adds extra double quotes and escapes the existing quotes

Question

I am retrieving Twitter data with a Python tool and dumping it to disk in JSON format. I noticed that the entire data string for each tweet is unintentionally wrapped in double quotes. In addition, every double quote that belongs to the actual JSON formatting is escaped with a backslash.

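Each record is written as one quoted string with backslash-escaped inner quotes. How do I avoid that? The output should be plain JSON objects, without the outer quotes and without the backslashes.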

My file-output code looks like this:

with io.open('data'+self.timestamp+'.txt', 'a', encoding='utf-8') as f:
    f.write(unicode(json.dumps(data, ensure_ascii=False)))
    f.write(unicode('\n'))

The unintended escaping causes problems when the JSON file is read in a later processing step.

Recommended answer

You are double-encoding your JSON strings. data is already a JSON string, so it doesn't need to be encoded again. Passing a string to json.dumps() produces a JSON string literal: the text is wrapped in quotes and its inner quotes are escaped.

>>> import json
>>> not_encoded = {"created_at":"Fri Aug 08 11:04:40 +0000 2014"}
>>> encoded_data = json.dumps(not_encoded)
>>> print encoded_data
{"created_at": "Fri Aug 08 11:04:40 +0000 2014"}
>>> double_encode = json.dumps(encoded_data)
>>> print double_encode
"{\"created_at\": \"Fri Aug 08 11:04:40 +0000 2014\"}"

Just write these directly to your file:

with open('data{}.txt'.format(self.timestamp), 'a') as f:
    f.write(data + '\n')
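If it is not certain whether data arrives as a Python object or as an already-encoded JSON string, a small guard can avoid the second encoding. The following is only a sketch, written for Python 3 (the snippets above are Python 2); the helper name append_record and its path argument are hypothetical and not part of the original tool.

```python
import io
import json


def append_record(path, data):
    """Append one JSON record per line to path (hypothetical helper)."""
    # Encode only when data is not yet a string; an existing JSON string
    # is written as-is so it is never encoded a second time.
    if not isinstance(data, str):
        data = json.dumps(data, ensure_ascii=False)
    # Explicit UTF-8 so non-ASCII tweet text is written consistently.
    with io.open(path, 'a', encoding='utf-8') as f:
        f.write(data + '\n')
```

Calling append_record('data.txt', tweet) with either a dict or a JSON string should then produce the same kind of line. The guard assumes that any string passed in is already valid JSON.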

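For files that were already written with the extra layer of encoding, decoding each line twice should recover the original object. This is a rough sketch for Python 3, assuming every record is a JSON object stored on its own line; decode_record is a hypothetical helper name.

```python
import json


def decode_record(line):
    """Decode one line, undoing one extra layer of JSON encoding if present."""
    value = json.loads(line)
    # A double-encoded record decodes to a str that itself contains JSON text.
    if isinstance(value, str):
        value = json.loads(value)
    return value


original = {"created_at": "Fri Aug 08 11:04:40 +0000 2014"}
double_encoded = json.dumps(json.dumps(original))
print(decode_record(double_encoded)["created_at"])
# prints: Fri Aug 08 11:04:40 +0000 2014
```

Lines that were encoded only once pass through the same function unchanged, so a file with a mix of both kinds of records can be read in a single pass.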
