SimpleJson handling of same named entities


Problem description

I'm using the Alchemy API in App Engine, so I'm using the simplejson library to parse responses. The problem is that the responses contain entries that have the same name:

 {
    "status": "OK",
    "usage": "By accessing AlchemyAPI or using information generated by AlchemyAPI, you are agreeing to be bound by the AlchemyAPI Terms of Use: http://www.alchemyapi.com/company/terms.html",
    "url": "",
    "language": "english",
    "entities": [
        {
            "type": "Person",
            "relevance": "0.33",
            "count": "1",
            "text": "Michael Jordan",
            "disambiguated": {
                "name": "Michael Jordan",
                "subType": "Athlete",
                "subType": "AwardWinner",
                "subType": "BasketballPlayer",
                "subType": "HallOfFameInductee",
                "subType": "OlympicAthlete",
                "subType": "SportsLeagueAwardWinner",
                "subType": "FilmActor",
                "subType": "TVActor",
                "dbpedia": "http://dbpedia.org/resource/Michael_Jordan",
                "freebase": "http://rdf.freebase.com/ns/guid.9202a8c04000641f8000000000029161",
                "umbel": "http://umbel.org/umbel/ne/wikipedia/Michael_Jordan",
                "opencyc": "http://sw.opencyc.org/concept/Mx4rvViVq5wpEbGdrcN5Y29ycA",
                "yago": "http://mpii.de/yago/resource/Michael_Jordan"
            }
        }
    ]
}

So the problem is that "subType" is repeated, and the dict that loads returns contains just "TVActor" for that key rather than a list. Is there any way around this?

Recommended answer

RFC 4627, which defines application/json, says:

An object is an unordered collection of zero or more name/value pairs

And:

The names within an object SHOULD be unique.

This means that AlchemyAPI should not return multiple "subType" names inside the same object and claim that the result is JSON.
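To make the symptom concrete, here is a minimal sketch of what a default decode does with repeated names. It uses the standard-library json module rather than simplejson, on the assumption that both expose the same loads() call; the sample string is a shortened, hypothetical version of the response above.

```python
import json  # standard library; assumed to behave like simplejson here

# Shortened, hypothetical sample with a repeated "subType" name.
raw = '{"type": "Person", "subType": "Athlete", "subType": "TVActor"}'

# With no hook, each repeated name overwrites the previous value,
# so only the last "subType" survives in the resulting dict.
parsed = json.loads(raw)
print(parsed["subType"])
```

The earlier values are dropped silently, with no error or warning, which is why the question sees only "TVActor".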

You could request the same data in XML format (outputMode=xml) to avoid the ambiguity, or convert the values of duplicate keys into lists:

import simplejson as json
from collections import defaultdict

def multidict(ordered_pairs):
    """Convert the values of duplicate keys to lists."""
    # read all values into lists
    d = defaultdict(list)
    for k, v in ordered_pairs:
        d[k].append(v)

    # unpack lists that have only 1 item
    for k, v in d.items():
        if len(v) == 1:
            d[k] = v[0]
    return dict(d)

# text is the raw JSON string to decode (see the example below)
print(json.JSONDecoder(object_pairs_hook=multidict).decode(text))

Example

text = """{
  "type": "Person",
  "subType": "Athlete",
  "subType": "AwardWinner"
}"""

Output

{u'subType': [u'Athlete', u'AwardWinner'], u'type': u'Person'}
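One thing to keep in mind with this hook is that a key seen once stays a plain string while a key seen several times becomes a list, so calling code has to handle both shapes. The sketch below applies the same multidict hook to a nested response shaped like the one in the question and normalizes the result with a small helper. The as_list helper and the trimmed response string are illustrative assumptions, and the standard-library json module stands in for simplejson.

```python
import json
from collections import defaultdict

def multidict(ordered_pairs):
    """Collect the values of duplicate keys into lists."""
    d = defaultdict(list)
    for k, v in ordered_pairs:
        d[k].append(v)
    # Keys that occurred once keep their single value unwrapped.
    return {k: (v[0] if len(v) == 1 else v) for k, v in d.items()}

def as_list(value):
    """Hypothetical helper: wrap a lone value so callers always get a list."""
    return value if isinstance(value, list) else [value]

# Trimmed, hypothetical response in the shape shown in the question.
response = """{
  "entities": [
    {
      "text": "Michael Jordan",
      "disambiguated": {
        "name": "Michael Jordan",
        "subType": "Athlete",
        "subType": "FilmActor"
      }
    }
  ]
}"""

# The hook runs for every JSON object, including the nested ones.
data = json.loads(response, object_pairs_hook=multidict)
for entity in data["entities"]:
    sub_types = as_list(entity["disambiguated"].get("subType", []))
    print(entity["text"], sub_types)
```

Normalizing at the point of use keeps the hook generic. A value that was already a JSON array passes through as_list unchanged, so the helper cannot tell such an array apart from merged duplicates.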

