pandas json_normalize produces a confusing `KeyError` message?

Problem description

I'm trying to convert a nested JSON to a Pandas dataframe. I've been using json_normalize with success until I came across a certain JSON. I've made a smaller version of it to recreate the problem.

from pandas.io.json import json_normalize

json = [{"events": [{"schedule": {"date": "2015-08-27",
                                  "location": {"building": "BDC", "floor": 5},
                                  "ID": 815},
                     "group": "A"},
                    {"schedule": {"date": "2015-08-27",
                                  "location": {"building": "BDC", "floor": 5},
                                  "ID": 816},
                     "group": "A"}]}]

Then I run:

json_normalize(json[0],'events',[['schedule','date'],['schedule','location','building'],['schedule','location','floor']])

expecting to see something like this:

ID      group   schedule.date   schedule.location.building   schedule.location.floor
'815'   'A'     '2015-08-27'    'BDC'                        5
'816'   'A'     '2015-08-27'    'BDC'                        5

But instead I get this error:

In [2]: json_normalize(json[0],'events',[['schedule','date'],['schedule','location','building'],['schedule','location','floor']])
---------------------------------------------------------------------------
KeyError                                  Traceback (most recent call last)
<ipython-input-2-b588a9e3ef1d> in <module>()
----> 1 json_normalize(json[0],'events',[['schedule','date'],['schedule','location','building'],['schedule','location','floor']])

/Users/logan/Library/Enthought/Canopy_64bit/User/lib/python2.7/site-packages/pandas/io/json.pyc in json_normalize(data, record_path, meta, meta_prefix, record_prefix)
    739                 records.extend(recs)
    740
--> 741     _recursive_extract(data, record_path, {}, level=0)
    742
    743     result = DataFrame(records)

/Users/logan/Library/Enthought/Canopy_64bit/User/lib/python2.7/site-packages/pandas/io/json.pyc in _recursive_extract(data, path, seen_meta, level)
    734                         meta_val = seen_meta[key]
    735                     else:
--> 736                         meta_val = _pull_field(obj, val[level:])
    737                     meta_vals[key].append(meta_val)
    738

/Users/logan/Library/Enthought/Canopy_64bit/User/lib/python2.7/site-packages/pandas/io/json.pyc in _pull_field(js, spec)
    674         if isinstance(spec, list):
    675             for field in spec:
--> 676                 result = result[field]
    677         else:
    678             result = result[spec]

KeyError: 'schedule'

Recommended answer

In this case, I think you'd just use this (here `data` is the list that the question calls `json`):

In [57]: json_normalize(data[0]['events'])
Out[57]:
  group  schedule.ID schedule.date schedule.location.building  \
0     A          815    2015-08-27                        BDC
1     A          816    2015-08-27                        BDC

   schedule.location.floor
0                        5
1                        5
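
For readers on a newer pandas, here is a minimal sketch of the same call. It assumes pandas 1.0 or later, where the function is exposed as `pandas.json_normalize`; the traceback in the question comes from an older release that only had `pandas.io.json.json_normalize`. The final `rename` is an optional extra, not part of the original answer, for anyone who wants the plain `ID` column name the question expected. Column order can differ between pandas versions, so the sketch does not rely on it.

```python
import pandas as pd

# Same sample data as in the question, named `data` to avoid
# shadowing the standard-library `json` module.
data = [{"events": [
    {"schedule": {"date": "2015-08-27",
                  "location": {"building": "BDC", "floor": 5},
                  "ID": 815},
     "group": "A"},
    {"schedule": {"date": "2015-08-27",
                  "location": {"building": "BDC", "floor": 5},
                  "ID": 816},
     "group": "A"},
]}]

# Normalize the list of event records directly; nested dicts are
# flattened into dotted column names such as "schedule.location.floor".
flat = pd.json_normalize(data[0]["events"])

# Optional: shorten "schedule.ID" to "ID" to match the desired layout.
flat = flat.rename(columns={"schedule.ID": "ID"})

print(flat)
```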

The meta paths ([['schedule','date']...]) are for specifying data at the same level of nesting as your records, i.e. at the same level as 'events'. It doesn't look like json_normalize handles dicts with nested lists particularly well, so you may need to do some manual reshaping if your actual data is much more complicated.
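
To make the point about nesting levels concrete, here is a small sketch, again assuming pandas 1.0 or later and `pandas.json_normalize`. The top-level `"venue"` key and the name `doc` are hypothetical, added only for illustration: `"venue"` sits next to `'events'`, so it is a valid `meta` field, whereas `['schedule', 'date']` lives inside each record and is looked up on the parent dict, which is what raises the `KeyError`.

```python
import pandas as pd

# "venue" is a hypothetical sibling of "events", added for illustration.
doc = {
    "venue": "HQ",
    "events": [
        {"schedule": {"date": "2015-08-27",
                      "location": {"building": "BDC", "floor": 5},
                      "ID": 815},
         "group": "A"},
        {"schedule": {"date": "2015-08-27",
                      "location": {"building": "BDC", "floor": 5},
                      "ID": 816},
         "group": "A"},
    ],
}

# Works: "venue" is at the same level as "events", so it can be
# attached to every record as a meta column.
with_meta = pd.json_normalize(doc, record_path="events", meta=["venue"])
print(with_meta)

# Fails: "schedule" is a key of each record, not of `doc`, so the
# meta lookup on the parent dict raises KeyError.
try:
    pd.json_normalize(doc, record_path="events", meta=[["schedule", "date"]])
    raised = False
except KeyError as exc:
    raised = True
    print("KeyError:", exc)
```

Fields that live inside the records need no `meta` entry at all; they only need the records themselves to be normalized.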

08-21 06:47