Python Pandas: Flattening Nested JSON

Problem description

I am working with nested JSON data that I am trying to transform into a Pandas DataFrame. The json_normalize function offers a way to accomplish this.

{'locations': [{'accuracy': 17,
                'activity': [{'activity': [{'confidence': 100,
                                            'type': 'STILL'}],
                              'timestampMs': '1542652'}],
                'altitude': -10,
                'latitudeE7': 3777321,
                'longitudeE7': -122423125,
                'timestampMs': '1542654',
                'verticalAccuracy': 2}]}

I used the function to normalize 'locations'; however, the nested 'activity' part is not flattened.

This is my attempt:

activity_data = json_normalize(d, 'locations', ['activity','type', 'confidence'],
                               meta_prefix='Prefix.',
                               errors='ignore')

DataFrame:

[{u'activity': [{u'confidence': 100, u'type': ...   -10.0   NaN 377777377   -1224229340 1542652023196

The 'activity' column still contains nested elements, which I need unpacked into their own columns.

Any suggestions or tips would be much appreciated.

Recommended answer

Use recursion to flatten the nested dicts

  • Thinking Recursively in Python
  • Flattening JSON objects in Python
  • flatten
  • The following function will be used to flatten each nested record:
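def flatten_json(nested_json: dict, exclude=None) -> dict:
    """
    Flatten a nested dict (which may contain further dicts and lists)
    into a single-level dict. Keys are joined with '_' and list items
    are identified by their index.
    """
    # Avoid a mutable default argument; keys listed in exclude are skipped
    exclude = set(exclude or [])
    out = {}

    def flatten(x, name=''):
        if isinstance(x, dict):
            for key, value in x.items():
                if key not in exclude:
                    flatten(value, f'{name}{key}_')
        elif isinstance(x, list):
            for i, item in enumerate(x):
                flatten(item, f'{name}{i}_')
        else:
            # Leaf value: drop the trailing '_' from the accumulated key
            out[name[:-1]] = x

    flatten(nested_json)
    return out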
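The recursive helper above can also be written without recursion, which may be useful for very deeply nested input. The following is only a sketch of that variant, not part of the original answer: the name flatten_json_iterative and its sep parameter are hypothetical, and it omits the exclude option. It is run on a shortened copy of one record from the question to show the key naming scheme.

```python
def flatten_json_iterative(nested_json, sep='_'):
    """Flatten nested dicts/lists with an explicit stack (hypothetical helper)."""
    out = {}
    # Each stack entry is (path so far as a tuple, value still to visit)
    stack = [((), nested_json)]
    while stack:
        path, value = stack.pop()
        if isinstance(value, dict):
            children = list(value.items())
        elif isinstance(value, list):
            children = list(enumerate(value))
        else:
            # Leaf value: join the path parts into a single flat key
            out[sep.join(map(str, path))] = value
            continue
        # Push in reverse so children are popped in their original order
        for key, child in reversed(children):
            stack.append((path + (key,), child))
    return out


# Shortened copy of one record from the question
record = {'accuracy': 17,
          'activity': [{'activity': [{'confidence': 100, 'type': 'STILL'}],
                        'timestampMs': '1542652'}],
          'altitude': -10}

flat = flatten_json_iterative(record)
for key, value in flat.items():
    print(key, '=', value)
```

Each key records the path to its value, so the list index 0 appears wherever the original data had a list. With more than one item in a list, the extra items would get keys containing 1, 2, and so on, which become additional columns in the DataFrame.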
      

Data:

  • To create the dataset, I used the given data.
  • data is the JSON from the question, held as a Python dict.
data = {'locations': [
    {'accuracy': 17, 'activity': [{'activity': [{'confidence': 100, 'type': 'STILL'}], 'timestampMs': '1542652'}], 'altitude': -10, 'latitudeE7': 3777321, 'longitudeE7': -122423125, 'timestampMs': '1542654', 'verticalAccuracy': 2},
    {'accuracy': 17, 'activity': [{'activity': [{'confidence': 100, 'type': 'STILL'}], 'timestampMs': '1542652'}], 'altitude': -10, 'latitudeE7': 3777321, 'longitudeE7': -122423125, 'timestampMs': '1542654', 'verticalAccuracy': 2},
    {'accuracy': 17, 'activity': [{'activity': [{'confidence': 100, 'type': 'STILL'}], 'timestampMs': '1542652'}], 'altitude': -10, 'latitudeE7': 3777321, 'longitudeE7': -122423125, 'timestampMs': '1542654', 'verticalAccuracy': 2},
]}
          

Using flatten_json:

import pandas as pd

# One flattened dict per location becomes one DataFrame row
df = pd.DataFrame([flatten_json(x) for x in data['locations']])
          

Output:

           accuracy  activity_0_activity_0_confidence activity_0_activity_0_type activity_0_timestampMs  altitude  latitudeE7  longitudeE7 timestampMs  verticalAccuracy
                 17                               100                      STILL                1542652       -10     3777321   -122423125     1542654                 2
                 17                               100                      STILL                1542652       -10     3777321   -122423125     1542654                 2
                 17                               100                      STILL                1542652       -10     3777321   -122423125     1542654                 2
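As an alternative to a custom helper, json_normalize can reach the innermost 'activity' list directly if record_path names every level of nesting. The sketch below is an addition to the original answer and assumes pandas 1.0 or later, where the function is available as pd.json_normalize. The variable name activity_df and the choice of meta fields are illustrative. Each meta path lists the enclosing record_path levels followed by the field to carry along.

```python
import pandas as pd

# One record from the question
data = {'locations': [{'accuracy': 17,
                       'activity': [{'activity': [{'confidence': 100,
                                                   'type': 'STILL'}],
                                     'timestampMs': '1542652'}],
                       'altitude': -10,
                       'latitudeE7': 3777321,
                       'longitudeE7': -122423125,
                       'timestampMs': '1542654',
                       'verticalAccuracy': 2}]}

activity_df = pd.json_normalize(
    data,
    # Walk locations -> activity -> activity to reach the innermost records
    record_path=['locations', 'activity', 'activity'],
    # Fields to copy down from the enclosing levels onto every row
    meta=[['locations', 'timestampMs'],
          ['locations', 'accuracy'],
          ['locations', 'activity', 'timestampMs']],
)

print(activity_df.columns.tolist())
print(activity_df)
```

Unlike the flatten_json approach, which yields one row per location, this yields one row per innermost activity entry, with the meta columns named by their dotted path (for example 'locations.timestampMs'). Which shape is preferable depends on whether a location can carry several activities.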
          

