How to convert a nested JSON structure with varying lists (as dictionary values) into a DataFrame


Question

I converted a JSON document into a DataFrame and ended up with a column 'Structure_value' whose values are lists of one or more dictionaries:

                   Structure_value
[{'Room': [6], 'Length': 7}, {'Room': [6], 'Length': 7}]
[{'Room': [6], 'Length': 22}]
[{'Room': [6,6], 'Length': 8}]

I need to split it into the following four columns:

Structure_value_room_1, Structure_value_length_1, Structure_value_room_2, Structure_value_length_2

The output should look like this:

   Structure_value_room_1  Structure_value_length_1  Structure_value_room_2  \
0                       6                         7                     6.0
1                       6                        22                     NaN
2                       6                         8                     6.0

   Structure_value_length_2
0                       7.0
1                       NaN
2                       8.0

How can I handle cases where a single attribute holds multiple values in one list, so that those values have to be split into additional columns?

P.S.: I can handle data shaped like [{'Room': [6], 'Length': 7}, {'Room': [6], 'Length': 7}], but not a case like [{'Room': [6,6], 'Length': 8}].

Recommended answer

I could not process your Structure_value listing as a single JSON file, and I don't know whether the rows represent separate files. I therefore used [{'Room': [6], 'Length': 7}, {'Room': [6], 'Length': 7}] as file1, [{'Room': [6], 'Length': 22}] as file2 and [{'Room': [6,6], 'Length': 8}] as file3.
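To reproduce this setup, the three sample structures can be written to disk first. This is only a sketch: the helper write_samples is hypothetical, and the file names house1.json, house2.json and house3.json are assumed to correspond to file1, file2 and file3 because those are the names passed to treat_json() in this answer.

```python
import json
import os

# Sample structures from the question. The file names are an assumption:
# they mirror the treat_json() calls shown in this answer.
samples = {
    'house1.json': [{'Room': [6], 'Length': 7}, {'Room': [6], 'Length': 7}],
    'house2.json': [{'Room': [6], 'Length': 22}],
    'house3.json': [{'Room': [6, 6], 'Length': 8}],
}

def write_samples(samples, directory='.'):
    """Write each sample structure to its own JSON file; return the paths."""
    paths = []
    for filename, structures in samples.items():
        path = os.path.join(directory, filename)
        with open(path, 'w') as f:
            json.dump(structures, f)
        paths.append(path)
    return paths
```

Calling write_samples(samples) creates the three files in the current directory.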

import json
import pandas as pd

COLUMNS = ['Structure_value_room_1', 'Structure_value_length_1',
           'Structure_value_room_2', 'Structure_value_length_2']

#treat the irregular structures: one (room, length) pair per room
def process_structure(s):

    rooms = s['Room']
    if not isinstance(rooms, list):  # a single room given as a bare value
        rooms = [rooms]

    specs = []

    for room in rooms:
        specs.append(room)
        specs.append(s['Length'])  # repeat Length for every room

    return specs

#open and treat jsons
def treat_json(file):

    with open(file, 'r') as f:
        structures = json.load(f)

    load_df = []

    for structure in structures:
        load_df.extend(process_structure(structure))

    while len(load_df) < len(COLUMNS):  # pad if the row is not complete
        load_df.append(None)

    df_temp = pd.DataFrame([load_df], columns=COLUMNS)

    return df_temp

The results:

treat_json('house3.json')
    Structure_value_room_1  ...  Structure_value_length_2
0                       6  ...                         8

[1 rows x 4 columns]

treat_json('house2.json')
    Structure_value_room_1  ...  Structure_value_length_2
0                       6  ...                      None

[1 rows x 4 columns]

treat_json('house1.json')

    Structure_value_room_1  ...  Structure_value_length_2
0                       6  ...                         7

[1 rows x 4 columns]
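If the data already sits in a DataFrame column, as in the question, the round trip through separate files can be skipped. The following is a minimal sketch and not part of the original answer: the helper names flatten_structures and expand_structure_column are hypothetical, and it assumes every dictionary has a 'Room' list and a scalar 'Length', with Length repeated once per room.

```python
import pandas as pd

def flatten_structures(structures):
    """Flatten a list of {'Room': [...], 'Length': n} dicts into
    [room, length, room, length, ...], repeating Length for each room."""
    flat = []
    for structure in structures:
        for room in structure['Room']:
            flat.extend([room, structure['Length']])
    return flat

def expand_structure_column(df, column='Structure_value'):
    """Expand a list-of-dicts column into numbered room/length columns."""
    rows = [flatten_structures(value) for value in df[column]]
    width = max(len(row) for row in rows)
    names = []
    for n in range(1, width // 2 + 1):
        names.append(f'{column}_room_{n}')
        names.append(f'{column}_length_{n}')
    # Shorter rows are padded with None so they appear as missing values
    padded = [row + [None] * (width - len(row)) for row in rows]
    return pd.DataFrame(padded, columns=names, index=df.index)

# Sample data copied from the question
df = pd.DataFrame({'Structure_value': [
    [{'Room': [6], 'Length': 7}, {'Room': [6], 'Length': 7}],
    [{'Room': [6], 'Length': 22}],
    [{'Room': [6, 6], 'Length': 8}],
]})
result = expand_structure_column(df)
print(result)
```

The number of column pairs is derived from the longest row, so a structure with three rooms would add Structure_value_room_3 and Structure_value_length_3 instead of raising an error.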
