{'endDate': '2017-12-31',
 'results': [{'data': [{'period': '2017-01-01', 'ratio': 26.91301},
    {'period': '2017-02-01', 'ratio': 19.77063},
    {'period': '2017-03-01', 'ratio': 20.40775},
    {'period': '2017-04-01', 'ratio': 16.02843},
    {'period': '2017-05-01', 'ratio': 10.38159},
    {'period': '2017-06-01', 'ratio': 8.2087},
    {'period': '2017-07-01', 'ratio': 8.67815},
    {'period': '2017-08-01', 'ratio': 19.58956},
    {'period': '2017-09-01', 'ratio': 36.94587},
    {'period': '2017-10-01', 'ratio': 36.28194},
    {'period': '2017-11-01', 'ratio': 16.64543},
    {'period': '2017-12-01', 'ratio': 1.67661}],
   'keywords': ['Data_spec'],
   'title': 'Data_spec'},
  {'data': [{'period': '2017-01-01', 'ratio': 17.65139},
    {'period': '2017-02-01', 'ratio': 14.52618},
    {'period': '2017-03-01', 'ratio': 15.00234},
    {'period': '2017-04-01', 'ratio': 12.1521},
    {'period': '2017-05-01', 'ratio': 10.63644},
    {'period': '2017-06-01', 'ratio': 8.59767},
    {'period': '2017-07-01', 'ratio': 8.95312},
    {'period': '2017-08-01', 'ratio': 13.05747},
    {'period': '2017-09-01', 'ratio': 48.00482},
    {'period': '2017-10-01', 'ratio': 23.7811},
    {'period': '2017-11-01', 'ratio': 16.90027},
    {'period': '2017-12-01', 'ratio': 0.89866}],
   'keywords': ['Data_rate'],
   'title': 'Date_rate'},
  {'data': [], 'keywords': ['Data_over'], 'title': 'Data_over'},
  {'data': [{'period': '2017-01-01', 'ratio': 79.17644},
    {'period': '2017-02-01', 'ratio': 84.01851},
    {'period': '2017-03-01', 'ratio': 100.0},
    {'period': '2017-04-01', 'ratio': 91.19442},
    {'period': '2017-05-01', 'ratio': 93.21976},
    {'period': '2017-06-01', 'ratio': 93.42096},
    {'period': '2017-07-01', 'ratio': 89.14895},
    {'period': '2017-08-01', 'ratio': 91.85165},
    {'period': '2017-09-01', 'ratio': 91.24136},
    {'period': '2017-10-01', 'ratio': 90.35611},
    {'period': '2017-11-01', 'ratio': 81.88585},
    {'period': '2017-12-01', 'ratio': 7.49111}],
   'keywords': ['Data_under'],
   'title': 'Data_under'},
  {'data': [{'period': '2017-01-01', 'ratio': 0.70417},
    {'period': '2017-02-01', 'ratio': 1.11997},
    {'period': '2017-03-01', 'ratio': 1.81074},
    {'period': '2017-04-01', 'ratio': 1.38823},
    {'period': '2017-05-01', 'ratio': 0.97914},
    {'period': '2017-06-01', 'ratio': 1.14009},
    {'period': '2017-07-01', 'ratio': 0.78465},
    {'period': '2017-08-01', 'ratio': 1.07973},
    {'period': '2017-09-01', 'ratio': 0.94561},
    {'period': '2017-10-01', 'ratio': 0.85172},
    {'period': '2017-11-01', 'ratio': 1.27422},
    {'period': '2017-12-01', 'ratio': 0.08718}],
   'keywords': ['Data_tune'],
   'title': 'Data_tune'}],
 'startDate': '2017-01-01',
 'timeUnit': 'month'}


上面是'my_dict = json.loads(js)',其中js是字符串。我正在尝试将这些数据放入Pandas DataFrame。我在下面使用了代码。

lst = [pd.DataFrame.from_dict(r['data']).set_index('period').rename(columns={'ratio' : r['title']})
           for r in d['results']]
df = pd.concat(lst, 1)


在此js集具有空值之前,我的代码运行良好。
一个问题是您已经注意到'keyworks':['Data_over']具有空的'data'。所以我不能将index设置为'period'。我仍然希望在熊猫DF中使用“ Data_over”,但为空。是否可以将“ Data_over”作为列名设置为DataFrame,但将其值设置为空?所以我的最终代码可以将json转换为带有或不带有“数据”的df。

以下是我想要的输出。 (实际价值有所不同,但您有这个概念)

    Data_spec    Data_rate   Data_over   Data_under   Data_tune
2017-01-01  0.55116     NaN     NaN         7.12056     2.25329
2017-02-01  0.32016     0.08915     NaN         6.43161     1.19959
2017-03-01  0.32421     0.10131     NaN         6.48024     1.30091
2017-04-01  0.33232     0.01215     NaN         6.05471     1.26038
2017-05-01  0.39311     0.12968     NaN         6.19655     1.21985
2017-06-01  0.47011     0.03647     NaN         5.71023     1.03748
2017-07-01  4.32016     NaN     NaN         11.85005    0.84295
2017-08-01  8.81053     0.04052     NaN         51.44072    0.89564
2017-09-01  14.46808    0.02836     NaN         100.00000   0.85511
2017-10-01  4.27152     0.10942     NaN         34.65451    0.87132
2017-11-01  0.29989     0.05673     NaN         13.02127    0.77811
2017-12-01  0.00810     0.06079     NaN         0.80243     NaN

最佳答案

可以通过length检查len(r['data']) > 0并过滤出list comprehension中的空DataFrame:

lst = [pd.DataFrame(r['data']).set_index('period').rename(columns={'ratio' : r['title']})
           for r in d['results'] if len(r['data']) > 0]
df = pd.concat(lst, 1)
print (df)

            Data_spec  Date_rate  Data_under  Data_tune
period
2017-01-01   26.91301   17.65139    79.17644    0.70417
2017-02-01   19.77063   14.52618    84.01851    1.11997
2017-03-01   20.40775   15.00234   100.00000    1.81074
2017-04-01   16.02843   12.15210    91.19442    1.38823
2017-05-01   10.38159   10.63644    93.21976    0.97914
2017-06-01    8.20870    8.59767    93.42096    1.14009
2017-07-01    8.67815    8.95312    89.14895    0.78465
2017-08-01   19.58956   13.05747    91.85165    1.07973
2017-09-01   36.94587   48.00482    91.24136    0.94561
2017-10-01   36.28194   23.78110    90.35611    0.85172
2017-11-01   16.64543   16.90027    81.88585    1.27422
2017-12-01    1.67661    0.89866     7.49111    0.08718


编辑:

如果r['data']为空,则可能会创建自定义DataFrame,因为使用了d['startDate']对齐索引:

lst = [pd.DataFrame(r['data']).set_index('period').rename(columns={'ratio' : r['title']})
        if len(r['data']) > 0
        else pd.DataFrame([np.nan], columns=[r['title']], index=[d['startDate']])
        for r in d['results'] ]
df = pd.concat(lst, 1)
print (df)
            Data_spec  Date_rate  Data_over  Data_under  Data_tune
2017-01-01   26.91301   17.65139        NaN    79.17644    0.70417
2017-02-01   19.77063   14.52618        NaN    84.01851    1.11997
2017-03-01   20.40775   15.00234        NaN   100.00000    1.81074
2017-04-01   16.02843   12.15210        NaN    91.19442    1.38823
2017-05-01   10.38159   10.63644        NaN    93.21976    0.97914
2017-06-01    8.20870    8.59767        NaN    93.42096    1.14009
2017-07-01    8.67815    8.95312        NaN    89.14895    0.78465
2017-08-01   19.58956   13.05747        NaN    91.85165    1.07973
2017-09-01   36.94587   48.00482        NaN    91.24136    0.94561
2017-10-01   36.28194   23.78110        NaN    90.35611    0.85172
2017-11-01   16.64543   16.90027        NaN    81.88585    1.27422
2017-12-01    1.67661    0.89866        NaN     7.49111    0.08718

关于python - json到DataFrame的转换问题,我们在Stack Overflow上找到一个类似的问题:https://stackoverflow.com/questions/47628924/

10-14 18:17
查看更多