My data looks like this:

{u'"57e01311817bc367c030b390"': u'{"ad_since": 2016, "indoor_swimming_pool": "No", "seaside": "No", "handicapped_access": "Yes"}', u'"57e01311817bc367c030b3a8"': u'{"ad_since": 2012, "indoor_swimming_pool": "No", "seaside": "No", "handicapped_access": "Yes"}'}

I want to convert it to a pandas DataFrame. But when I try:
df = pd.DataFrame(response.items())

I get a DataFrame with two columns: the first holds the keys and the second holds each key's value:
                            0                       1
0  "57e01311817bc367c030b390"   {"ad_since": 2016, "indoor_swimming_pool": "No...
1  "57e01311817bc367c030b3a8"   {"ad_since": 2012, "indoor_swimming_pool": "No...

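How can I get a separate column for each key: "ad_since", "indoor_swimming_pool", "seaside", "handicapped_access"? I would also like to keep the first column, or use the id as the index.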

Best answer

You need to convert the column from type str to dict with .apply(literal_eval) or .apply(json.loads), and then use DataFrame.from_records:

import pandas as pd
from ast import literal_eval

response = {u'"57e01311817bc367c030b390"': u'{"ad_since": 2016, "indoor_swimming_pool": "No", "seaside": "No", "handicapped_access": "Yes"}',
           u'"57e01311817bc367c030b3a8"': u'{"ad_since": 2012, "indoor_swimming_pool": "No", "seaside": "No", "handicapped_access": "Yes"}'}

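# Build a one-column DataFrame: the ids become the index, the strings the values
df = pd.DataFrame.from_dict(response, orient='index')

# Each cell is still a string, not a dict
print(type(df.iloc[0, 0]))
<class 'str'>

# Parse each string into a dict
df.iloc[:, 0] = df.iloc[:, 0].apply(literal_eval)

# Expand the dicts into columns, keeping the ids as the index
print(pd.DataFrame.from_records(df.iloc[:, 0].values.tolist(), index=df.index))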
                            ad_since handicapped_access indoor_swimming_pool  \
"57e01311817bc367c030b3a8"      2012                Yes                   No
"57e01311817bc367c030b390"      2016                Yes                   No

                           seaside
"57e01311817bc367c030b3a8"      No
"57e01311817bc367c030b390"      No

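For the data in the question both parsers give the same result, since each string happens to be both valid JSON and a valid Python literal. As a rough illustration of where they can differ, the sketch below uses a made-up payload (the `rating` field and the `true`/`null` values are assumptions, not part of the original data): `json.loads` accepts it, while `literal_eval` rejects the JSON-only tokens.

```python
import json
from ast import literal_eval

# Hypothetical payload: valid JSON, but `true` and `null` are not Python literals
payload = '{"ad_since": 2016, "seaside": true, "rating": null}'

# json.loads maps JSON true/null to Python True/None
parsed = json.loads(payload)

# literal_eval only understands Python literal syntax, so it raises here
try:
    literal_eval(payload)
    literal_eval_ok = True
except (ValueError, SyntaxError):
    literal_eval_ok = False

print(parsed)
print(literal_eval_ok)
```

So if the strings come from a JSON API, `json.loads` is probably the safer default.
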
Alternatively, using json.loads:

import pandas as pd
import json

response = {u'"57e01311817bc367c030b390"': u'{"ad_since": 2016, "indoor_swimming_pool": "No", "seaside": "No", "handicapped_access": "Yes"}',
           u'"57e01311817bc367c030b3a8"': u'{"ad_since": 2012, "indoor_swimming_pool": "No", "seaside": "No", "handicapped_access": "Yes"}'}


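df = pd.DataFrame.from_dict(response, orient='index')
# Parse each JSON string into a dict
df.iloc[:, 0] = df.iloc[:, 0].apply(json.loads)

# Expand the dicts into columns, keeping the ids as the index
print(pd.DataFrame.from_records(df.iloc[:, 0].values.tolist(), index=df.index))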
                            ad_since handicapped_access indoor_swimming_pool  \
"57e01311817bc367c030b3a8"      2012                Yes                   No
"57e01311817bc367c030b390"      2016                Yes                   No

                           seaside
"57e01311817bc367c030b3a8"      No
"57e01311817bc367c030b390"      No

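If the intermediate one-column DataFrame is not needed, a more compact variant might parse the strings first and build the frame in one step. This is a minimal sketch, not part of the original answer; stripping the literal double quotes from the ids and naming the index `id` are assumptions about what is wanted.

```python
import json

import pandas as pd

response = {
    u'"57e01311817bc367c030b390"': u'{"ad_since": 2016, "indoor_swimming_pool": "No", "seaside": "No", "handicapped_access": "Yes"}',
    u'"57e01311817bc367c030b3a8"': u'{"ad_since": 2012, "indoor_swimming_pool": "No", "seaside": "No", "handicapped_access": "Yes"}',
}

# Parse every JSON string into a dict and drop the literal quotes around each id
# (removing the quotes is an assumption; skip .strip('"') to keep the ids as-is)
records = {key.strip('"'): json.loads(value) for key, value in response.items()}

# A dict of dicts with orient='index' gives one row per id, one column per inner key
df = pd.DataFrame.from_dict(records, orient='index')
df.index.name = 'id'  # hypothetical index name, chosen for illustration

print(df)
```

Calling `df.reset_index()` afterwards would turn the ids back into an ordinary first column instead of the index.
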