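Problem description
I need to format the contents of a JSON file in a certain layout in a pandas DataFrame so that I can run pandasql to transform the data and run it through a scoring model.
file = C:\scoring_model\json.js (the contents of the file are below)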
{
  "response": {
    "version": "1.1",
    "token": "dsfgf",
    "body": {
      "customer": {
        "customer_id": "1234567",
        "verified": "true"
      },
      "contact": {
        "email": "mr@abc.com",
        "mobile_number": "0123456789"
      },
      "personal": {
        "gender": "m",
        "title": "Dr.",
        "last_name": "Muster",
        "first_name": "Max",
        "family_status": "single",
        "dob": "1985-12-23"
      }
    }
  }
}
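I need the DataFrame to look like this (all values belong on the same row; the table is split in two only to fit in this question):
version | token | customer_id | verified | email      | mobile_number | gender |
1.1     | dsfgf | 1234567     | true     | mr@abc.com | 0123456789    | m      |
title | last_name | first_name | family_status | dob
Dr.   | Muster    | Max        | single        | 23.12.1985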
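I have looked at all the other questions on this topic and have tried various ways to load the JSON file into pandas:
`with open(r'C:\scoring_model\json.js', 'r') as f:`
`    c = pd.read_json(f.read())`
`with open(r'C:\scoring_model\json.js', 'r') as f:`
`    c = f.readlines()`
I also tried pd.Panel() as in this solution: Python Pandas: How to split a sorted dictionary in a column of a dataframe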
With the DataFrame results from [yo = f.readlines()], I thought about splitting the contents of each cell on ("") and finding a way to put the pieces into different columns, but no luck so far. Your expertise is greatly appreciated. Thank you in advance.

If you load the entire JSON as a dict (or list), e.g. using json.load, you can use json_normalize:
In [11]: d = {"response": {"body": {"contact": {"email": "mr@abc.com", "mobile_number": "0123456789"}, "personal": {"last_name": "Muster", "gender": "m", "first_name": "Max", "dob": "1985-12-23", "family_status": "single", "title": "Dr."}, "customer": {"verified": "true", "customer_id": "1234567"}}, "token": "dsfgf", "version": "1.1"}}
In [12]: df = pd.io.json.json_normalize(d)
In [13]: df.columns = df.columns.map(lambda x: x.split(".")[-1])
In [14]: df
Out[14]:
        email mobile_number customer_id verified         dob family_status first_name gender last_name title  token version
0  mr@abc.com    0123456789     1234567     true  1985-12-23        single        Max      m    Muster   Dr.  dsfgf     1.1
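For completeness, here is a minimal end-to-end sketch of the same idea that also produces the column order and the dd.mm.yyyy date format asked for in the question. It rests on a few assumptions that are not in the original answer: the file is valid JSON (no trailing commas, balanced braces), pandas is version 1.0 or later (where `pd.json_normalize` is the top-level name for the function used above), and the `raw` string and `wanted` list are illustrative stand-ins for the real file contents and the desired layout.

```python
import io
import json

import pandas as pd

# Stand-in for the contents of C:\scoring_model\json.js (assumed to be valid JSON).
raw = """
{
  "response": {
    "version": "1.1",
    "token": "dsfgf",
    "body": {
      "customer": {"customer_id": "1234567", "verified": "true"},
      "contact": {"email": "mr@abc.com", "mobile_number": "0123456789"},
      "personal": {
        "gender": "m", "title": "Dr.", "last_name": "Muster",
        "first_name": "Max", "family_status": "single", "dob": "1985-12-23"
      }
    }
  }
}
"""

# With the real file, open it and pass the file object to json.load instead.
d = json.load(io.StringIO(raw))

# Flatten the nested dict into one row; column names become dotted paths
# such as "response.body.contact.email".
df = pd.json_normalize(d)

# Keep only the last path component as the column name.
df.columns = [c.split(".")[-1] for c in df.columns]

# Hypothetical target layout, copied from the table in the question.
wanted = [
    "version", "token", "customer_id", "verified", "email", "mobile_number",
    "gender", "title", "last_name", "first_name", "family_status", "dob",
]
df = df[wanted]

# Reformat the date of birth from ISO (yyyy-mm-dd) to dd.mm.yyyy.
df["dob"] = pd.to_datetime(df["dob"]).dt.strftime("%d.%m.%Y")

print(df.to_string(index=False))
```

Taking only the last path component works here because no leaf key is repeated across the nested objects; if two branches shared a key name, the dotted names would need to be kept or renamed by hand to avoid duplicate columns.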