Problem description
I am fairly new to Python programming. I have a small requirement: I need to list all customers and their amounts for a given fortnight in JSON format.
Currently, I have a dataframe like this:
FortNight     Amount     Customer  Parameter
Apr-2FN-2018  339632.00  10992     CustomerSales
Apr-2FN-2018  27282.00   10994     CustomerSales
Apr-2FN-2018  26353.00   10995     CustomerSales
Apr-2FN-2018  24797.00   11000     CustomerSales
Apr-2FN-2018  21093.00   10990     CustomerSales
Expected JSON:
"CustomerSales" : [
    {"FortNight" : "Apr-2FN-2018",
     "Details" : [
         {"Customer": "10992", "Amount" : 339632.00},
         {"Customer": "10994", "Amount" : 27282.00},
         {"Customer": "10995", "Amount" : 26353.00},
         {"Customer": "11000", "Amount" : 24797.00},
         {"Customer": "10990", "Amount" : 21093.00}
     ]
    }
]
I tried:
dict(df.set_index('Parameter').groupby(level=0).apply(lambda x : ast.literal_eval(x.to_json(orient = 'records', date_format = "iso"))))
It returns this:
[{'CustomerSales':
[{'Customer': '10992', 'Amount': 339632.00, 'FortNight': 'Apr-2FN-2018'}, {'Customer': '10994', 'Amount': 27282.00, 'FortNight': 'Apr-2FN-2018'},{'Customer': '10995', 'Amount': 26353.00, 'FortNight': 'Apr-2FN-2018'},
{'Customer': '11000', 'Amount': 24797.00, 'FortNight': 'Apr-2FN-2018'},
{'Customer': '10990', 'Amount': 21093.00, 'FortNight': 'Apr-2FN-2018'}]}]
I tried other ways too, but in vain. Any help is welcome. Thanks in advance!
Recommended answer
Start by grouping on both the Parameter and FortNight columns, then use .to_dict() on the resulting grouped rows to produce the innermost dictionaries:
details = df.groupby(['Parameter', 'FortNight']).apply(
    lambda r: r[['Customer', 'Amount']].to_dict(orient='records'))
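To try this step on its own, here is a minimal sketch that rebuilds the sample rows from the question. It assumes the Customer column holds strings (as the expected JSON suggests). It also selects the two columns on the groupby object rather than inside the lambda, a small variation that should give the same result while avoiding a warning that recent pandas releases may raise about grouping columns.

```python
import pandas as pd

# Sample rows copied from the question; Customer is assumed to hold strings.
df = pd.DataFrame({
    "FortNight": ["Apr-2FN-2018"] * 5,
    "Amount": [339632.0, 27282.0, 26353.0, 24797.0, 21093.0],
    "Customer": ["10992", "10994", "10995", "11000", "10990"],
    "Parameter": ["CustomerSales"] * 5,
})

# One list of {'Customer': ..., 'Amount': ...} dictionaries per
# (Parameter, FortNight) group.
details = df.groupby(["Parameter", "FortNight"])[["Customer", "Amount"]].apply(
    lambda r: r.to_dict(orient="records")
)

print(list(details.index.names))
print(details.loc[("CustomerSales", "Apr-2FN-2018")][0])
```

Each value in the resulting series is a plain Python list, so it can be inspected or post-processed without pandas.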
This gives you a series with a multi-index over Parameter and FortNight, whose values are lists already in the correct format, each entry a dictionary with Customer and Amount keys. If you need to convert the value types, do so on the r[['Customer', 'Amount']] dataframe result before calling to_dict() on it.
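As an illustration of such a conversion, the sketch below assumes a hypothetical variant of the data in which Customer is stored as integers, and casts it to strings with astype() before building the dictionaries. The helper name to_records is made up for this example.

```python
import pandas as pd

# Hypothetical variant of the sample data: Customer stored as integers.
df = pd.DataFrame({
    "FortNight": ["Apr-2FN-2018"] * 2,
    "Amount": [339632.0, 27282.0],
    "Customer": [10992, 10994],
    "Parameter": ["CustomerSales"] * 2,
})

def to_records(rows):
    # Cast Customer to str first, then build one dictionary per row.
    return rows.astype({"Customer": str}).to_dict(orient="records")

details = df.groupby(["Parameter", "FortNight"])[["Customer", "Amount"]].apply(to_records)

first = details.iloc[0][0]
print(type(first["Customer"]).__name__, first["Customer"])
```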
You can then unstack the series into a dataframe, giving you a nested Parameter -> FortNight -> details structure: the Parameter values become columns, and each cell holds the list of Customer / Amount dictionaries for the FortNight in the index:
nested = details.unstack('Parameter')
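A small sketch of what the unstacked frame looks like, using two of the sample rows and assuming string Customer values; the column selection on the groupby object is a minor variation on the grouping step shown earlier.

```python
import pandas as pd

# Two of the sample rows from the question.
df = pd.DataFrame({
    "FortNight": ["Apr-2FN-2018"] * 2,
    "Amount": [339632.0, 27282.0],
    "Customer": ["10992", "10994"],
    "Parameter": ["CustomerSales"] * 2,
})

details = df.groupby(["Parameter", "FortNight"])[["Customer", "Amount"]].apply(
    lambda r: r.to_dict(orient="records")
)

# Move the Parameter index level into the columns.
nested = details.unstack("Parameter")

print(nested.index.name, list(nested.columns))
cell = nested.loc["Apr-2FN-2018", "CustomerSales"]
print(len(cell))
```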
If you turn this into a dictionary, you get a result that is already mostly correct:
>>> from pprint import pprint
>>> pprint(nested.to_dict())
{'CustomerSales': {'Apr-2FN-2018': [{'Amount': 339632.0, 'Customer': '10992'},
                                    {'Amount': 27282.0, 'Customer': '10994'},
                                    {'Amount': 26353.0, 'Customer': '10995'},
                                    {'Amount': 24797.0, 'Customer': '11000'},
                                    {'Amount': 21093.0, 'Customer': '10990'}]}}
For your format, however, you need to convert the values in each column to a list of {'FortNight': indexvalue, 'Details': value} mappings, and then convert the whole structure to a dictionary:
output = nested.apply(lambda s: [
    {s.index.name: idx, 'Details': value}
    for idx, value in s.items()
]).to_dict('records')
This gives you your final output:
>>> pprint(output)
[{'CustomerSales': {'Details': [{'Amount': 339632.0, 'Customer': '10992'},
                                {'Amount': 27282.0, 'Customer': '10994'},
                                {'Amount': 26353.0, 'Customer': '10995'},
                                {'Amount': 24797.0, 'Customer': '11000'},
                                {'Amount': 21093.0, 'Customer': '10990'}],
                    'FortNight': 'Apr-2FN-2018'}}]
If you need a JSON document, use .to_json(orient='records') rather than .to_dict('records').
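The output shown above holds a single mapping per parameter, whereas the JSON in the question wraps those mappings in a list. If that exact nesting matters, one alternative is to build the structure with a plain loop over the groups and serialize it with the standard json module. This is only a sketch of a different technique, not the approach used in this answer; the function name to_nested is made up, and Customer is assumed to hold strings.

```python
import json
import pandas as pd

# Sample rows copied from the question; Customer is assumed to hold strings.
df = pd.DataFrame({
    "FortNight": ["Apr-2FN-2018"] * 5,
    "Amount": [339632.0, 27282.0, 26353.0, 24797.0, 21093.0],
    "Customer": ["10992", "10994", "10995", "11000", "10990"],
    "Parameter": ["CustomerSales"] * 5,
})

def to_nested(frame):
    """Build {Parameter: [{'FortNight': ..., 'Details': [...]}, ...]}."""
    result = {}
    # sort=False keeps the groups in order of first appearance.
    groups = frame.groupby(["Parameter", "FortNight"], sort=False)
    for (param, fortnight), rows in groups:
        records = rows[["Customer", "Amount"]].to_dict(orient="records")
        result.setdefault(param, []).append(
            {"FortNight": fortnight, "Details": records}
        )
    return result

nested_json = json.dumps(to_nested(df), indent=2)
print(nested_json)
```

With several fortnights in the data, each parameter's list would gain one {"FortNight": ..., "Details": [...]} entry per fortnight.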
Put together as one expression:
df.groupby(['Parameter', 'FortNight']).apply(
    lambda r: r[['Customer', 'Amount']].to_dict(orient='records')
).unstack('Parameter').apply(lambda s: [
    {s.index.name: idx, 'Details': value}
    for idx, value in s.items()]
).to_json(orient='records')