I'm not sure how to describe this, so here is an example.
This is the original DataFrame:
import pandas as pd

original = pd.DataFrame({'a': [{1: 'hi', 2: 'there'}], 'b': [1]})
                       a  b
0  {1: 'hi', 2: 'there'}  1
This is the expected output, with the columns relabeled:
expected = pd.DataFrame({'numbers':[1,2],'text':['hi','there'],'b':[1,1]})
   b  numbers   text
0  1        1     hi
1  1        2  there
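Edit:
I tried to simplify the problem, and the solution worked on the simplified example but not when applied to my actual data. To avoid anything getting lost in translation, here is the data I'm using: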
record_1 = {
    '1': {
        'url': 'https://www.politico.com/magazine',
        'title': 'Worst case '},
    '2': {
        'url': 'https://www.nbcnews.com/pol',
        'title': 'Bad Night '},
    '3': {
        'url': 'https://www.usatoday.com/stor',
        'title': "On the anniversary"}}
record_2 = {
    '1': {
        'url': 'https://www.nytimes.com/maga',
        'title': 'Bad Things Happ '},
    '2': {
        'url': 'https://www.cnn.com/pols',
        'title': 'Best Night '}}
original = pd.DataFrame([[1,record_1],[2,record_2]],columns=['position','news_results'])
   position                                       news_results
0         1  {'1': {'title': 'Worst case ', 'url': 'https:/...
1         2  {'1': {'title': 'Bad Things Happ ', 'url': 'ht...
This is the expected result:
data = [[1,1,'https://www.politico.com/magazine','Worst case '],
[1,2,'https://www.nbcnews.com/pol','Bad Night ',],
[1,3,'https://www.usatoday.com/stor',"On the anniversary"],
[2,1,'https://www.nytimes.com/maga','Bad Things Happ '],
[2,2,'https://www.cnn.com/pols','Best Night ']]
expected = pd.DataFrame(data,columns=['position','sub_rank','url','title'])
   position  sub_rank                                url               title
0         1         1  https://www.politico.com/magazine          Worst case
1         1         2        https://www.nbcnews.com/pol           Bad Night
2         1         3      https://www.usatoday.com/stor  On the anniversary
3         2         1       https://www.nytimes.com/maga     Bad Things Happ
4         2         2           https://www.cnn.com/pols          Best Night
Best answer
Here is one way. ... I still think you could use a better construction to reach the expected output.
original.set_index('b').a.apply(pd.Series).stack().\
reset_index(name='text').rename(columns={'level_1':'numbers'})
Out[1623]:
   b  numbers   text
0  1        1     hi
1  1        2  there
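As a possible alternative for the simplified case (a sketch only, not part of the answer above), the rows can be built with a plain comprehension and handed to the DataFrame constructor. The names `rows` and `flat` are made up for illustration.

```python
import pandas as pd

original = pd.DataFrame({'a': [{1: 'hi', 2: 'there'}], 'b': [1]})

# One output row per (key, value) pair in each dict,
# repeating the value of 'b' for every pair from the same input row.
rows = [
    {'b': b, 'numbers': key, 'text': value}
    for mapping, b in zip(original['a'], original['b'])
    for key, value in mapping.items()
]

# 'flat' is a hypothetical name; the column order is set explicitly.
flat = pd.DataFrame(rows, columns=['b', 'numbers', 'text'])
print(flat)
```

This avoids the intermediate wide frame that `apply(pd.Series)` creates, which may matter when the dicts have many distinct keys.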
Edit
original.set_index('position')['news_results'].apply(pd.Series).stack().apply(pd.Series).reset_index()
Out[1633]:
   position level_1               title                                url
0         1       1          Worst case  https://www.politico.com/magazine
1         1       2           Bad Night        https://www.nbcnews.com/pol
2         1       3  On the anniversary      https://www.usatoday.com/stor
3         2       1     Bad Things Happ       https://www.nytimes.com/maga
4         2       2          Best Night           https://www.cnn.com/pols
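The output above still has `level_1` where the question asked for `sub_rank`, and the columns come out in a different order. One way to get the exact expected layout, sketched here as an alternative rather than as the answerer's method, is to flatten the nested dicts in a comprehension. The names `rows` and `flat` are hypothetical, and the `int(rank)` conversion is an assumption based on the integer `sub_rank` values in the expected frame.

```python
import pandas as pd

record_1 = {
    '1': {'url': 'https://www.politico.com/magazine', 'title': 'Worst case '},
    '2': {'url': 'https://www.nbcnews.com/pol', 'title': 'Bad Night '},
    '3': {'url': 'https://www.usatoday.com/stor', 'title': 'On the anniversary'},
}
record_2 = {
    '1': {'url': 'https://www.nytimes.com/maga', 'title': 'Bad Things Happ '},
    '2': {'url': 'https://www.cnn.com/pols', 'title': 'Best Night '},
}
original = pd.DataFrame([[1, record_1], [2, record_2]],
                        columns=['position', 'news_results'])

# One row per inner dict: keep the outer 'position', turn the outer key
# into 'sub_rank' (assumed to be an integer), and unpack url/title.
rows = [
    {'position': position, 'sub_rank': int(rank), **fields}
    for position, results in zip(original['position'], original['news_results'])
    for rank, fields in results.items()
]

# Column order chosen to match the expected frame in the question.
flat = pd.DataFrame(rows, columns=['position', 'sub_rank', 'url', 'title'])
print(flat)
```

If the inner dicts do not all share the same keys, the missing entries would show up as NaN in the resulting frame.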
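Regarding "python - expanding a dictionary into a DataFrame and adding it back to the original DataFrame with new columns and replicated original data", a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/47187962/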