Expanding a pandas DataFrame column of dicts into DataFrame columns

Problem description

I have a Pandas DataFrame where one column is a Series of dicts, like this:

   colA  colB                                  colC
0     7     7  {'foo': 185, 'bar': 182, 'baz': 148}
1     2     8  {'foo': 117, 'bar': 103, 'baz': 155}
2     5    10  {'foo': 165, 'bar': 184, 'baz': 170}
3     3     2  {'foo': 121, 'bar': 151, 'baz': 187}
4     5     5  {'foo': 137, 'bar': 199, 'baz': 108}

I want the foo, bar and baz key-value pairs from the dicts to be columns in my dataframe, such that I end up with this:

   colA  colB  foo  bar  baz
0     7     7  185  182  148
1     2     8  117  103  155
2     5    10  165  184  170
3     3     2  121  151  187
4     5     5  137  199  108

How can I do this?

Recommended answer

TL;DR

df = df.drop('colC', axis=1).join(pd.DataFrame(df.colC.values.tolist()))

Elaborate answer

We start by importing pandas and defining the DataFrame to work with:

import pandas as pd


df = pd.DataFrame({'colA': {0: 7, 1: 2, 2: 5, 3: 3, 4: 5},
                   'colB': {0: 7, 1: 8, 2: 10, 3: 2, 4: 5},
                   'colC': {0: {'foo': 185, 'bar': 182, 'baz': 148},
                            1: {'foo': 117, 'bar': 103, 'baz': 155},
                            2: {'foo': 165, 'bar': 184, 'baz': 170},
                            3: {'foo': 121, 'bar': 151, 'baz': 187},
                            4: {'foo': 137, 'bar': 199, 'baz': 108}}})

The column colC is a pd.Series of dicts, and we can turn it into a pd.DataFrame by passing the list of dicts to the pd.DataFrame constructor (or, more slowly, by turning each dict into a pd.Series):

pd.DataFrame(df.colC.values.tolist())
# df.colC.apply(pd.Series)  # this also works, but it is slow

This gives the following pd.DataFrame:

   foo  bar  baz
0  185  182  148
1  117  103  155
2  165  184  170
3  121  151  187
4  137  199  108
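
As a quick check, the sketch below compares the list-based constructor with df.colC.apply(pd.Series) on a small Series. It also tries pd.json_normalize, which the answer does not mention and which is included here only as a further option. The variable names are illustrative, and the sketch assumes every dict has the same keys.

```python
import pandas as pd

# Small Series of dicts, mirroring the first two rows of colC above.
col_c = pd.Series([
    {'foo': 185, 'bar': 182, 'baz': 148},
    {'foo': 117, 'bar': 103, 'baz': 155},
])

# Fast path: build the frame directly from a list of dicts.
from_list = pd.DataFrame(col_c.values.tolist())

# Slower path: convert each dict to a pd.Series, one row at a time.
from_apply = col_c.apply(pd.Series)

# Alternative (an addition here, not part of the original answer).
from_normalize = pd.json_normalize(col_c.tolist())

# Report whether each alternative produced an identical frame.
print(from_list.equals(from_apply))
print(from_list.equals(from_normalize))
```

All three build one column per dict key; the difference that matters in practice is speed, since apply(pd.Series) constructs a separate Series for every row.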

So all we need to do is:

  1. Turn colC into a pd.DataFrame
  2. Delete the original colC from df
  3. Join the converted colC with df
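
The three steps can be sketched as separate statements, which makes the intermediate results easy to inspect. The names expanded, remaining and result are illustrative, and the frame is shortened to two rows.

```python
import pandas as pd

df = pd.DataFrame({
    'colA': [7, 2],
    'colB': [7, 8],
    'colC': [{'foo': 185, 'bar': 182, 'baz': 148},
             {'foo': 117, 'bar': 103, 'baz': 155}],
})

# 1. Turn colC into its own DataFrame (one column per dict key).
expanded = pd.DataFrame(df['colC'].values.tolist())

# 2. Drop the original colC; drop returns a new frame and leaves df unchanged.
remaining = df.drop('colC', axis=1)

# 3. Join on the index, which is 0..n-1 in both frames here.
result = remaining.join(expanded)
print(result)
```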

This can be done in a one-liner:

df = df.drop('colC', axis=1).join(pd.DataFrame(df.colC.values.tolist()))

The content of df is now the following pd.DataFrame:

   colA  colB  foo  bar  baz
0     7     7  185  182  148
1     2     8  117  103  155
2     5    10  165  184  170
3     3     2  121  151  187
4     5     5  137  199  108
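
One caveat: join aligns rows on the index, and a DataFrame built from a plain list gets a fresh 0..n-1 index. The one-liner therefore relies on df having that default index. The sketch below wraps the same steps in a hypothetical helper, expand_dict_column (not a pandas function), that passes the original index through so the approach also works for other indexes.

```python
import pandas as pd


def expand_dict_column(frame, column):
    """Hypothetical helper: replace a column of dicts with one column per key."""
    # Reuse the original index so that join lines the rows up correctly.
    expanded = pd.DataFrame(frame[column].tolist(), index=frame.index)
    return frame.drop(column, axis=1).join(expanded)


# Example with a non-default (string) index.
df = pd.DataFrame(
    {'colA': [7, 2],
     'colC': [{'foo': 185, 'bar': 182}, {'foo': 117, 'bar': 103}]},
    index=['x', 'y'],
)
result = expand_dict_column(df, 'colC')
print(result)
```

Without index=frame.index, the expanded frame would be indexed 0 and 1, none of its labels would match 'x' or 'y', and the joined columns would come out as NaN.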
