Question
How can assign be used to return a copy of the original DataFrame with multiple new columns added?
Desired result
>>> import pandas as pd
>>> df = pd.DataFrame({'A': range(1, 5), 'B': range(11, 15)})
>>> df.assign({'C': df.A.apply(lambda x: x ** 2), 'D': df.B * 2})
   A   B   C   D
0  1  11   1  22
1  2  12   4  24
2  3  13   9  26
3  4  14  16  28
Attempts
The example above results in:
ValueError: Wrong number of items passed 2, placement implies 1
Background
The assign function in pandas returns a copy of the DataFrame with the newly assigned column added, e.g.
>>> df = df.assign(C=df.B * 2)
>>> df
   A   B   C
0  1  11  22
1  2  12  24
2  3  13  26
3  4  14  28
The 0.19.2 documentation for this function implies that more than one column can be added to the DataFrame.
In addition, it says that keywords are the column names.
The source code for the function states that it accepts a dictionary:
def assign(self, **kwargs):
    """
    .. versionadded:: 0.16.0
    Parameters
    ----------
    kwargs : keyword, value pairs
        keywords are the column names. If the values are callable, they are computed
        on the DataFrame and assigned to the new columns. If the values are not callable,
        (e.g. a Series, scalar, or array), they are simply assigned.
    Notes
    -----
    Since ``kwargs`` is a dictionary, the order of your
    arguments may not be preserved. The make things predicatable,
    the columns are inserted in alphabetical order, at the end of
    your DataFrame. Assigning multiple columns within the same
    ``assign`` is possible, but you cannot reference other columns
    created within the same ``assign`` call.
    """
    data = self.copy()
    # do all calculations first...
    results = {}
    for k, v in kwargs.items():
        if callable(v):
            results[k] = v(data)
        else:
            results[k] = v
    # ... and then assign
    for k, v in sorted(results.items()):
        data[k] = v
    return data
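To illustrate what the `**kwargs` signature in the quoted source means for callers, here is a minimal plain-Python sketch. `assign_like` is a hypothetical stand-in that works on a dict of lists rather than a DataFrame; it is not pandas code, it only mirrors the two-phase logic shown above. In plain Python, a function declared this way has no positional slot for a dictionary, so the dictionary has to be unpacked with `**`.

```python
def assign_like(data, **kwargs):
    """Hypothetical stand-in for the quoted method, using a plain dict of lists."""
    result = dict(data)  # shallow copy, so the caller's dict is not modified
    # compute every new column first ...
    computed = {}
    for name, value in kwargs.items():
        computed[name] = value(result) if callable(value) else value
    # ... then add them in alphabetical order, as the quoted source does
    for name, value in sorted(computed.items()):
        result[name] = value
    return result


table = {'A': [1, 2, 3, 4], 'B': [11, 12, 13, 14]}
new_columns = {
    'C': lambda t: [a ** 2 for a in t['A']],  # callable: computed from the table
    'D': [b * 2 for b in table['B']],         # non-callable: assigned as-is
}

# Passing the dict positionally fails: the signature accepts keywords only.
try:
    assign_like(table, new_columns)
    positional_error = None
except TypeError as exc:
    positional_error = exc

# Unpacking with ** turns each key into a keyword argument.
extended = assign_like(table, **new_columns)
```

The exact exception raised by pandas itself may differ between versions; the sketch only shows the plain-Python behaviour of a keyword-only signature.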
Recommended answer
You can create multiple columns by supplying each new column as a keyword argument:
df = df.assign(C=df['A']**2, D=df.B*2)
I got your example dictionary to work by unpacking it into keyword arguments using **:
df = df.assign(**{'C': df.A.apply(lambda x: x ** 2), 'D': df.B * 2})
It seems like assign should be able to take a dictionary, but based on the source code you posted, that does not appear to be supported currently.
Resulting output:
   A   B   C   D
0  1  11   1  22
1  2  12   4  24
2  3  13   9  26
3  4  14  16  28
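As a variation on the above, the quoted docstring says that callable values are computed on the DataFrame, so the dictionary can hold functions instead of precomputed Series. The sketch below assumes pandas is installed; the names `new_columns` and `result` are illustrative only.

```python
import pandas as pd

df = pd.DataFrame({'A': range(1, 5), 'B': range(11, 15)})

# Each value is a callable that receives the DataFrame and returns the new column.
new_columns = {
    'C': lambda d: d['A'] ** 2,
    'D': lambda d: d['B'] * 2,
}

# ** unpacks the dict into keyword arguments; df itself is left unchanged.
result = df.assign(**new_columns)
print(result)
```

Building the dictionary separately like this can be handy when the set of new columns is only known at run time.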