Problem description
I am trying to create a weighted column in a pandas.DataFrame.
I have a Python dictionary whose keys are the pandas.DataFrame column names and whose values are the corresponding weights.
I would like to create a new column that is a weighted sum of the pandas.DataFrame column values, with each column's weight taken from the dictionary.
For example:
import pandas as pd
import numpy as np
weights = {'IX1' : 0.3, 'IX2' : 0.2, 'IX3' : 0.4, 'IX4' : 0.1}
np.random.seed(0)
df = pd.DataFrame(np.random.randn(10, 3), columns=['IX1', 'IX2', 'IX3'])
# Desired output: combine the columns manually
df['Composite'] = df['IX1']*0.3 + df['IX2']*0.2 + df['IX3']*0.4
I would like the code to still run even if the pandas.DataFrame is missing some of the columns named in the dictionary (here, 'IX4').
Recommended answer
First, use Index.intersection to collect the names that are both columns of the DataFrame and keys of the dictionary. Then select those columns and use matrix multiplication with dot against a Series built from the dict and filtered to the same columns:
df['Composite'] = df['IX1']*0.3 + df['IX2']*0.2 + df['IX3']*0.4
cols = df.columns.intersection(weights.keys())
df['Composite1'] = df[cols].dot(pd.Series(weights)[cols])
print(df)
        IX1       IX2       IX3  Composite  Composite1
0  1.764052  0.400157  0.978738   1.000742    1.000742
1  2.240893  1.867558 -0.977278   0.654868    0.654868
2  0.950088 -0.151357 -0.103219   0.213468    0.213468
3  0.410599  0.144044  1.454274   0.733698    0.733698
4  0.761038  0.121675  0.443863   0.430192    0.430192
5  0.333674  1.494079 -0.205158   0.316855    0.316855
6  0.313068 -0.854096 -2.552990  -1.098095   -1.098095
7  0.653619  0.864436 -0.742165   0.072107    0.072107
8  2.269755 -1.454366  0.045759   0.408357    0.408357
9 -0.187184  1.532779  1.469359   0.838144    0.838144
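To make the "missing columns" behaviour explicit, the same approach can be wrapped in a small helper. This is only a sketch: the function name weighted_composite is hypothetical (it is not part of pandas), and it assumes that dictionary keys with no matching column should simply be ignored, as in the answer above. Passing list(weights) to Index.intersection and aligning the weights with reindex are small variations on the answer's code, not a different technique.

```python
import numpy as np
import pandas as pd


def weighted_composite(df, weights):
    """Weighted sum over the columns present in both df and weights.

    Hypothetical helper for illustration; not a pandas API.
    """
    # Names that are both DataFrame columns and dictionary keys.
    cols = df.columns.intersection(list(weights))
    # Align the weights to the selected columns so dot() sees matching labels.
    w = pd.Series(weights, dtype=float).reindex(cols)
    # Row-wise dot product: sum of column value * weight.
    return df[cols].dot(w)


weights = {'IX1': 0.3, 'IX2': 0.2, 'IX3': 0.4, 'IX4': 0.1}
np.random.seed(0)
df = pd.DataFrame(np.random.randn(10, 3), columns=['IX1', 'IX2', 'IX3'])

# 'IX4' has a weight but no column, so it is skipped.
full = weighted_composite(df, weights)

# Dropping another column still works; only IX1 and IX3 contribute.
partial = weighted_composite(df.drop(columns='IX2'), weights)
```

Note that the weights are not renormalised when columns are missing, so the result is the plain sum over whatever columns are available.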