I have a pandas DataFrame in which one column holds class instances, as in the code below:

import pandas as pd
class rec:
    def test(self, a):
        return a
class rec1:
    def test(self, a):
        return a*3
x = rec()
y = rec1()
objs = [x, y]  # named objs to avoid shadowing the built-in list
df = pd.DataFrame(objs, columns=['first'])
df['second'] = ['a1', 'b1']

print(df)
                                          first second
0   <__main__.rec object at 0x000000180AAE9208>     a1
1  <__main__.rec1 object at 0x000000180AACBEB8>     b1


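Now I want to create a new column by calling the "test" method of each object in the "first" column, taking the argument for "test" from the "second" column.
This loop works: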

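df['third'] = None  # object-dtype placeholder column
for i in df.index:
    # .loc avoids chained assignment, which may write to a temporary copy
    df.loc[i, 'third'] = df.loc[i, 'first'].test(df.loc[i, 'second'])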


But I would like to know whether I can avoid the loop and use something closer to the following (which does not work):

df['third'] = df['first'].test(df['second'])


Any suggestions? Thanks.

Best answer

This is actually not hard. You can use np.vectorize:

import numpy as np

# Wrap the method call so it is applied element-wise across both columns
f = lambda x, y: x.test(y)
v = np.vectorize(f)

df['third'] = v(df['first'], df['second'])

df
                                   first second   third
0   <__main__.rec object at 0x1038b1ef0>     a1      a1
1  <__main__.rec1 object at 0x1038b1c18>     b1  b1b1b1
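
One caveat worth noting: the NumPy documentation describes np.vectorize as a convenience wrapper that is essentially a for loop, so it tidies the call but should not be expected to run faster than the original loop. As a rough sketch of two pandas-only alternatives (the column name "third_apply" is made up here purely for illustration, and the DataFrame is rebuilt so the snippet runs on its own), the same result can be produced with a list comprehension over zip or with a row-wise apply:

```python
import pandas as pd

class rec:
    def test(self, a):
        return a

class rec1:
    def test(self, a):
        return a * 3

# Rebuild the example frame from the question so this snippet is self-contained.
df = pd.DataFrame({'first': [rec(), rec1()], 'second': ['a1', 'b1']})

# Pair each object with its argument row by row; still a Python-level loop.
df['third'] = [obj.test(arg) for obj, arg in zip(df['first'], df['second'])]

# Row-wise equivalent with DataFrame.apply; 'third_apply' is a hypothetical
# column name used only to show both results side by side.
df['third_apply'] = df.apply(lambda row: row['first'].test(row['second']), axis=1)

print(df[['second', 'third', 'third_apply']])
```

All three approaches call "test" once per row in Python, so the choice between them is mostly a matter of readability.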

10-04 21:47