How can I apply several functions to a DataFrame at once?
I would like to do something like this:

features_df[features_columns].apply(lambda x: np.mean(x), lambda x: np.std(x), lambda x: np.skew(x))

Thanks.

Best answer

I think you need DataFrame.aggregate (pandas 0.20.0+) or DataFrame.apply:

features_df[features_columns].agg(lambda x: pd.Series([np.mean(x),np.std(x)]))

features_df[features_columns].apply(lambda x: pd.Series([np.mean(x),np.std(x)]))

df = features_df[features_columns].agg(['mean', 'std', 'skew'])

df = features_df[features_columns].apply(['mean', 'std', 'skew'])

Sample:
import numpy as np
import pandas as pd

features_df = pd.DataFrame({'A':list('abcdef'),
                           'B':[4,5,4,5,5,4],
                           'C':[7,8,9,4,2,3],
                           'D':[1,3,5,7,1,0],
                           'E':[5,3,6,9,2,4],
                           'F':list('aaabbb')})

print (features_df)
   A  B  C  D  E  F
0  a  4  7  1  5  a
1  b  5  8  3  3  a
2  c  4  9  5  6  a
3  d  5  4  7  9  b
4  e  5  2  1  2  b
5  f  4  3  0  4  b

features_columns = ['B','C']


print (features_df[features_columns].agg(lambda x: pd.Series([np.mean(x),np.std(x)])))
     B         C
0  4.5  5.500000
1  0.5  2.629956

print (features_df[features_columns].apply(lambda x: pd.Series([np.mean(x),np.std(x)])))
     B         C
0  4.5  5.500000
1  0.5  2.629956

print (features_df[features_columns].agg(['mean', 'std', 'skew']))
             B         C
mean  4.500000  5.500000
std   0.547723  2.880972
skew  0.000000  0.000000

print (features_df[features_columns].apply(['mean', 'std', 'skew']))
             B         C
mean  4.500000  5.500000
std   0.547723  2.880972
skew  0.000000  0.000000

The std function has a different default ddof in numpy (ddof=0) than in pandas (ddof=1), so the outputs differ.
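The difference in defaults can be checked directly. The following is a minimal sketch using column C from the sample; the variable names are only illustrative and are not part of the original answer.

```python
import numpy as np
import pandas as pd

# Column C from the sample DataFrame
values = pd.Series([7, 8, 9, 4, 2, 3])

# NumPy divides by N by default (ddof=0, population standard deviation)
population_std = np.std(values.to_numpy())

# pandas divides by N - 1 by default (ddof=1, sample standard deviation)
sample_std = values.std()

# Passing ddof explicitly makes the two libraries agree
numpy_sample_std = np.std(values.to_numpy(), ddof=1)
pandas_population_std = values.std(ddof=0)

print(population_std, sample_std)
print(numpy_sample_std, pandas_population_std)
```

If the two results need to match, set ddof explicitly on whichever side is more convenient.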
Also, np.skew raises:
AttributeError: module 'numpy' has no attribute 'skew'
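Since NumPy has no skew function, one way to keep the lambda style from the question is to call the pandas Series methods inside it. This is only a sketch under that assumption: the row labels and the name `stats` are illustrative. scipy.stats.skew is another option, but its default is the biased estimator, so its result can differ from pandas.

```python
import pandas as pd

# Numeric columns B and C from the sample DataFrame
features_df = pd.DataFrame({'B': [4, 5, 4, 5, 5, 4],
                            'C': [7, 8, 9, 4, 2, 3]})

# Each column is passed to the lambda as a Series, so the pandas
# methods mean(), std() and skew() are available on x.
stats = features_df.apply(
    lambda x: pd.Series([x.mean(), x.std(), x.skew()],
                        index=['mean', 'std', 'skew']))

print(stats)
```

The result has one row per statistic and one column per input column, the same layout as agg(['mean', 'std', 'skew']).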

Regarding "python - Apply several functions on DataFrame columns at once", a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/45412816/
