I have a data frame that looks like this.
Key|Direction
:--|-------:
x | Sell
x | Buy
x | BUY
y | Sell
y | Sell
y | Sell
Z | Buy
Z | Buy
a | Buy
a | Sell
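What I want to do is create a third column: for all rows that share the same key, if that key has both a Buy and a Sell, the third column should say yes; otherwise it should say no. I was using groupby, but I found it hard to assign the values back to the data frame. This is what I would like the final column to look like: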
Key|Direction |Cross
:--|------- |------
x | Sell | yes
x | Buy | yes
x | BUY | yes
y | Sell | no
y | Sell | no
y | Sell | no
Z | Buy | no
Z | Buy | no
a | Buy | yes
a | Sell | yes
Best answer
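You can use groupby + transform to compare each key's set of directions with the target set, and then map the resulting booleans to strings with a dict:
# Labels for the boolean result.
d = {True: 'yes', False: 'no'}
# For each key, check whether its directions are exactly {'Buy', 'Sell'}.
df['Cross'] = (df.groupby('Key')['Direction']
                 .transform(lambda x: set(x) == {'Buy', 'Sell'})
                 .map(d))
print(df)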
Key Direction Cross
0 x Sell yes
1 x Buy yes
2 x Buy yes
3 y Sell no
4 y Sell no
5 y Sell no
6 Z Buy no
7 Z Buy no
8 a Buy yes
9 a Sell yes
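To try this end to end, here is a minimal sketch that builds the sample frame from the question. Note that the question's table has 'BUY' in the third row, while the printed output above shows 'Buy'. The set comparison is case-sensitive, so this sketch assumes 'BUY' and 'Buy' mean the same thing and normalizes case with str.capitalize() first. That normalization step is my assumption, not part of the original answer.

```python
import pandas as pd

# Sample data copied from the question, including the upper-case 'BUY'.
df = pd.DataFrame({
    'Key': ['x', 'x', 'x', 'y', 'y', 'y', 'Z', 'Z', 'a', 'a'],
    'Direction': ['Sell', 'Buy', 'BUY', 'Sell', 'Sell',
                  'Sell', 'Buy', 'Buy', 'Buy', 'Sell'],
})

# Assumption: 'BUY' and 'Buy' are the same direction, so normalize case.
direction = df['Direction'].str.capitalize()

# For each key, broadcast whether its directions are exactly {'Buy', 'Sell'}.
has_both = direction.groupby(df['Key']).transform(
    lambda x: set(x) == {'Buy', 'Sell'}
)

df['Cross'] = has_both.map({True: 'yes', False: 'no'})
print(df)
```

Without the capitalize() step, key x would hold the set {'Sell', 'Buy', 'BUY'}, which is not equal to {'Buy', 'Sell'}, so it would come out as no.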
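Another solution is to build a Series of sets with one entry per key, use map to look up each row's key so the result lines up with the new column, compare it with eq (==), and finally map the booleans with the dict:
d = {True: 'yes', False: 'no'}
# One set of directions per key.
s = df.groupby('Key')['Direction'].apply(set)
# Look up each row's key, compare its set with the target, then label the result.
df['Cross'] = df['Key'].map(s).eq({'Buy', 'Sell'}).map(d)
print(df)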
Key Direction Cross
0 x Sell yes
1 x Buy yes
2 x Buy yes
3 y Sell no
4 y Sell no
5 y Sell no
6 Z Buy no
7 Z Buy no
8 a Buy yes
9 a Sell yes
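If the chained one-liner is hard to follow, the same steps can be spelled out with named intermediates. This is only a sketch on a smaller, made-up frame; the names per_row, target and mask are mine, and I have swapped eq for an explicit element-wise comparison with a lambda so that each step is easy to inspect.

```python
import pandas as pd

# Smaller illustrative frame (not the full data from the question).
df = pd.DataFrame({
    'Key': ['x', 'x', 'y', 'y'],
    'Direction': ['Sell', 'Buy', 'Sell', 'Sell'],
})

# Step 1: one set of directions per key; the index of s is the key.
s = df.groupby('Key')['Direction'].apply(set)

# Step 2: look up each row's key in s, giving one set per row of df.
per_row = df['Key'].map(s)

# Step 3: compare each row's set with the target set.
target = {'Buy', 'Sell'}
mask = per_row.map(lambda v: v == target)

# Step 4: turn the booleans into labels.
df['Cross'] = mask.map({True: 'yes', False: 'no'})
print(df)
```

Because s is indexed by key, df['Key'].map(s) is what carries the per-group result back onto the original rows, which is the part the question was struggling with.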
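A similar solution with numpy.where:
import numpy as np

s = df.groupby('Key')['Direction'].apply(set)
# np.where picks 'yes' where the comparison is True and 'no' elsewhere.
df['Cross'] = np.where(df['Key'].map(s).eq({'Buy', 'Sell'}), 'yes', 'no')
print(df)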
Key Direction Cross
0 x Sell yes
1 x Buy yes
2 x Buy yes
3 y Sell no
4 y Sell no
5 y Sell no
6 Z Buy no
7 Z Buy no
8 a Buy yes
9 a Sell yes
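For what it's worth, a variation that avoids Python sets altogether is to flag Buy rows and Sell rows separately and ask, per key, whether any row is flagged. This is not from the answer above, just a sketch under the assumption that "cross" means the key has at least one Buy and at least one Sell. Unlike the exact set comparison, it would still say yes if a key also had some third direction value.

```python
import numpy as np
import pandas as pd

# Sample data as shown in the printed output above (third row is 'Buy').
df = pd.DataFrame({
    'Key': ['x', 'x', 'x', 'y', 'y', 'y', 'Z', 'Z', 'a', 'a'],
    'Direction': ['Sell', 'Buy', 'Buy', 'Sell', 'Sell',
                  'Sell', 'Buy', 'Buy', 'Buy', 'Sell'],
})

# Row-level flags for each direction.
is_buy = df['Direction'].eq('Buy')
is_sell = df['Direction'].eq('Sell')

# Per key: does any row have a Buy / a Sell? transform broadcasts to every row.
has_buy = is_buy.groupby(df['Key']).transform('any')
has_sell = is_sell.groupby(df['Key']).transform('any')

df['Cross'] = np.where(has_buy & has_sell, 'yes', 'no')
print(df)
```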