I have a DataFrame that looks like this:

Key|Direction
:--|-------:
x  | Sell
x  | Buy
x  | BUY
y  | Sell
y  | Sell
y  | Sell
Z  | Buy
Z  | Buy
a  | Buy
a  | Sell


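What I want to do is create a third column such that, for all rows sharing the same key, the column says yes if that key has both a Buy and a Sell, and no otherwise. I was using groupby, but I found it hard to assign the values back to the DataFrame. This is what I want the final column to look like: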

Key|Direction |Cross
:--|-------   |------
x  | Sell     | yes
x  | Buy      | yes
x  | BUY      | yes
y  | Sell     | no
y  | Sell     | no
y  | Sell     | no
Z  | Buy      | no
Z  | Buy      | no
a  | Buy      | yes
a  | Sell     | yes

Best answer

You can use groupby + transform to compare sets, and finally map the boolean result with a dict:

d = {True: 'yes', False: 'no'}
# True for every row of a group whose set of directions is exactly {'Buy', 'Sell'}
df['Cross'] = (df.groupby('Key')['Direction']
                 .transform(lambda x: set(x) == {'Buy', 'Sell'})
                 .map(d))
print(df)
  Key Direction Cross
0   x      Sell   yes
1   x       Buy   yes
2   x       Buy   yes
3   y      Sell    no
4   y      Sell    no
5   y      Sell    no
6   Z       Buy    no
7   Z       Buy    no
8   a       Buy   yes
9   a      Sell   yes
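
Note that the sample data in the question contains `BUY` in upper case, while the output above shows `Buy`. Set comparison is case-sensitive, so with the raw data group `x` would come out as `no`. The following is a minimal sketch, assuming the upper-case value is just an inconsistency that should be normalized first; the `direction` and `has_both` names are illustrative and not part of the original answer.

```python
import pandas as pd

# Sample data as given in the question, including the upper-case 'BUY'.
df = pd.DataFrame({
    'Key': ['x', 'x', 'x', 'y', 'y', 'y', 'Z', 'Z', 'a', 'a'],
    'Direction': ['Sell', 'Buy', 'BUY', 'Sell', 'Sell',
                  'Sell', 'Buy', 'Buy', 'Buy', 'Sell'],
})

# Assumption: 'BUY' and 'Buy' mean the same thing, so normalize the
# spelling before comparing ('BUY' -> 'Buy', 'sell' -> 'Sell').
direction = df['Direction'].str.strip().str.capitalize()

# Group the normalized values by Key and broadcast the per-group
# result back to every row of that group.
has_both = direction.groupby(df['Key']).transform(
    lambda x: set(x) == {'Buy', 'Sell'}
)
df['Cross'] = has_both.map({True: 'yes', False: 'no'})
print(df)
```

The original `Direction` column is left untouched here; only the temporary `direction` Series is normalized.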


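Another solution is to create a Series of sets, map it onto the Key column to build a new Series, compare it with eq (==), and finally map the result with a dict: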

d = {True: 'yes', False: 'no'}
# one set of directions per key, indexed by Key
s = df.groupby('Key')['Direction'].apply(set)
df['Cross'] = df['Key'].map(s).eq({'Buy', 'Sell'}).map(d)
print(df)
  Key Direction Cross
0   x      Sell   yes
1   x       Buy   yes
2   x       Buy   yes
3   y      Sell    no
4   y      Sell    no
5   y      Sell    no
6   Z       Buy    no
7   Z       Buy    no
8   a       Buy   yes
9   a      Sell   yes
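
Since the question mentions difficulty assigning the groupby result back to the DataFrame, it may help to spell out the intermediate steps of the map approach. This is only a sketch: it uses the cleaned data from the answer's output (with `Buy` rather than `BUY`), and it swaps the `eq` comparison for an explicit lambda so that each step is visible. The `per_key` name is illustrative.

```python
import pandas as pd

df = pd.DataFrame({
    'Key': ['x', 'x', 'x', 'y', 'y', 'y', 'Z', 'Z', 'a', 'a'],
    'Direction': ['Sell', 'Buy', 'Buy', 'Sell', 'Sell',
                  'Sell', 'Buy', 'Buy', 'Buy', 'Sell'],
})

# Step 1: one set of directions per key. The index of `s` holds the
# unique Key values, not the original row labels.
s = df.groupby('Key')['Direction'].apply(set)

# Step 2: decide True/False once per key.
per_key = s.apply(lambda directions: directions == {'Buy', 'Sell'})

# Step 3: look up each row's Key in the per-key result, which gives a
# Series aligned with df, then translate the booleans to 'yes'/'no'.
df['Cross'] = df['Key'].map(per_key).map({True: 'yes', False: 'no'})
print(df)
```

The key point is that `s` and `per_key` have one entry per key, so they cannot be assigned to `df` directly; `df['Key'].map(...)` is what expands them back to one value per row.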




A similar solution with numpy.where:

import numpy as np

s = df.groupby('Key')['Direction'].apply(set)
df['Cross'] = np.where(df['Key'].map(s).eq({'Buy', 'Sell'}), 'yes', 'no')
print(df)
  Key Direction Cross
0   x      Sell   yes
1   x       Buy   yes
2   x       Buy   yes
3   y      Sell    no
4   y      Sell    no
5   y      Sell    no
6   Z       Buy    no
7   Z       Buy    no
8   a       Buy   yes
9   a      Sell   yes
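
A variation on the numpy.where idea that avoids building sets is sketched below. It is not from the original answer: it checks separately whether each key has at least one `Buy` and at least one `Sell`, using `transform('any')` on boolean Series. Unlike the set equality above, this version would still say yes for a key that also contained some third value; whether that is desirable depends on the data. The `has_buy` and `has_sell` names are illustrative.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    'Key': ['x', 'x', 'x', 'y', 'y', 'y', 'Z', 'Z', 'a', 'a'],
    'Direction': ['Sell', 'Buy', 'Buy', 'Sell', 'Sell',
                  'Sell', 'Buy', 'Buy', 'Buy', 'Sell'],
})

# For each row: does the row's Key group contain at least one 'Buy'?
has_buy = df['Direction'].eq('Buy').groupby(df['Key']).transform('any')
# Same check for 'Sell'.
has_sell = df['Direction'].eq('Sell').groupby(df['Key']).transform('any')

# Both flags are already aligned with df, so they can be combined row-wise.
df['Cross'] = np.where(has_buy & has_sell, 'yes', 'no')
print(df)
```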
