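I have two dataframes that I'm working with: one contains a list of players, and the other contains play-by-play data for the players in the first dataframe. Portions of the rows of interest from both dataframes are shown below.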

0          Matt Carpenter
1           Jason Heyward
2           Peter Bourjos
3           Matt Holliday
4          Jhonny Peralta
5              Matt Adams
...
Name: Name, dtype: object


0     Matt Carpenter grounded out to second (Grounder).
1               Jason Heyward doubled to right (Liner).
2     Matt Holliday singled to right (Liner). Jason Heyward scored.
...
Name: Play, dtype: object

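What I'm trying to do is create a column in the first dataframe that counts the number of times the string (df['Name'] + ' scored') appears in a column of the other dataframe. For example, it would search for instances of "Matt Carpenter scored", "Jason Heyward scored", and so on. I know you can use str.contains for this kind of thing, but it only seems to work when you pass in an explicit string. For example,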
batter_game_logs_df['R vs SP'] = len(play_by_play_SP_df[play_by_play_SP_df['Play'].str.contains('Jason Heyward scored')].index)

works fine, but if I try
batter_game_logs_df['R vs SP'] = len(play_by_play_SP_df[play_by_play_SP_df['Play'].str.contains(batter_game_logs_df['Name'].astype(str) + ' scored')].index)

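it returns the error "'Series' objects are mutable, thus they cannot be hashed". I've looked at various similar questions, but I cannot for the life of me find a solution to this problem. Any help would be greatly appreciated, thank you!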

Best answer

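I think you need to join all the values with | into a single regex for findall, then create indicator columns with MultiLabelBinarizer, and add any missing columns with reindex: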

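import re
import pandas as pd
from sklearn.preprocessing import MultiLabelBinarizer

# df1 holds the player names, df2 holds the play-by-play text
s = df1['Name'] + ' scored'
# group the alternatives so \b applies to every phrase, and escape regex metacharacters
pat = r'\b(?:{})\b'.format('|'.join(map(re.escape, s)))

# one indicator column per matched phrase; reindex adds phrases that never matched
mlb = MultiLabelBinarizer()
df = pd.DataFrame(mlb.fit_transform(df2['Play'].str.findall(pat)),
                  columns=mlb.classes_,
                  index=df2.index).reindex(columns=s, fill_value=0)
print(df)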
Name  Matt Carpenter scored  Jason Heyward scored  Peter Bourjos scored  \
0                         0                     0                     0
1                         0                     0                     0
2                         0                     1                     0

Name  Matt Holliday scored  Jhonny Peralta scored  Matt Adams scored
0                        0                      0                  0
1                        0                      0                  0
2                        0                      0                  0
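
As a rough illustration of the intermediate step, the sketch below shows what str.findall returns for each play before MultiLabelBinarizer turns it into indicator columns. The Series names and plays are toy stand-ins (an assumption) built from the sample rows in the question. Each phrase is escaped with re.escape and the alternation is wrapped in a non-capturing group so that findall returns the full matched phrase.

```python
import re
import pandas as pd

# Toy data copied from the sample rows in the question (stand-ins, not the real frames).
names = pd.Series(["Matt Carpenter", "Jason Heyward", "Matt Holliday"], name="Name")
plays = pd.Series([
    "Matt Carpenter grounded out to second (Grounder).",
    "Jason Heyward doubled to right (Liner).",
    "Matt Holliday singled to right (Liner). Jason Heyward scored.",
], name="Play")

# Build one alternation of "<name> scored" phrases.
phrases = names + " scored"
pat = r"\b(?:{})\b".format("|".join(map(re.escape, phrases)))

# For every play, findall gives the list of phrases that occur in it.
matches = plays.str.findall(pat)
print(matches.tolist())
# [[], [], ['Jason Heyward scored']]
```

Each list is then the set of "labels" for that play, which is the input MultiLabelBinarizer expects.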

Finally, if necessary, join the result to the original DataFrame:
df = df2.join(df)
print (df)
                                                Play  Matt Carpenter scored  \
0  Matt Carpenter grounded out to second (Grounder).                      0
1            Jason Heyward doubled to right (Liner).                      0
2  Matt Holliday singled to right (Liner). Jason ...                      0

   Jason Heyward scored  Peter Bourjos scored  Matt Holliday scored  \
0                     0                     0                     0
1                     0                     0                     0
2                     1                     0                     0

   Jhonny Peralta scored  Matt Adams scored
0                      0                  0
1                      0                  0
2                      0                  0
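
If the goal is the single per-player count column described in the question, a plain pandas alternative may be enough. This is a minimal sketch, not the answer's method: df1 and df2 are toy stand-ins for batter_game_logs_df and play_by_play_SP_df, and count_scored is a hypothetical helper. It calls str.contains once per name with a plain string (which str.contains accepts, unlike a Series) and counts the plays in which that player scored.

```python
import pandas as pd

# Toy stand-ins for the two frames in the question (assumed shapes).
df1 = pd.DataFrame({"Name": ["Matt Carpenter", "Jason Heyward",
                             "Peter Bourjos", "Matt Holliday"]})
df2 = pd.DataFrame({"Play": [
    "Matt Carpenter grounded out to second (Grounder).",
    "Jason Heyward doubled to right (Liner).",
    "Matt Holliday singled to right (Liner). Jason Heyward scored.",
]})

def count_scored(name):
    """Hypothetical helper: number of plays containing '<name> scored'."""
    # regex=False treats the phrase as a literal substring, so no escaping is needed.
    return int(df2["Play"].str.contains(name + " scored", regex=False).sum())

# map applies the helper to each name, giving one count per row of df1.
df1["R vs SP"] = df1["Name"].map(count_scored)
print(df1)
```

With the sample rows above, only Jason Heyward gets a count of 1. This scans the play-by-play column once per player, so for a very large roster the single-regex approach from the answer may scale better.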
