I am trying to create duplicated rows in a DataFrame based on a condition.

For example, I have this DataFrame:

team    student
 a      Ursula
 b      Hayfa, Martin
 c      Kato
 d      Tanek, Ava, Pyto
 e      Aiko
 f      Hunter
 g      Josiah, Derek, Uma, Nell


Desired output:

  team  student                   name      remark
   a    Ursula                    Ursula
   b    Hayfa, Martin             Hayfa     with Martin
   b    Hayfa, Martin             Martin    with Hayfa
   c    Kato                      Kato
   d    Tanek, Ava, Pyto          Tanek     with Ava, Pyto
   d    Tanek, Ava, Pyto          Ava       with Tanek, Pyto
   d    Tanek, Ava, Pyto          Pyto      with Tanek, Ava
   e    Aiko                      Aiko
   f    Hunter                    Hunter
   g    Josiah, Derek, Uma, Nell  Josiah    with Derek, Uma, Nell
   g    Josiah, Derek, Uma, Nell  Derek     with Josiah, Uma, Nell
   g    Josiah, Derek, Uma, Nell  Uma       with Josiah, Derek, Nell
   g    Josiah, Derek, Uma, Nell  Nell      with Josiah, Derek, Uma

Best answer

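For pandas 0.25+, you can use DataFrame.explode on the values split by Series.str.split, and build the remark column with a list comprehension that filters out each row's own name: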

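# Split each 'student' string into a list of names
s = df['student'].str.split(', ')
# Keep the full list in 'remark', then explode 'name' into one row per student
df = df.assign(name=s, remark=s).explode('name').reset_index(drop=True)
# Build the remark from the teammates: everyone in the list except the row's own name
df['remark'] = ['with ' + ', '.join(x for x in b if x != a)
                if len(b) > 1
                else ''
                for a, b in zip(df['name'], df['remark'])]
print(df)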
   team                   student    name                    remark
0     a                    Ursula  Ursula
1     b             Hayfa, Martin   Hayfa               with Martin
2     b             Hayfa, Martin  Martin                with Hayfa
3     c                      Kato    Kato
4     d          Tanek, Ava, Pyto   Tanek            with Ava, Pyto
5     d          Tanek, Ava, Pyto     Ava          with Tanek, Pyto
6     d          Tanek, Ava, Pyto    Pyto           with Tanek, Ava
7     e                      Aiko    Aiko
8     f                    Hunter  Hunter
9     g  Josiah, Derek, Uma, Nell  Josiah     with Derek, Uma, Nell
10    g  Josiah, Derek, Uma, Nell   Derek    with Josiah, Uma, Nell
11    g  Josiah, Derek, Uma, Nell     Uma  with Josiah, Derek, Nell
12    g  Josiah, Derek, Uma, Nell    Nell   with Josiah, Derek, Uma
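To try the approach end to end, here is a minimal self-contained sketch. The function name explode_students and its sep parameter are hypothetical, introduced only for illustration, and the sample frame is a shortened version of the one in the question. It assumes pandas 0.25 or later (for DataFrame.explode) and that names are unique within a team, since the filter drops every entry equal to the row's own name.

```python
import pandas as pd


def explode_students(df, sep=", "):
    """Return one row per student, with a 'remark' listing the teammates.

    Hypothetical helper for illustration; not part of pandas.
    """
    # Split 'student' into lists of names, e.g. "Hayfa, Martin" -> ["Hayfa", "Martin"]
    names = df["student"].str.split(sep)
    # Store the list in both new columns, then explode only 'name',
    # so 'remark' still holds the full member list on every row.
    out = (
        df.assign(name=names, remark=names)
          .explode("name")
          .reset_index(drop=True)
    )
    # Replace the member list with "with <teammates>", or "" for single-member teams.
    out["remark"] = [
        "with " + ", ".join(m for m in members if m != name) if len(members) > 1 else ""
        for name, members in zip(out["name"], out["remark"])
    ]
    return out


# Shortened sample data taken from the question
df = pd.DataFrame({
    "team": ["a", "b", "d"],
    "student": ["Ursula", "Hayfa, Martin", "Tanek, Ava, Pyto"],
})
result = explode_students(df)
print(result)
```

Wrapping the steps in a function leaves the original frame untouched, which makes it easier to re-run on other input or with a different separator.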

This question, "python - Pandas - duplicate rows and slice strings", comes from a similar question on Stack Overflow: https://stackoverflow.com/questions/58536958/
