This article shows how to keep only the rows of a pandas DataFrame whose text starts with certain strings.

Problem description

Background

I have the following df:

import pandas as pd
df = pd.DataFrame({'Text': ['\n[SPORTS FAN]\nHere',
                            'Nothing here',
                            '\n[BASEBALL]\nTHIS SOUNDS right',
                            '\n[SPORTS FAN]\nLikes sports',
                            'Nothing is here',
                            '\n[NOT SPORTS]\nTHIS SOUNDS good',
                            '\n[SPORTS FAN]\nReally Big big fan',
                            '\n[BASEBALL]\nRARELY IS a fan'],
                   'P_ID': [1, 2, 3, 4, 5, 6, 7, 8],
                   'P_Name': ['J J SMITH',
                              'J J SMITH',
                              'J J SMITH',
                              'J J SMITH',
                              'MARY HYDER',
                              'MARY HYDER',
                              'MARY HYDER',
                              'MARY HYDER']
                   })

Output

P_ID    P_Name      Text
0   1   J J SMITH   \n[SPORTS FAN]\nHere
1   2   J J SMITH   Nothing here
2   3   J J SMITH   \n[BASEBALL]\nTHIS SOUNDS right
3   4   J J SMITH   \n[SPORTS FAN]\nLikes sports
4   5   MARY HYDER  Nothing is here
5   6   MARY HYDER  \n[NOT SPORTS]\nTHIS SOUNDS good
6   7   MARY HYDER  \n[SPORTS FAN]\nReally Big big fan
7   8   MARY HYDER  \n[BASEBALL]\nRARELY IS a fan

Goal

Keep the rows whose Text starts with '\n[SPORTS FAN]\n' or '\n[BASEBALL]\n'.

Desired output

P_ID    P_Name      Text
0   1   J J SMITH   \n[SPORTS FAN]\nHere
2   3   J J SMITH   \n[BASEBALL]\nTHIS SOUNDS right
3   4   J J SMITH   \n[SPORTS FAN]\nLikes sports
6   7   MARY HYDER  \n[SPORTS FAN]\nReally Big big fan
7   8   MARY HYDER  \n[BASEBALL]\nRARELY IS a fan

Question

How do I achieve the desired output?

Recommended answer

Try this:

df_new = df.loc[df['Text'].str.startswith('\n[SPORTS FAN]') | df['Text'].str.startswith('\n[BASEBALL]')]

No regular expression is needed.
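The same idea can be generalized to any number of prefixes. The sketch below is only an illustration: the helper name keep_rows_starting_with is hypothetical and not part of the original answer, the data is a reduced copy of the question's df (the P_Name column is left out), and na=False is an added assumption so that missing values count as non-matching.

```python
import pandas as pd

# Reduced copy of the question's data; P_Name is omitted for brevity.
df = pd.DataFrame({
    'Text': ['\n[SPORTS FAN]\nHere',
             'Nothing here',
             '\n[BASEBALL]\nTHIS SOUNDS right',
             '\n[SPORTS FAN]\nLikes sports',
             'Nothing is here',
             '\n[NOT SPORTS]\nTHIS SOUNDS good',
             '\n[SPORTS FAN]\nReally Big big fan',
             '\n[BASEBALL]\nRARELY IS a fan'],
    'P_ID': [1, 2, 3, 4, 5, 6, 7, 8],
})


# Hypothetical helper, not from the original answer.
def keep_rows_starting_with(frame, column, prefixes):
    """Return the rows of `frame` whose `column` starts with any of `prefixes`."""
    # Start with an all-False mask aligned to the frame's index.
    mask = pd.Series(False, index=frame.index)
    for prefix in prefixes:
        # Plain string comparison, no regex; na=False treats missing values as non-matching.
        mask |= frame[column].str.startswith(prefix, na=False)
    return frame.loc[mask]


df_new = keep_rows_starting_with(df, 'Text', ['\n[SPORTS FAN]\n', '\n[BASEBALL]\n'])
print(df_new.index.tolist())  # [0, 2, 3, 6, 7]
```

Because the original index labels are kept, the result lines up with the desired output above; call reset_index(drop=True) on it if a fresh 0-based index is preferred.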

