This post shows how to keep only the rows of a pandas DataFrame whose text begins with certain strings.
Problem description
Background
I have the following df:
import pandas as pd
df = pd.DataFrame({'Text' : ['\n[SPORTS FAN]\nHere',
'Nothing here',
'\n[BASEBALL]\nTHIS SOUNDS right',
'\n[SPORTS FAN]\nLikes sports',
'Nothing is here',
'\n[NOT SPORTS]\nTHIS SOUNDS good',
'\n[SPORTS FAN]\nReally Big big fan',
'\n[BASEBALL]\nRARELY IS a fan'
],
'P_ID': [1,2,3,4,5,6,7,8],
'P_Name' : ['J J SMITH',
'J J SMITH',
'J J SMITH',
'J J SMITH',
'MARY HYDER',
'MARY HYDER',
'MARY HYDER',
'MARY HYDER']
})
Output
P_ID P_Name Text
0 1 J J SMITH \n[SPORTS FAN]\nHere
1 2 J J SMITH Nothing here
2 3 J J SMITH \n[BASEBALL]\nTHIS SOUNDS right
3 4 J J SMITH \n[SPORTS FAN]\nLikes sports
4 5 MARY HYDER Nothing is here
5 6 MARY HYDER \n[NOT SPORTS]\nTHIS SOUNDS good
6 7 MARY HYDER \n[SPORTS FAN]\nReally Big big fan
7 8 MARY HYDER \n[BASEBALL]\nRARELY IS a fan
Goal
Keep only the rows whose Text starts with '\n[SPORTS FAN]\n' or '\n[BASEBALL]\n'.
Desired output
P_ID P_Name Text
0 1 J J SMITH \n[SPORTS FAN]\nHere
2 3 J J SMITH \n[BASEBALL]\nTHIS SOUNDS right
3 4 J J SMITH \n[SPORTS FAN]\nLikes sports
6 7 MARY HYDER \n[SPORTS FAN]\nReally Big big fan
7 8 MARY HYDER \n[BASEBALL]\nRARELY IS a fan
Question
How do I achieve the desired output?
Recommended answer
Try this:
df_new = df.loc[df['Text'].str.startswith('\n[SPORTS FAN]') | df['Text'].str.startswith('\n[BASEBALL]')]
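No regular expression is needed: str.startswith compares the prefix literally, so the square brackets do not have to be escaped.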
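If the list of prefixes grows, chaining one startswith call per prefix with | becomes unwieldy. The sketch below is one possible generalization, not part of the original answer: it relies on Python's built-in str.startswith accepting a tuple of prefixes and applies it to each value with Series.map. The name prefixes is introduced here for illustration, and the code assumes every value in Text is a string (a missing value such as NaN would raise an error).

```python
import pandas as pd

# Same data as in the question.
df = pd.DataFrame({
    'Text': ['\n[SPORTS FAN]\nHere',
             'Nothing here',
             '\n[BASEBALL]\nTHIS SOUNDS right',
             '\n[SPORTS FAN]\nLikes sports',
             'Nothing is here',
             '\n[NOT SPORTS]\nTHIS SOUNDS good',
             '\n[SPORTS FAN]\nReally Big big fan',
             '\n[BASEBALL]\nRARELY IS a fan'],
    'P_ID': [1, 2, 3, 4, 5, 6, 7, 8],
    'P_Name': ['J J SMITH'] * 4 + ['MARY HYDER'] * 4,
})

# Hypothetical name: the prefixes to keep, as a tuple.
# Python's str.startswith accepts a tuple and returns True
# if the string starts with any of its members.
prefixes = ('\n[SPORTS FAN]', '\n[BASEBALL]')

# Build a boolean mask row by row (assumes all values are strings).
mask = df['Text'].map(lambda s: s.startswith(prefixes))

# Keep only the matching rows; the original index is preserved.
df_new = df.loc[mask]
print(df_new['P_ID'].tolist())  # [1, 3, 4, 7, 8]
```

The result keeps the original index labels (0, 2, 3, 6, 7), matching the desired output above; call reset_index(drop=True) afterwards if a fresh index is wanted.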