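I am trying to write a regular expression that finds a keyword in a long string, but only when the keyword is not surrounded by letters. A keyword next to a dash or an underscore is fine, as long as no letter is adjacent to it. A single occurrence of any keyword counts as a match; I only care whether it appears somewhere in the string. Currently I cannot get True when the keyword has a '_' next to it. Any ideas for a better expression?
EDIT: I found a case that needs to be True and had not added it to the examples.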
import re
key_words = ['go', 'at', 'why', 'stop' ]
false_match = ['going_get_that', 'that_is_wstop', 'whysper','stoping_tat' ]
positive_match = ['go-around', 'go_at_going','stop-by_the_store', 'stop','something-stop', 'something_stop']
pattern = r"\b(%s)\b" % '|'.join(key_words)
for word in false_match + positive_match:
    if re.match(pattern, word):
        print(True, word)
    else:
        print(False, word)
Current output:
False going_get_that
False that_is_wstop
False whysper
False stoping_tat
True go-around
False go_at_going
True stop-by_the_store
True stop
EDIT: these must be True
False something-stop
False something_stop
Desired output:
False going_get_that
False that_is_wstop
False whysper
False stoping_tat
True go-around
True go_at_going
True stop-by_the_store
True stop
True something-stop
True something_stop
Best answer
Use negative lookarounds (lookahead and lookbehind):
import re
key_words = ['go', 'at', 'why', 'stop' ]
false_match = ['going_get_that', 'that_is_wstop', 'whysper','stoping_tat' ]
positive_match = ['go-around', 'go_at_going','stop-by_the_store', 'stop', 'something-stop', 'something_stop']
pattern = r"(?<![a-zA-Z])(%s)(?![a-zA-Z])" % '|'.join(key_words)
for word in false_match + positive_match:
    # re.search scans the whole string; re.match only tests at position 0
    if re.search(pattern, word):
        print(True, word)
    else:
        print(False, word)
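
To illustrate why the original pattern fails, here is a minimal sketch comparing the two approaches. In Python's `re`, `\b` marks a boundary between a `\w` and a non-`\w` character, and `_` counts as `\w`, so there is no boundary between `_` and `stop`. Separately, `re.match` only tests at the start of the string, which is why `something-stop` was also False. The helper name `has_keyword` and the use of `re.escape` are assumptions added for illustration; they are not part of the original answer.

```python
import re

key_words = ['go', 'at', 'why', 'stop']

# re.escape is an added precaution (not in the original answer) in case a
# keyword ever contains regex metacharacters.
alternation = '|'.join(re.escape(k) for k in key_words)

# Question's pattern: "_" is a word character, so \b does not fire next to it.
boundary = re.compile(r"\b(%s)\b" % alternation)

# Answer's pattern: only an adjacent ASCII letter disqualifies a keyword.
lookaround = re.compile(r"(?<![a-zA-Z])(%s)(?![a-zA-Z])" % alternation)


def has_keyword(text):
    """Hypothetical helper: True if any keyword is not flanked by letters."""
    return lookaround.search(text) is not None


print(boundary.search('something_stop') is not None)  # False
print(has_keyword('something_stop'))                  # True
print(has_keyword('go_at_going'))                     # True
print(has_keyword('whysper'))                         # False
```

Note that the lookarounds only exclude ASCII letters; digits and non-ASCII letters next to a keyword would still be accepted, which may or may not be what is wanted.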
Regarding "python - finding a word between key characters", a similar question was found on Stack Overflow: https://stackoverflow.com/questions/29061479/