This article covers how to find occurrences of words from a list in a text.

Problem description

Say I have a list of allowed words/phrases:

'Stack'
'Overflow'
'Stack Overflow'
'Stack Exchange'
'Exchange'

and the following text to parse:

'Hello, and welcome to Stack Overflow.
 Here are some words which should match: Stack, Exchange.'

I'd like to get the list of words which are found in the allowed list:

  • 'Stack Overflow'
  • 'Stack'
  • 'Exchange'

What would be the best way to achieve the result?

The allowed list I'll be using could be at least a thousand words/phrases.

Recommended answer

Put the allowed words in a list, split the text into words, and take the intersection of the two:

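def intersect(x, y):
    # Return the items common to both sequences (order is not preserved).
    return list(set(x) & set(y))

# `text` is the string to parse; `words` is the list of allowed words.
# str.split() splits on whitespace, so punctuation stays attached to tokens.
word_list_text = text.split()
words_found = intersect(word_list_text, words)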
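Note that splitting on whitespace leaves punctuation attached ('Overflow.', 'Stack,') and cannot match multi-word phrases such as 'Stack Overflow'. The following is a minimal sketch of one alternative, not the answer's own method: it builds a single regular expression from the allowed list, trying longer phrases first. The function name `find_allowed` is hypothetical, and the sketch assumes matching is case-sensitive and that a phrase already matched (for example 'Stack Overflow') should not also report the words inside it.

```python
import re

def find_allowed(text, allowed):
    """Return allowed words/phrases found in text, in order of first appearance."""
    # Try longer phrases first so 'Stack Overflow' wins over 'Stack'.
    ordered = sorted(set(allowed), key=len, reverse=True)
    if not ordered:
        return []
    # \b anchors keep matches on word boundaries; re.escape guards
    # against regex metacharacters inside the allowed phrases.
    pattern = re.compile(
        r"\b(?:" + "|".join(re.escape(p) for p in ordered) + r")\b"
    )
    # dict.fromkeys removes duplicates while preserving order.
    return list(dict.fromkeys(m.group(0) for m in pattern.finditer(text)))

allowed = ["Stack", "Overflow", "Stack Overflow", "Stack Exchange", "Exchange"]
text = ("Hello, and welcome to Stack Overflow.\n"
        " Here are some words which should match: Stack, Exchange.")
print(find_allowed(text, allowed))
```

On the example from the question this yields 'Stack Overflow', 'Stack' and 'Exchange'. A single alternation of about a thousand phrases is usually workable, though for much larger lists a different structure (such as a trie-based matcher) may be a better fit.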

