Problem description
How can I extract person names from the text?
I have tried some NLP toolkits for this; specifically, I used the Stanford NER toolkit to extract names from text. With it I can extract person names, but when I want the program to extract words like 'programmer', 'lecturer' or 'engineer', the library does not extract them. Is there any way to extract these from the text?
Recommended answer
Since "programmer", "lecturer" and "engineer" are not named entities, you may have to maintain a list of those words. I think you can obtain them from the derivational relations between words in WordNet, such as "sing" (verb) and "singer" (noun), or "lecture" (verb) and "lecturer" (noun).
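The list-based approach can be sketched roughly as follows. This is only a minimal illustration using the Python standard library, not the Stanford NER or WordNet API: the `ROLE_WORDS` set and the `find_role_words` helper are hypothetical names, the list is hand-written rather than derived from WordNet, and the plural handling is deliberately naive.

```python
import re

# Hypothetical, hand-maintained list of role nouns; in practice it could be
# extended with words collected from WordNet's derivational relations.
ROLE_WORDS = {"programmer", "lecturer", "engineer", "singer"}


def find_role_words(text, role_words=ROLE_WORDS):
    """Return (word, start_offset) pairs for each listed role word in text."""
    matches = []
    for m in re.finditer(r"[A-Za-z]+", text):
        token = m.group().lower()
        # Naive plural handling: drop a trailing "s" if the singular is listed.
        if token not in role_words and token.endswith("s") and token[:-1] in role_words:
            token = token[:-1]
        if token in role_words:
            matches.append((token, m.start()))
    return matches


sample = "Alice is a programmer, and Bob met two lecturers and an Engineer."
print(find_role_words(sample))
```

The offsets make it possible to merge these matches with the spans returned by a separate NER pass, so person names and role words can be reported together.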
A SuperSense tagger can also be used as a kind of NER; I think it can tag the words you mentioned as "noun.person", which is what you need. ArkRef (Java) is a coreference tool that uses one (through a bundled Java port of the SuperSense tagger), and it has an online demo, so you can check whether your target words are tagged in square brackets.