Problem description
Inconsistent POS tagging results between
P: http://nlp.stanford.edu:8080/parser/
and
C: http://nlp.stanford.edu:8080/corenlp/process
For example,
C: We went east/JJ to Oslo. P: We went east/RB to Oslo.
C: We are all/DT getting older. P: We are all/RB getting older.
C: Are you getting excited/VBN about your vacation? P: Are you getting excited/JJ about your vacation?
C: Did you do/VBP that? P: Did you do/VB that?
The parser seems to perform better than CoreNLP, but I cannot reproduce the parser's results by switching between the models provided in the CoreNLP zip file.
Any ideas?
Recommended answer
You will get different part-of-speech tags depending on which of these pipelines you use:
tokenize,ssplit,pos,lemma,parse
vs.
tokenize,ssplit,parse
The latter performs part-of-speech tagging as part of the parsing process. The former uses the MEMM sequence tagging model that is dedicated to part-of-speech tagging.
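One way to see the difference for yourself is to run both annotator lists on the same sentence and compare the tags. The following is only a minimal sketch, not an official client: it assumes a CoreNLP server is running locally at http://localhost:9000/ (not the demo URLs from the question), that its JSON output has the "sentences" / "tokens" / "word" / "pos" layout, and that tokens still carry a "pos" field when only the parser assigns tags. The helper names (build_url, extract_tags, tag_text, diff_tags) are hypothetical and exist only for this illustration.

```python
import json
import urllib.parse
import urllib.request

# The two annotator lists from the answer above.
TAGGER_PIPELINE = "tokenize,ssplit,pos,lemma,parse"
PARSER_PIPELINE = "tokenize,ssplit,parse"

# Assumption: a CoreNLP server is running locally on port 9000.
SERVER_URL = "http://localhost:9000/"


def build_url(annotators, server_url=SERVER_URL):
    """Return the request URL that selects the given annotators."""
    props = json.dumps({"annotators": annotators, "outputFormat": "json"})
    return server_url + "?" + urllib.parse.urlencode({"properties": props})


def extract_tags(response):
    """Collect (word, tag) pairs from a parsed JSON response.

    Assumption: the response has the shape
    {"sentences": [{"tokens": [{"word": ..., "pos": ...}, ...]}, ...]}.
    """
    return [
        (token["word"], token["pos"])
        for sentence in response["sentences"]
        for token in sentence["tokens"]
    ]


def tag_text(text, annotators, server_url=SERVER_URL):
    """POST the text to the server and return its (word, tag) pairs."""
    request = urllib.request.Request(
        build_url(annotators, server_url), data=text.encode("utf-8")
    )
    with urllib.request.urlopen(request) as reply:
        return extract_tags(json.load(reply))


def diff_tags(first, second):
    """Return (word, first_tag, second_tag) wherever two runs disagree.

    Both runs are assumed to share the same tokenization.
    """
    return [
        (word, tag_a, tag_b)
        for (word, tag_a), (_, tag_b) in zip(first, second)
        if tag_a != tag_b
    ]
```

With a server available, calling diff_tags(tag_text(s, TAGGER_PIPELINE), tag_text(s, PARSER_PIPELINE)) on a sentence s such as "We went east to Oslo." would list the tokens on which the dedicated tagger and the parser disagree. Whether the result matches the C and P outputs in the question depends on which models the server loads.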