python - Stanford NER not tagging dates and times

I am using the Stanford NER tagger in Python. It does not tag dates and times; instead it returns O for every word.
My sentence is:

"What sum of money will earn an interest of $162 in 3 years at the rate of 12% per annum"

The result I get after tagging is:

[('What', 'O'), ('sum', 'O'), ('of', 'O'), ('money', 'O'), ('will', 'O'), ('earn', 'O'), ('an', 'O'), ('interest', 'O'), ('of', 'O'), ('$', 'O'), ('162', 'O'), ('in', 'O'), ('3', 'O'), ('years', 'O'), ('at', 'O'), ('the', 'O'), ('rate', 'O'), ('of', 'O'), ('12%', 'O'), ('per', 'O'), ('annum', 'O')]

How can I fix this?

Best answer

1. Download and install stanza, the Stanford NLP Group's Python library.

GitHub: https://github.com/stanfordnlp/stanza

2. Using Stanford CoreNLP 3.7.0, start the server:

Command: java -Xmx4g edu.stanford.nlp.pipeline.StanfordCoreNLPServer -port 9000 -timeout 15000

Stanford CoreNLP 3.7.0: https://stanfordnlp.github.io/CoreNLP/download.html

(Note: make sure your CLASSPATH contains all the jars in the download folder.)

3. Send a request to the Java Stanford CoreNLP server started in step 2:

from stanza.nlp.corenlp import CoreNLPClient

# Connect to the CoreNLP server that is already running on port 9000.
client = CoreNLPClient(server='http://localhost:9000', default_annotators=['ssplit', 'tokenize', 'lemma', 'pos', 'ner'])

annotated = client.annotate("..text to annotate...")

# Print the tokens and the NER tags of each sentence.
for sentence in annotated.sentences:
    print("---")
    print(sentence.tokens)
    print(sentence.ner_tags)
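
If the stanza client is not an option, the same request can probably be made with nothing but the standard library, because the CoreNLP server takes the text as the POST body and the annotator settings as a JSON "properties" URL parameter. The following is only a minimal sketch under that assumption: it expects the server to be listening on localhost:9000, and the function names annotate and ner_pairs are made up for illustration, not part of any library.

```python
import json
import urllib.parse
import urllib.request


def annotate(text, server="http://localhost:9000"):
    """POST text to a running CoreNLP server and return the parsed JSON reply."""
    # Assumption: the server understands these annotators and JSON output.
    props = {"annotators": "tokenize,ssplit,pos,lemma,ner", "outputFormat": "json"}
    url = server + "/?" + urllib.parse.urlencode({"properties": json.dumps(props)})
    request = urllib.request.Request(url, data=text.encode("utf-8"))
    with urllib.request.urlopen(request, timeout=15) as response:
        return json.loads(response.read().decode("utf-8"))


def ner_pairs(annotation):
    """Flatten the JSON reply into a list of (word, NER tag) pairs."""
    return [
        (token["word"], token["ner"])
        for sentence in annotation["sentences"]
        for token in sentence["tokens"]
    ]
```

Calling ner_pairs(annotate("What sum of money will earn an interest of $162 in 3 years at the rate of 12% per annum")) against a running server should give a list in the same (word, tag) shape as the one in the question, which makes it easy to check whether the money, duration and percent tokens still come back as O.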

We are working on having the Python library handle starting and stopping the Stanford CoreNLP 3.8.0 server.
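
Until the library does that, starting and stopping the server from Python can be sketched with subprocess. This is an assumption-laden sketch, not the library's own mechanism: server_command and start_server are hypothetical helper names, and corenlp_dir stands for wherever the CoreNLP download was unzipped. It passes the folder to java with the classpath wildcard instead of relying on CLASSPATH.

```python
import os
import subprocess


def server_command(corenlp_dir, port=9000, timeout_ms=15000):
    """Build the java command line that launches the CoreNLP server."""
    # "<dir>/*" is Java's classpath wildcard: it picks up every jar in the folder.
    # No shell is involved, so the asterisk reaches java unexpanded.
    classpath = os.path.join(corenlp_dir, "*")
    return [
        "java", "-Xmx4g", "-cp", classpath,
        "edu.stanford.nlp.pipeline.StanfordCoreNLPServer",
        "-port", str(port), "-timeout", str(timeout_ms),
    ]


def start_server(corenlp_dir, port=9000, timeout_ms=15000):
    """Start the server as a child process and return its Popen handle."""
    return subprocess.Popen(server_command(corenlp_dir, port, timeout_ms))
```

start_server returns a Popen handle; calling terminate() and then wait() on it stops the server again. The server needs a few seconds to load its models, so a request sent immediately after start_server may be refused.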

Regarding python - Stanford NER not tagging dates and times, we found a similar question on Stack Overflow: https://stackoverflow.com/questions/43533701/