This article covers how to handle the Python NLTK Stanford NER tagger error "NLTK was unable to find the java file".

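Problem description

I am trying to get Stanford NER working with Python. I followed some instructions on the web but got the error message: "NLTK was unable to find the java file! Use software specific configuration paramaters or set the JAVAHOME environment variable." What is wrong? Thank you!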

from nltk.tag.stanford import StanfordNERTagger
from nltk.tokenize import word_tokenize

model = r'C:\Stanford\NER\classifiers\english.muc.7class.distsim.crf.ser.gz'
jar = r'C:\Stanford\NER\stanford-ner-3.9.1.jar'

ner_tagger = StanfordNERTagger(model, jar, encoding='utf-8')

text = 'While in France, Christine Lagarde discussed short-term stimulus ' \
       'efforts in a recent interview with the Wall Street Journal.'

words = word_tokenize(text)
classified_words = ner_tagger.tag(words)

Solution

I found the solution on the web. Replace the path with your own, and run either snippet before calling the tagger.

import os
java_path = "C:/../../jdk1.8.0_101/bin/java.exe"
os.environ['JAVAHOME'] = java_path

or:

import nltk
nltk.internals.config_java('C:/../../jdk1.8.0_101/bin/java.exe')

Source: https://tianyouhu.wordpress.com/2016/09/01/problem-of-nltk-with-stanfordtokenizer/
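
If you would rather not hard-code the JDK path, the same idea can be sketched with a small lookup. This is only an illustration, not part of the original answer: resolve_java is a hypothetical helper (it is not an NLTK function), and checking JAVA_HOME and the PATH in addition to JAVAHOME is an assumption of this sketch. It only sets the JAVAHOME variable that the error message names.

```python
import os
import shutil


def resolve_java(env=None, which=shutil.which):
    """Hypothetical helper: return a path to a java executable, or None.

    Lookup order (an assumption of this sketch): JAVAHOME, JAVA_HOME, PATH.
    """
    env = os.environ if env is None else env
    for var in ("JAVAHOME", "JAVA_HOME"):
        value = env.get(var)
        if not value:
            continue
        # The variable may name the executable itself or a JDK/JRE directory.
        candidates = [value]
        for name in ("java", "java.exe"):
            candidates.append(os.path.join(value, name))
            candidates.append(os.path.join(value, "bin", name))
        for candidate in candidates:
            if os.path.isfile(candidate):
                return candidate
    # Fall back to whatever "java" resolves to on the PATH, if anything.
    return which("java")


java_path = resolve_java()
if java_path is None:
    print("No Java found; install a JDK/JRE or set JAVAHOME by hand.")
else:
    # Same variable the answer above sets, just without a hard-coded path.
    os.environ["JAVAHOME"] = java_path
    print("Using Java at", java_path)
```

If this prints that no Java was found, the fix is to install Java or point JAVAHOME at it manually as shown above; the sketch cannot help when Java is not installed at all.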



08-13 19:13