Problem description
I am trying to build a custom NER model using Apache OpenNLP 1.7. Based on the available documentation, I have developed the following code:
import java.io.BufferedOutputStream;
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.IOException;
import java.nio.charset.Charset;
import opennlp.tools.namefind.NameFinderME;
import opennlp.tools.namefind.NameSample;
import opennlp.tools.namefind.NameSampleDataStream;
import opennlp.tools.namefind.TokenNameFinderFactory;
import opennlp.tools.namefind.TokenNameFinderModel;
import opennlp.tools.util.ObjectStream;
import opennlp.tools.util.PlainTextByLineStream;
import opennlp.tools.util.TrainingParameters;
public class PersonClassifierTrainer {

    static String modelFile = "/opt/NLP/data/en-ner-customperson.bin";

    public static void main(String[] args) throws IOException {
        Charset charset = Charset.forName("UTF-8");
        **ObjectStream<String> lineStream = new PlainTextByLineStream(new FileInputStream("/opt/NLP/data/person.train"), charset);**
        ObjectStream<NameSample> sampleStream = new NameSampleDataStream(lineStream);
        TokenNameFinderModel model;
        TokenNameFinderFactory nameFinderFactory = null;
        try {
            model = NameFinderME.train("en", "person", sampleStream, TrainingParameters.defaultParams(),
                    nameFinderFactory);
        } finally {
            sampleStream.close();
        }
        BufferedOutputStream modelOut = null;
        try {
            modelOut = new BufferedOutputStream(new FileOutputStream(modelFile));
            model.serialize(modelOut);
        } finally {
            if (modelOut != null)
                modelOut.close();
        }
    }
}
The line highlighted above shows the hint: Cast argument 'file' to 'InputStreamFactory'.
I am forced to add this cast, because it shows an error otherwise.
Now when I run my code, I get the following error:
java.io.FileInputStream cannot be cast to opennlp.tools.util.InputStreamFactory
Is there anything missing here?
Edit 1: The person.train file contains this data:
<START:person> Hardik <END> is a software Professional.<START:person> Hardik works at company<END> and <START:person> is part of development team<END>. <START:person> Hardik<END> lives in New York
<START:person> Hardik<END> loves R statistical software
<START:person> Hardik<END> is a student at ISB
<START:person> Hardik<END> loves nature
Edit 2: I am now getting a NullPointerException. Any help?
Recommended answer
You need an instance of InputStreamFactory, which supplies your InputStream. Additionally, the TokenNameFinderFactory must not be null.
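import java.io.BufferedOutputStream;
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.Charset;
import opennlp.tools.namefind.NameFinderME;
import opennlp.tools.namefind.NameSample;
import opennlp.tools.namefind.NameSampleDataStream;
import opennlp.tools.namefind.TokenNameFinderFactory;
import opennlp.tools.namefind.TokenNameFinderModel;
import opennlp.tools.util.InputStreamFactory;
import opennlp.tools.util.ObjectStream;
import opennlp.tools.util.PlainTextByLineStream;
import opennlp.tools.util.TrainingParameters;

public class PersonClassifierTrainer {

    static String modelFile = "/opt/NLP/data/en-ner-customperson.bin";

    public static void main(String[] args) throws IOException {
        // Wrap the file in a factory instead of passing the FileInputStream directly.
        InputStreamFactory isf = new InputStreamFactory() {
            public InputStream createInputStream() throws IOException {
                return new FileInputStream("/opt/NLP/data/person.train");
            }
        };
        Charset charset = Charset.forName("UTF-8");
        ObjectStream<String> lineStream = new PlainTextByLineStream(isf, charset);
        ObjectStream<NameSample> sampleStream = new NameSampleDataStream(lineStream);
        TokenNameFinderModel model;
        // The factory must be a real instance; passing null leads to the NullPointerException.
        TokenNameFinderFactory nameFinderFactory = new TokenNameFinderFactory();
        try {
            model = NameFinderME.train("en", "person", sampleStream, TrainingParameters.defaultParams(),
                    nameFinderFactory);
        } finally {
            sampleStream.close();
        }
        // Write the trained model to disk.
        BufferedOutputStream modelOut = null;
        try {
            modelOut = new BufferedOutputStream(new FileOutputStream(modelFile));
            model.serialize(modelOut);
        } finally {
            if (modelOut != null)
                modelOut.close();
        }
    }
}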
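To illustrate why the cast from the question compiles but fails at runtime, and what a factory offers that a plain stream does not, here is a minimal sketch that runs without OpenNLP. StreamFactory is a hypothetical stand-in declared locally with the same shape as opennlp.tools.util.InputStreamFactory; the class name StreamFactoryDemo and the helpers readOnce and castFails are likewise assumptions made for this illustration, not part of OpenNLP.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;

public class StreamFactoryDemo {

    // Hypothetical stand-in with the same shape as OpenNLP's InputStreamFactory,
    // declared here so the sketch compiles without OpenNLP on the classpath.
    interface StreamFactory {
        InputStream createInputStream() throws IOException;
    }

    // Asks the factory for a stream, drains it, and returns the content as UTF-8 text.
    static String readOnce(StreamFactory factory) {
        try (InputStream in = factory.createInputStream();
             ByteArrayOutputStream out = new ByteArrayOutputStream()) {
            byte[] buffer = new byte[1024];
            int n;
            while ((n = in.read(buffer)) != -1) {
                out.write(buffer, 0, n);
            }
            return new String(out.toByteArray(), StandardCharsets.UTF_8);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Returns true when casting the object to StreamFactory throws ClassCastException.
    static boolean castFails(Object candidate) {
        try {
            StreamFactory ignored = (StreamFactory) candidate;
            return false;
        } catch (ClassCastException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        byte[] data = "<START:person> Hardik <END> loves nature".getBytes(StandardCharsets.UTF_8);

        // A stream is not a factory: the compiler accepts the cast, but it fails when executed.
        System.out.println(castFails(new ByteArrayInputStream(data)));

        // A factory returns a fresh stream on every call, so the same data can be read again.
        StreamFactory factory = () -> new ByteArrayInputStream(data);
        System.out.println(readOnce(factory));
        System.out.println(readOnce(factory));
    }
}
```

The second and third print statements read the same data twice through the factory, which a single already-consumed stream cannot offer. As far as I understand, this is why PlainTextByLineStream asks for a factory: it can reopen the input when the stream is reset. Since the real InputStreamFactory also declares a single method, the anonymous class in the answer can presumably be written as a lambda on Java 8 or later.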