I am trying to use the CRFClassifier from the Stanford NLP library.
The trained model is supposed to be in the .ser file, but when I pass the deserialized object to the CRFClassifier, I get this error:
java.lang.NoSuchFieldError: maxAdditionalKnownLCWords
Here is what I have tried. I also tried using the properties file provided in the same directory; I get the same error whether or not I pass the prop file:
import edu.stanford.nlp.process.*;
import java.util.Collection;
import edu.stanford.nlp.ling.*;
import java.util.List;
import java.io.*;
import edu.stanford.nlp.io.*;
import edu.stanford.nlp.ie.*;
import edu.stanford.nlp.ie.crf.*;
import java.util.*;

public class StanfordParserTest {

    public static void main(String[] args) {
        String propfile = "/Users/--------/Documents/Programming/Java/stanford-ner-2015-12-09/classifiers/english.all.3class.distsim.prop";
        FileReader p_file_reader = null;
        Properties prop = new Properties();
        try {
            p_file_reader = new FileReader(propfile);
        } catch (FileNotFoundException e) {
            e.printStackTrace();
        }
        if (p_file_reader != null) {
            try {
                prop.load(p_file_reader);
                p_file_reader.close();
            } catch (IOException e) {
                e.printStackTrace();
            }
        }

        ObjectInputStream o_in = null;
        String serializedClassifier = "/Users/--------/Documents/Programming/Java/stanford-ner-2015-12-09/classifiers/english.all.3class.distsim.crf.ser";
        try {
            FileInputStream f_in = new FileInputStream(serializedClassifier);
            o_in = new ObjectInputStream(f_in);
            f_in.close();
        } catch (FileNotFoundException e) {
            e.printStackTrace();
        } catch (IOException e) {
            e.printStackTrace();
        }

        System.out.println(o_in);
        System.out.println(prop);

        AbstractSequenceClassifier<CoreLabel> classifier = null;
        try {
            classifier = CRFClassifier.getClassifier(o_in, prop);
        } catch (ClassNotFoundException e) {
            e.printStackTrace();
        } catch (IOException e) {
            e.printStackTrace();
        }
        System.out.println(classifier);
    }
}
Here is the output:
java.io.ObjectInputStream@6ff3c5b5
{useDisjunctive=true, useSequences=true, serializeTo=english.all.3class.distsim.crf.ser.gz, useOccurrencePatterns=true, unknownWordDistSimClass=0, useClassFeature=true, testFile=/u/nlp/data/ner/column_data/all.3class.test, useQN=true, useTypeSeqs=true, usePrevSequences=true, featureDiffThresh=0.05, wordFunction=edu.stanford.nlp.process.AmericanizeFunction, distSimLexicon=/u/nlp/data/pos_tags_are_useless/egw4-reut.512.clusters, wordShape=chris2useLC, usePrev=true, maxLeft=1, useNextRealWord=true, useTypeSeqs2=true, map=word=0,answer=1, disjunctionWidth=5, useWord=true, QNsize=25, useLastRealWord=true, numberEquivalenceDistSim=true, useDistSim=true, useNGrams=true, saveFeatureIndexToDisk=true, useLongSequences=true, useObservedSequencesOnly=true, readerAndWriter=edu.stanford.nlp.sequences.ColumnDocumentReaderAndWriter, maxNGramLeng=6, normalize=true, trainFileList=/u/nlp/data/ner/column_data/ace23.3class.train,/u/nlp/data/ner/column_data/muc6.3class.ptb.train,/u/nlp/data/ner/column_data/muc7.3class.ptb.train,/u/nlp/data/ner/column_data/conll.3class.train,/u/nlp/data/ner/column_data/wikiner.3class.train,/u/nlp/data/ner/column_data/ontonotes.3class.train,/u/nlp/data/ner/column_data/english.extra.3class.train, useNext=true, noMidNGrams=true, useTypeySequences=true, type=crf}
Exception in thread "main" java.lang.NoSuchFieldError: maxAdditionalKnownLCWords
at edu.stanford.nlp.ie.AbstractSequenceClassifier.reinit(AbstractSequenceClassifier.java:185)
at edu.stanford.nlp.ie.AbstractSequenceClassifier.<init>(AbstractSequenceClassifier.java:152)
at edu.stanford.nlp.ie.crf.CRFClassifier.<init>(CRFClassifier.java:174)
at edu.stanford.nlp.ie.crf.CRFClassifier.getClassifier(CRFClassifier.java:2967)
at StanfordParserTest.main(StanfordParserTest.java:66)
Does anyone know what is going wrong here?
Best answer
See the code provided in NERDemo.java for how to load a CRFClassifier programmatically.
If run inside the distribution directory, these commands should work:
javac -cp "*" NERDemo.java
java -mx400m -cp "*:.:lib/*" NERDemo classifiers/english.all.3class.distsim.crf.ser.gz sample.txt
In general, make sure your CLASSPATH uses only the current jars from that distribution directory. Stale jars on the CLASSPATH can cause errors exactly like this one: a NoSuchFieldError typically means a class was loaded at runtime from an older jar that lacks a field the newer code was compiled against.
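If you suspect a stale jar, one way to check is to ask the JVM where a class was actually loaded from. The helper below is illustrative, not part of the Stanford API; with the NER jars on the classpath you would pass "edu.stanford.nlp.ie.crf.CRFClassifier" to Class.forName, and java.util.Properties is used here only so the sketch runs stand-alone:

```java
public class WhichJar {

    // Returns the jar or directory a class was loaded from,
    // or "bootstrap/JDK" for core classes that have no code source.
    static String origin(Class<?> c) {
        java.security.CodeSource cs = c.getProtectionDomain().getCodeSource();
        return cs == null ? "bootstrap/JDK" : cs.getLocation().toString();
    }

    public static void main(String[] args) throws Exception {
        // Substitute "edu.stanford.nlp.ie.crf.CRFClassifier" once the
        // Stanford NER jars are on the classpath.
        System.out.println(origin(Class.forName("java.util.Properties")));
    }
}
```

If the printed location is not the jar shipped with the current distribution, an older Stanford jar is shadowing it and should be removed from the CLASSPATH.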
With the correct CLASSPATH, this should work:
String serializedClassifier = "classifiers/english.all.3class.distsim.crf.ser.gz";
AbstractSequenceClassifier<CoreLabel> classifier = CRFClassifier.getClassifier(serializedClassifier);
and it will deserialize the model shipped in the classifiers folder of the current distribution.
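Putting it together, a minimal loader in the style of NERDemo might look like the sketch below. The relative path and the sample sentence are assumptions; adjust them to your setup. Note that getClassifier(String) handles the gzip decompression and deserialization itself, so there is no need to open an ObjectInputStream manually as in the question:

```java
import edu.stanford.nlp.ie.AbstractSequenceClassifier;
import edu.stanford.nlp.ie.crf.CRFClassifier;
import edu.stanford.nlp.ling.CoreLabel;

public class LoadNer {
    public static void main(String[] args) throws Exception {
        // Path relative to the stanford-ner distribution directory
        // (an assumption; use an absolute path if you run from elsewhere).
        String serializedClassifier = "classifiers/english.all.3class.distsim.crf.ser.gz";

        // Loads the model directly from the .ser.gz file.
        AbstractSequenceClassifier<CoreLabel> classifier =
                CRFClassifier.getClassifier(serializedClassifier);

        // Tag a sample sentence; classifyToString returns word/tag pairs.
        System.out.println(classifier.classifyToString("John Smith works at Stanford University."));
    }
}
```

Run it with the distribution's jars on the classpath, e.g. java -cp "*:." LoadNer from inside the distribution directory, so that only the current jars are visible to the JVM.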