I'm trying to implement "dictation" using PocketSphinx on Android and one of Keith Vertanen's language models. I modified the sample as follows:
private void setupRecognizer(File assetsDir) throws IOException {
    recognizer = defaultSetup()
            .setAcousticModel(new File(assetsDir, "en-us-ptm"))
            .setDictionary(new File(assetsDir, "cmudict-en-us.dict"))
            .setRawLogDir(assetsDir)
            .setKeywordThreshold(1e-45f)
            .setBoolean("-allphone_ci", true)
            .getRecognizer();
    recognizer.addListener(this);

    File ngramModel = new File(assetsDir, "lm_csr_5k_nvp_2gram.arpa");
    recognizer.addNgramSearch(NGRAM_SEARCH, ngramModel);
}
where lm_csr_5k_nvp_2gram.arpa comes from the 5K NVP 2-gram download on Keith Vertanen's site. I get this error:
01-31 18:04:29.861 2837-2863/? I/SpeechRecognizer: Load N-gram model /storage/emulated/0/Android/data/edu.cmu.sphinx.pocketsphinx/files/sync/lm_csr_5k_nvp_2gram.arpa
01-31 18:04:29.861 2837-2863/? I/cmusphinx: INFO: ngram_model_trie.c(399): Trying to read LM in trie binary format
01-31 18:04:29.861 2837-2863/? I/cmusphinx: INFO: ngram_model_trie.c(410): Header doesn't match
01-31 18:04:29.861 2837-2863/? I/cmusphinx: INFO: ngram_model_trie.c(177): Trying to read LM in arpa format
01-31 18:04:29.862 2837-2863/? E/cmusphinx: ERROR: "ngram_model_trie.c", line 103: Bad ngram count
01-31 18:04:29.862 2837-2863/? I/cmusphinx: INFO: ngram_model_trie.c(489): Trying to read LM in DMP format
01-31 18:04:29.862 2837-2863/? E/cmusphinx: ERROR: "ngram_model_trie.c", line 500: Wrong magic header size number a5c6461: /storage/emulated/0/Android/data/edu.cmu.sphinx.pocketsphinx/files/sync/lm_csr_5k_nvp_2gram.arpa is not a dump file
01-31 18:04:29.864 2837-2863/? E/AndroidRuntime: FATAL EXCEPTION: AsyncTask #1
Process: edu.cmu.sphinx.pocketsphinx, PID: 2837
java.lang.RuntimeException: An error occurred while executing doInBackground()
at android.os.AsyncTask$3.done(AsyncTask.java:309)
at java.util.concurrent.FutureTask.finishCompletion(FutureTask.java:354)
at java.util.concurrent.FutureTask.setException(FutureTask.java:223)
at java.util.concurrent.FutureTask.run(FutureTask.java:242)
at android.os.AsyncTask$SerialExecutor$1.run(AsyncTask.java:234)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1113)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:588)
at java.lang.Thread.run(Thread.java:818)
Caused by: java.lang.RuntimeException: Decoder_setLmFile returned -1
at edu.cmu.pocketsphinx.PocketSphinxJNI.Decoder_setLmFile(Native Method)
at edu.cmu.pocketsphinx.Decoder.setLmFile(Decoder.java:172)
at edu.cmu.pocketsphinx.SpeechRecognizer.addNgramSearch(SpeechRecognizer.java:247)
at edu.cmu.pocketsphinx.demo.PocketSphinxActivity.setupRecognizer(PocketSphinxActivity.java:161)
at edu.cmu.pocketsphinx.demo.PocketSphinxActivity.access$000(PocketSphinxActivity.java:50)
at edu.cmu.pocketsphinx.demo.PocketSphinxActivity$1.doInBackground(PocketSphinxActivity.java:72)
at edu.cmu.pocketsphinx.demo.PocketSphinxActivity$1.doInBackground(PocketSphinxActivity.java:66)
at android.os.AsyncTask$2.call(AsyncTask.java:295)
at java.util.concurrent.FutureTask.run(FutureTask.java:237)
at android.os.AsyncTask$SerialExecutor$1.run(AsyncTask.java:234)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1113)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:588)
at java.lang.Thread.run(Thread.java:818)
The lines
01-31 18:04:29.861 2837-2863/? I/cmusphinx: INFO: ngram_model_trie.c(177): Trying to read LM in arpa format
01-31 18:04:29.862 2837-2863/? E/cmusphinx: ERROR: "ngram_model_trie.c", line 103: Bad ngram count
make me think that the file lm_csr_5k_nvp_2gram.arpa is malformed in some way. The file looks like this:
\data\
ngram 1=5000
ngram 2=4331397
ngram 3=0
\1-grams:
-2.11154 </s> 0
-99 <s> -3.13167
-0.3954594 <unk> -0.4365645
-2.271447 a -2.953606
-3.384721 a. -1.85196
-5.788997 a.'s -0.8137056
-4.139672 abandoned -0.9728376
-3.904189 ability -1.838658
-4.360272 able -2.161723
...
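It at least looks like the example file here.
My only other idea is that the extension might be wrong, since this says:
Language models can be stored and loaded in three different formats: text ARPA format, binary BIN format and binary DMP format. The ARPA format takes more space, but it can be edited. ARPA files have the .lm extension. The binary format takes significantly less space and loads faster. Binary files have the .lm.bin extension. It is also possible to convert between formats. The DMP format is obsolete and not recommended.
That sounds as if the file should be named lm_csr_5k_nvp_2gram.lm rather than lm_csr_5k_nvp_2gram.arpa. However, I did try renaming the file, and the exception did not change. What is the right way to do this?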
Best answer
Well, this is a problem with the model format. This line in the n-gram model causes the problem:
ngram 3=0
You can either delete the offending line or update the pocketsphinx-android demo; I have just pushed a new version in which this problem is fixed.
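If you would rather not edit a file with several million lines by hand, deleting the offending line can be scripted. The following is only a rough sketch, not part of PocketSphinx: the class name ArpaHeaderFix and the method dropZeroCountOrders are made up for illustration, and it assumes the only problem is a zero-count entry such as "ngram 3=0" in the \data\ header. It is meant to be run on a desktop machine before the model is copied into the app's assets, since it loads the whole file into memory.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Pattern;

// Hypothetical helper (not a PocketSphinx class): strips zero-count
// entries such as "ngram 3=0" from the \data\ header of an ARPA file.
public class ArpaHeaderFix {
    // Matches a header count line whose count is exactly 0.
    private static final Pattern ZERO_COUNT =
            Pattern.compile("^ngram\\s+\\d+\\s*=\\s*0\\s*$");

    /** Returns a copy of the ARPA lines without zero-count header entries. */
    public static List<String> dropZeroCountOrders(List<String> lines) {
        List<String> out = new ArrayList<>();
        boolean inHeader = false;
        for (String line : lines) {
            String t = line.trim();
            if (t.equals("\\data\\")) {
                inHeader = true;            // header starts at \data\
            } else if (t.startsWith("\\")) {
                inHeader = false;           // \1-grams: etc. ends the header
            }
            if (inHeader && ZERO_COUNT.matcher(t).matches()) {
                continue;                   // skip e.g. "ngram 3=0"
            }
            out.add(line);
        }
        return out;
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 2) {
            System.err.println("usage: ArpaHeaderFix <in.arpa> <out.arpa>");
            return;
        }
        Path in = Paths.get(args[0]);
        Path out = Paths.get(args[1]);
        // ISO-8859-1 maps every byte to one char, so the words in the model
        // pass through unchanged whatever encoding the file really uses.
        List<String> lines = Files.readAllLines(in, StandardCharsets.ISO_8859_1);
        Files.write(out, dropZeroCountOrders(lines), StandardCharsets.ISO_8859_1);
    }
}
```

Only lines between \data\ and the first \N-grams: marker are considered, so the n-gram entries themselves are copied through untouched.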
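In general, dictation on a phone is not trivial, because phones are really slow. I would not recommend a 2-gram model; a heavily pruned 3-gram model is a better choice. You can do the pruning with SRILM.
You can also read the optimization doc to learn about other tuning options.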