Pocketsphinx - perfecting hot-word detection

Problem description

I've revisited CMU Sphinx recently and attempted to set up a basic hot-word detector for Android, starting from the tutorial and adapting the sample application.

I'm having various issues which I've been unable to resolve, despite delving deep into their documentation, until I can read no more...

In order to replicate them, I made a basic project designed to detect the key-phrases "wakeup you" and "wakeup me".

My dictionary:

me M IY
wakeup W EY K AH P
you Y UW

My language model:

\data\
ngram 1=5
ngram 2=5
ngram 3=4

\1-grams:
-0.9031 </s> -0.3010
-0.9031 <s> -0.2430
-1.2041 me -0.2430
-0.9031 wakeup -0.2430
-1.2041 you -0.2430

\2-grams:
-0.3010 <s> wakeup 0.0000
-0.3010 me </s> -0.3010
-0.6021 wakeup me 0.0000
-0.6021 wakeup you 0.0000
-0.3010 you </s> -0.3010

\3-grams:
-0.6021 <s> wakeup me
-0.6021 <s> wakeup you
-0.3010 wakeup me </s>
-0.3010 wakeup you </s>

\end\

Both of the above were created using the suggested tool.
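As a quick sanity check on a hand-edited model like the one above, the entry counts declared in the \data\ header can be compared with the number of lines in each n-gram section. The following is only a minimal sketch: ArpaCountCheck is a hypothetical helper, not part of pocketsphinx, and it assumes the simple ARPA layout shown here.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical helper (not part of pocketsphinx): compares ARPA header counts with section sizes. */
public class ArpaCountCheck {

    /** Returns a map from n-gram order to {declared count, counted entries}. */
    public static Map<Integer, int[]> count(List<String> lines) {
        Map<Integer, int[]> counts = new LinkedHashMap<>();
        int current = 0; // 0 while we are outside any "\N-grams:" section
        for (String raw : lines) {
            String line = raw.trim();
            if (line.isEmpty() || line.equals("\\data\\")) continue;
            if (line.equals("\\end\\")) break;
            if (line.startsWith("ngram ")) {
                // Header line such as "ngram 1=5": order on the left, declared count on the right.
                String[] kv = line.substring(6).split("=");
                int order = Integer.parseInt(kv[0].trim());
                counts.computeIfAbsent(order, k -> new int[2])[0] = Integer.parseInt(kv[1].trim());
            } else if (line.startsWith("\\") && line.endsWith("-grams:")) {
                // Section marker such as "\2-grams:".
                current = Integer.parseInt(line.substring(1, line.indexOf('-')));
            } else if (current > 0) {
                // Any other line inside a section is one n-gram entry.
                counts.computeIfAbsent(current, k -> new int[2])[1]++;
            }
        }
        return counts;
    }

    public static void main(String[] args) {
        List<String> model = Arrays.asList(
                "\\data\\", "ngram 1=5", "ngram 2=5", "ngram 3=4", "",
                "\\1-grams:",
                "-0.9031 </s> -0.3010", "-0.9031 <s> -0.2430", "-1.2041 me -0.2430",
                "-0.9031 wakeup -0.2430", "-1.2041 you -0.2430", "",
                "\\2-grams:",
                "-0.3010 <s> wakeup 0.0000", "-0.3010 me </s> -0.3010", "-0.6021 wakeup me 0.0000",
                "-0.6021 wakeup you 0.0000", "-0.3010 you </s> -0.3010", "",
                "\\3-grams:",
                "-0.6021 <s> wakeup me", "-0.6021 <s> wakeup you",
                "-0.3010 wakeup me </s>", "-0.3010 wakeup you </s>", "",
                "\\end\\");
        for (Map.Entry<Integer, int[]> e : count(model).entrySet()) {
            int[] c = e.getValue();
            System.out.println(e.getKey() + "-grams: declared " + c[0] + ", found " + c[1]);
        }
    }
}
```

A mismatch between the declared and counted values would point at a truncated or mis-edited file rather than at the recognizer itself.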

And my key-phrases file:

wakeup you /1e-20/
wakeup me /1e-20/
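To make the file layout explicit: each line holds one phrase followed by its own threshold between slashes, and every word in a phrase needs an entry in the dictionary. The sketch below reads that layout and cross-checks it against the dictionary lines. KeyphraseCheck is a hypothetical helper written for illustration, not a pocketsphinx class, and it assumes the simple one-phrase-per-line format shown above.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical helper (not part of pocketsphinx): sanity-checks a key-phrase list. */
public class KeyphraseCheck {

    // One entry per line: the phrase, then its threshold between slashes.
    private static final Pattern LINE = Pattern.compile("^(.+?)\\s*/([^/]+)/$");

    /** Maps each phrase to its threshold, preserving the order of the file. */
    public static Map<String, Double> parse(List<String> lines) {
        Map<String, Double> result = new LinkedHashMap<>();
        for (String raw : lines) {
            String line = raw.trim();
            if (line.isEmpty()) continue;
            Matcher m = LINE.matcher(line);
            if (!m.matches()) {
                throw new IllegalArgumentException("Unexpected key-phrase line: " + raw);
            }
            result.put(m.group(1), Double.parseDouble(m.group(2)));
        }
        return result;
    }

    /** Returns the words used in the phrases that have no dictionary entry. */
    public static Set<String> missingWords(Map<String, Double> phrases, List<String> dictLines) {
        Set<String> known = new HashSet<>();
        for (String raw : dictLines) {
            String line = raw.trim();
            // The first token of a dictionary line is the word; the rest are its phones.
            if (!line.isEmpty()) known.add(line.split("\\s+")[0]);
        }
        Set<String> missing = new LinkedHashSet<>();
        for (String phrase : phrases.keySet()) {
            for (String word : phrase.split("\\s+")) {
                if (!known.contains(word)) missing.add(word);
            }
        }
        return missing;
    }

    public static void main(String[] args) {
        List<String> keyphrases = Arrays.asList("wakeup you /1e-20/", "wakeup me /1e-20/");
        List<String> dictionary = Arrays.asList("me M IY", "wakeup W EY K AH P", "you Y UW");
        Map<String, Double> phrases = parse(keyphrases);
        System.out.println("Phrases and thresholds: " + phrases);
        System.out.println("Words missing from the dictionary: " + missingWords(phrases, dictionary));
    }
}
```

Because the threshold is stored per phrase, the two phrases could in principle be given different values in the file; whether that helps separate them is exactly what is in question below.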

Adapting the example application linked above, here is my code:

public class PocketSphinxActivity extends Activity implements RecognitionListener {

    private static final String CLS_NAME = PocketSphinxActivity.class.getSimpleName();

    private static final String HOTWORD_SEARCH = "hot_words";

    private volatile SpeechRecognizer recognizer;

    @Override
    public void onCreate(Bundle state) {
        super.onCreate(state);
        setContentView(R.layout.main);

        new AsyncTask<Void, Void, Exception>() {
            @Override
            protected Exception doInBackground(Void... params) {
                Log.i(CLS_NAME, "doInBackground");

                try {

                    final File assetsDir = new Assets(PocketSphinxActivity.this).syncAssets();

                    recognizer = defaultSetup()
                            .setAcousticModel(new File(assetsDir, "en-us-ptm"))
                            .setDictionary(new File(assetsDir, "basic.dic"))
                            .setKeywordThreshold(1e-20f)
                            .setBoolean("-allphone_ci", true)
                            .setFloat("-vad_threshold", 3.0)
                            .getRecognizer();

                    recognizer.addNgramSearch(HOTWORD_SEARCH, new File(assetsDir, "basic.lm"));
                    recognizer.addKeywordSearch(HOTWORD_SEARCH, new File(assetsDir, "hotwords.txt"));
                    recognizer.addListener(PocketSphinxActivity.this);

                } catch (final IOException e) {
                    Log.e(CLS_NAME, "doInBackground IOException");
                    return e;
                }

                return null;
            }

            @Override
            protected void onPostExecute(final Exception e) {
                Log.i(CLS_NAME, "onPostExecute");

                if (e != null) {
                    e.printStackTrace();
                } else {
                    recognizer.startListening(HOTWORD_SEARCH);
                }
            }
        }.execute();
    }

    @Override
    public void onBeginningOfSpeech() {
        Log.i(CLS_NAME, "onBeginningOfSpeech");
    }

    @Override
    public void onPartialResult(final Hypothesis hypothesis) {
        Log.i(CLS_NAME, "onPartialResult");

        if (hypothesis == null)
            return;

        final String text = hypothesis.getHypstr();
        Log.i(CLS_NAME, "onPartialResult: text: " + text);

    }

    @Override
    public void onResult(final Hypothesis hypothesis) {
        // unused
        Log.i(CLS_NAME, "onResult");
    }

    @Override
    public void onEndOfSpeech() {
        // unused
        Log.i(CLS_NAME, "onEndOfSpeech");
    }


    @Override
    public void onError(final Exception e) {
        Log.e(CLS_NAME, "onError");
        e.printStackTrace();
    }

    @Override
    public void onTimeout() {
        Log.i(CLS_NAME, "onTimeout");
    }

    @Override
    public void onDestroy() {
        super.onDestroy();
        Log.i(CLS_NAME, "onDestroy");

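        // The recognizer is created asynchronously, so it may still be null here.
        if (recognizer != null) {
            recognizer.cancel();
            recognizer.shutdown();
        }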
    }
}

Note: if I alter my selected key-phrases (and the other related files) to be more dissimilar, and test the implementation in a quiet environment, the setup and thresholds applied work very successfully.

Problems

When I say either "wakeup you" or "wakeup me", both will be detected.

I can't establish how to apply an increased weighting to the end syllables.

When I say just "wakeup", often (but not always) both will be detected.

I can't establish how to avoid this occurring.

When testing against background noise, false positives are too frequent.

I can't lower the base thresholds I am using, otherwise the key-phrases are not detected consistently under normal conditions.

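When testing against background noise for a long period (5 minutes should be sufficient to replicate), then returning immediately to a quiet environment and uttering the key-phrases, nothing is detected.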

It takes an undetermined period of time before the key-phrases are detected successfully and repeatedly, as though the test had begun in a quiet environment.

I found a potentially related question, but the links no longer work. I wonder if I should be resetting the recogniser more frequently, so as to somehow stop the background noise from being averaged into the detection thresholds?
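To make the "reset more frequently" idea concrete, here is a minimal sketch of the bookkeeping it would need. PeriodicRestarter and its Restartable interface are hypothetical names introduced for illustration. In the activity above, stop() would presumably map to recognizer.cancel() and start() to recognizer.startListening(HOTWORD_SEARCH). Whether restarting actually clears any accumulated noise estimate is an untested assumption, and is the question being asked.

```java
/** Hypothetical helper: restarts a listener once a fixed interval has elapsed. */
public class PeriodicRestarter {

    /** Hypothetical adapter around whatever stops and restarts the real recognizer. */
    public interface Restartable {
        void stop();
        void start();
    }

    private final Restartable target;
    private final long intervalMillis;
    private long lastRestartMillis;
    private int restartCount;

    public PeriodicRestarter(Restartable target, long intervalMillis, long nowMillis) {
        this.target = target;
        this.intervalMillis = intervalMillis;
        this.lastRestartMillis = nowMillis;
    }

    /**
     * Call this regularly, for example from onEndOfSpeech(), passing the current time.
     * Restarts the target and returns true only if the interval has elapsed.
     */
    public boolean maybeRestart(long nowMillis) {
        if (nowMillis - lastRestartMillis < intervalMillis) {
            return false;
        }
        target.stop();
        target.start();
        lastRestartMillis = nowMillis;
        restartCount++;
        return true;
    }

    public int getRestartCount() {
        return restartCount;
    }

    public static void main(String[] args) {
        Restartable logging = new Restartable() {
            @Override public void stop() { System.out.println("stop"); }
            @Override public void start() { System.out.println("start"); }
        };
        // Restart at most once every 60 seconds; times are passed in to keep the logic testable.
        PeriodicRestarter restarter = new PeriodicRestarter(logging, 60_000L, 0L);
        restarter.maybeRestart(30_000L); // too early, nothing happens
        restarter.maybeRestart(61_000L); // interval elapsed, stop then start
        System.out.println("restarts: " + restarter.getRestartCount());
    }
}
```

Passing the clock value in, rather than reading it inside the class, keeps the interval logic independent of Android and easy to try out before wiring it to the real recognizer.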

Finally, I wonder whether my requirement for only a limited set of keywords would allow me to reduce the size of the acoustic model?

Any reduction in overhead when packaging it within my application would of course be beneficial.

Very finally (honest!), and specifically hoping that @NikolayShmyrev will spot this question: are there any plans to wrap a base Android implementation/SDK entirely via Gradle?

Thank you to those who made it this far...

Recommended answer