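I am using the System.Speech.Recognition
namespace to recognize a spoken sentence. I am interested in the alternative sentences the recognizer provides, together with their confidence scores. From the documentation of the [RecognitionResult.Alternates][1]
property:
However, when I print the recognized text along with the alternates and their confidences, I see two things I cannot explain. First, the alternates are not ordered by confidence (although the first one does match the recognized text). Second, and this is the bigger problem for me, the recognized text is not the alternate with the highest score, which seems to contradict the documentation I quoted above.
My (incomplete) code example, taken from the SpeechRecognized
event handler: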
Console.WriteLine("Recognized text = {0}, score = {1}", e.Result.Text, e.Result.Confidence);
// Display the recognition alternates for the result.
foreach (RecognizedPhrase phrase in e.Result.Alternates)
{
Console.WriteLine(" alt({0}) {1}", phrase.Confidence, phrase.Text);
}
And the corresponding output:
Recognized text = She had said that fit and Gracie Wachtel are all year, score = 0.287724
alt(0.287724) She had said that fit and Gracie Wachtel are all year
alt(0.287724) she had said that fit and gracie wachtel are all year
alt(0.2955212) she had said that faith and gracie wachtel are all year
alt(0.287133) she had said that fit and gracie Wachtell are all year
alt(0.1644379) she had said that fit and gracie wachtel earlier
alt(0.3254312) jihad said that fit and gracie wachtel are all year
alt(0.2726361) she had said that fit and gracie wachtel are only are
alt(0.2867217) she had said that fail and gracie wachtel are all year
alt(0.2565451) she had said that fit and gracie watchful are all year
alt(0.2854537) she had said that fate and gracie wachtel are all year
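To make the mismatch concrete, here is a minimal sketch that re-sorts the values printed above by confidence. It does not touch the speech engine: the `AlternateRanking` class and its `Observed` list are hypothetical names introduced for illustration, and the tuples are simply copied from the output.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class AlternateRanking
{
    // Confidence/text pairs copied from the output above, in the order
    // the recognizer returned them (hypothetical stand-in for Alternates).
    public static readonly IReadOnlyList<(float Confidence, string Text)> Observed =
        new List<(float Confidence, string Text)>
        {
            (0.287724f,  "She had said that fit and Gracie Wachtel are all year"),
            (0.287724f,  "she had said that fit and gracie wachtel are all year"),
            (0.2955212f, "she had said that faith and gracie wachtel are all year"),
            (0.287133f,  "she had said that fit and gracie Wachtell are all year"),
            (0.1644379f, "she had said that fit and gracie wachtel earlier"),
            (0.3254312f, "jihad said that fit and gracie wachtel are all year"),
            (0.2726361f, "she had said that fit and gracie wachtel are only are"),
            (0.2867217f, "she had said that fail and gracie wachtel are all year"),
            (0.2565451f, "she had said that fit and gracie watchful are all year"),
            (0.2854537f, "she had said that fate and gracie wachtel are all year"),
        };

    // Orders alternates by descending confidence. LINQ's OrderByDescending
    // is a stable sort, so entries with equal confidence keep their order.
    public static List<(float Confidence, string Text)> ByConfidence(
        IEnumerable<(float Confidence, string Text)> alternates) =>
        alternates.OrderByDescending(a => a.Confidence).ToList();
}
```

Calling `AlternateRanking.ByConfidence(AlternateRanking.Observed)` puts the "jihad said that fit ..." alternate first, while the recognizer's own first alternate is the "She had said that fit ..." one. In a real handler the input could be built with `e.Result.Alternates.Select(p => (p.Confidence, p.Text))`.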
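EDIT: To clarify what the confidence score means, and to show why my results seem to contradict the documentation, see the following information from the
RecognizedPhrase.Confidence Property
documentation. The parts in bold are my additions:

Best answer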
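I can only give you a general answer (I do not know the code of Microsoft's speech recognition).
Recognition uses many algorithms to approach the best solution. In a perfect world, each algorithm would be able to weight the confidence of the transformed sentence. In practice this is almost never the case:
Each algorithm has flaws, and working out its exact impact on the confidence of the transformation can be a real headache.
The global sentence confidence is an arithmetic combination of the confidence of each of its parts. That combination is often much simpler than the internal confidence schemes.
Some of the algorithms used, such as proper noun recognition, do not necessarily change the confidence significantly (especially for a single isolated sentence).
Confidence is measured at several levels (phonetic, word, sentence structure, ...). What is the confidence of a perfect phonetic recognition if the structure of the sentence is not consistent?
The sorting algorithms that move better recognitions up the list often do not change the confidence; they only sort or exclude alternatives.
So the documentation is correct: the confidence cannot be compared between alternatives.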
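What is the confidence potentially useful for, then (apart from what the authors want to tell us: that they can offer simple usage of a very complex and approximate technology)? Almost nothing. You can probably eliminate confidence levels that are too low (below some threshold), unless no alternative reaches that threshold.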
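The thresholding idea above could be sketched as follows. This is only an illustration under stated assumptions: `ConfidenceFilter` and `KeepAboveThreshold` are hypothetical names, the threshold value is something you would have to tune yourself, and the fallback of returning the unfiltered list is one possible reading of "unless no alternative reaches that threshold".

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class ConfidenceFilter
{
    // Drops alternates whose confidence is below the threshold while
    // preserving the recognizer's original ordering. If no alternate
    // reaches the threshold, the full list is returned unchanged rather
    // than an empty one (assumed fallback, not documented behavior).
    public static List<(float Confidence, string Text)> KeepAboveThreshold(
        IReadOnlyList<(float Confidence, string Text)> alternates,
        float threshold)
    {
        var kept = alternates.Where(a => a.Confidence >= threshold).ToList();
        return kept.Count > 0 ? kept : alternates.ToList();
    }
}
```

Note that the filter never reorders anything: in line with the answer, the confidence is used only as a coarse cut-off, not as a way to rank one alternate against another.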
Source: the Stack Overflow question "c# - System.Speech.Recognition alternate matches and confidence values": https://stackoverflow.com/questions/36965176/