Problem description
I'm currently working on an application which requires transmission of speech encoded to a specific audio format.
// PCM, 8000 samples/s, 16 bits per sample, mono, 16000 average bytes/s, block align 2
System.Speech.AudioFormat.SpeechAudioFormatInfo synthFormat =
    new System.Speech.AudioFormat.SpeechAudioFormatInfo(
        System.Speech.AudioFormat.EncodingFormat.Pcm,
        8000, 16, 1, 16000, 2, null);
This states that the audio is in PCM format, 8000 samples per second, 16 bits per sample, mono, 16000 average bytes per second, block alignment of 2.
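As a sanity check on those numbers, here is a minimal sketch of the standard PCM relationships between the constructor arguments. The class and method names (`PcmFormatMath`, `BlockAlign`, `AverageBytesPerSecond`) are made up for illustration and are not part of System.Speech.

```csharp
using System;

// Hypothetical helper, not part of System.Speech: derives the dependent
// PCM fields from the independent ones.
static class PcmFormatMath
{
    // Bytes per sample frame: one sample for each channel.
    public static int BlockAlign(int bitsPerSample, int channels)
    {
        return channels * bitsPerSample / 8;
    }

    // Bytes consumed per second of uncompressed PCM audio.
    public static int AverageBytesPerSecond(int samplesPerSecond, int bitsPerSample, int channels)
    {
        return samplesPerSecond * BlockAlign(bitsPerSample, channels);
    }
}
```

For 16-bit mono, `BlockAlign(16, 1)` is 2 and `AverageBytesPerSecond(8000, 16, 1)` is 16000, which matches the format above. If the sample rate is changed to 11025, the average bytes per second would presumably need to become 22050 to stay consistent.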
When I attempt to execute the following code there is nothing written to my MemoryStream instance; however when I change from 8000 samples per second up to 11025 the audio data is written successfully.
// Requires: using System.IO; using System.Speech.Synthesis;
SpeechSynthesizer synthesizer = new SpeechSynthesizer();
MemoryStream waveStream = new MemoryStream();

// Prosody applied to everything inside the style block.
PromptBuilder pbuilder = new PromptBuilder();
PromptStyle pStyle = new PromptStyle();
pStyle.Emphasis = PromptEmphasis.None;
pStyle.Rate = PromptRate.Fast;
pStyle.Volume = PromptVolume.ExtraLoud;

// Elements are opened here and closed below in reverse order.
pbuilder.StartStyle(pStyle);
pbuilder.StartParagraph();
pbuilder.StartVoice(VoiceGender.Male, VoiceAge.Teen, 2);
pbuilder.StartSentence();
pbuilder.AppendText("This is some text.");
pbuilder.EndSentence();
pbuilder.EndVoice();
pbuilder.EndParagraph();
pbuilder.EndStyle();

// Synthesize into the stream using synthFormat (defined above), then detach the output.
synthesizer.SetOutputToAudioStream(waveStream, synthFormat);
synthesizer.Speak(pbuilder);
synthesizer.SetOutputToNull();
There are no exceptions or errors recorded when using a sample rate of 8000 and I couldn't find anything useful in the documentation regarding SetOutputToAudioStream and why it succeeds at 11025 samples per second and not 8000. I have a workaround involving a wav file that I generated and converted to the correct sample rate using some sound editing tools, but I would like to generate the audio from within the application if I can.
One particular point of interest was that the SpeechRecognitionEngine accepts that audio format and successfully recognized the speech in my synthesized wave file...
Update: Recently discovered that this audio format succeeds for certain installed voices, but fails for others. It fails specifically for LH Michael and LH Michelle, and failure varies for certain voice settings defined in the PromptBuilder.
Recommended answer
It's entirely possible that the LH Michael and LH Michelle voices simply don't support 8000 Hz sample rates (because they inherently generate samples > 8000 Hz). SAPI allows engines to reject unsupported rates.
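One way to investigate this is to ask each installed voice which formats it advertises and then test the 8000 Hz format against it directly. The following is only a sketch: the class name `VoiceFormatProbe` and the helper `ProducesAudio` are made up for illustration, and it assumes a Windows machine with a reference to System.Speech. A voice may advertise no formats at all, in which case the list tells you nothing and only the empirical check is informative.

```csharp
using System;
using System.IO;
using System.Speech.AudioFormat;
using System.Speech.Synthesis;

class VoiceFormatProbe
{
    // Hypothetical helper: speaks a short phrase with the given voice and
    // format, and reports whether any bytes reached the stream.
    static bool ProducesAudio(string voiceName, SpeechAudioFormatInfo format)
    {
        using (var synthesizer = new SpeechSynthesizer())
        using (var stream = new MemoryStream())
        {
            synthesizer.SelectVoice(voiceName);
            synthesizer.SetOutputToAudioStream(stream, format);
            synthesizer.Speak("This is some text.");
            synthesizer.SetOutputToNull();
            return stream.Length > 0;
        }
    }

    static void Main()
    {
        // The format from the question: 16-bit mono PCM at 8000 Hz.
        var format = new SpeechAudioFormatInfo(
            EncodingFormat.Pcm, 8000, 16, 1, 16000, 2, null);

        using (var synthesizer = new SpeechSynthesizer())
        {
            foreach (InstalledVoice voice in synthesizer.GetInstalledVoices())
            {
                if (!voice.Enabled)
                {
                    continue; // disabled voices cannot be selected
                }

                VoiceInfo info = voice.VoiceInfo;
                Console.WriteLine(info.Name);

                // Formats the voice itself claims to support (may be empty).
                foreach (SpeechAudioFormatInfo supported in info.SupportedAudioFormats)
                {
                    Console.WriteLine("  advertises {0}, {1} Hz, {2}-bit, {3} channel(s)",
                        supported.EncodingFormat, supported.SamplesPerSecond,
                        supported.BitsPerSample, supported.ChannelCount);
                }

                Console.WriteLine("  8000 Hz PCM produced audio: {0}",
                    ProducesAudio(info.Name, format));
            }
        }
    }
}
```

If a voice turns out to produce nothing at 8000 Hz, the application could select a different voice with `SelectVoice`, or synthesize at a rate the voice does handle and resample afterwards.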