Problem description
I've used the voice recognition feature on Android and I love it. It's one of the features my customers praise most. However, the format is somewhat restrictive: you have to call the recognizer intent, let it send the recording to Google for transcription, and wait for the text to come back.
Some of my ideas would require recording the audio within my app and then sending the clip to Google for transcription.
Is there any way I can send an audio clip to be processed with speech-to-text?
Recommended answer
I have a solution that works well for combining speech recognition with audio recording. Here is the link to a simple Android project I created to show the solution working. I also put some screenshots inside the project to illustrate the app.
I'll try to explain briefly the approach I used. I combined two features in that project: the Google Speech API and FLAC recording.
The Google Speech API is called through HTTP connections. Mike Pultz gives more details about the API:
"(...) the new [Google] API is a full-duplex streaming API. What this means, is that it actually uses two HTTP connections- one POST request to upload the content as a "live" chunked stream, and a second GET request to access the results, which makes much more sense for longer audio samples, or for streaming audio."
However, this API needs to receive a FLAC sound file to work properly. That brings us to the second part: FLAC recording.
I implemented FLAC recording in that project by extracting and adapting some pieces of code and libraries from an open source app called AudioBoo. AudioBoo uses native code to record and play the FLAC format.
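Since the API expects FLAC, it may be worth checking a recorded clip before uploading it. The following is only a minimal sketch, not code from the project: it relies on the fact that a native FLAC stream begins with the four ASCII bytes `fLaC`. The class and method names are made up for illustration.

```java
import java.util.Arrays;

/** Illustrative helper (hypothetical name): checks for the FLAC stream marker. */
class FlacCheck {

    // A native FLAC stream starts with the ASCII bytes "fLaC".
    private static final byte[] MARKER = {'f', 'L', 'a', 'C'};

    /** Returns true if the data starts with the 4-byte FLAC marker. */
    static boolean looksLikeFlac(byte[] data) {
        if (data == null || data.length < MARKER.length) {
            return false;
        }
        // Compare only the first four bytes against the marker.
        return Arrays.equals(Arrays.copyOf(data, MARKER.length), MARKER);
    }
}
```

This only inspects the header; it does not validate the rest of the file.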
Thus, it's possible to record a FLAC sound, send it to the Google Speech API, get the text, and play back the sound that was just recorded.
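The send-and-receive step, with the two HTTP connections described in the quote above, might look roughly like the sketch below. It is not the project's actual code. The base URL, the `/up` and `/down` paths, the `pair`, `key` and `lang` parameter names, and the `audio/x-flac; rate=...` content type are all assumptions for illustration and should be checked against whatever endpoint you really use.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.io.UnsupportedEncodingException;
import java.net.HttpURLConnection;
import java.net.URL;
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;

/** Sketch of the two-connection pattern; the endpoint layout is assumed. */
class DuplexSpeechSketch {

    // Hypothetical layout: both requests share a "pair" id so the server
    // can match the upload with the result stream.
    static String upUrl(String base, String pair, String key, String lang) {
        return base + "/up?pair=" + enc(pair) + "&key=" + enc(key) + "&lang=" + enc(lang);
    }

    static String downUrl(String base, String pair) {
        return base + "/down?pair=" + enc(pair);
    }

    private static String enc(String s) {
        try {
            return URLEncoder.encode(s, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException(e); // UTF-8 is always available
        }
    }

    /** Uploads FLAC audio with a chunked POST while a GET waits for the result. */
    static String transcribe(String base, String pair, String key, String lang,
                             InputStream flac, int sampleRate)
            throws IOException, InterruptedException {
        final HttpURLConnection down =
                (HttpURLConnection) new URL(downUrl(base, pair)).openConnection();
        final ByteArrayOutputStream result = new ByteArrayOutputStream();
        final IOException[] failure = new IOException[1];

        // Connection 1: GET request that reads the results on its own thread.
        Thread reader = new Thread(() -> {
            try (InputStream in = down.getInputStream()) {
                copy(in, result);
            } catch (IOException e) {
                failure[0] = e;
            }
        });
        reader.start();

        // Connection 2: POST request that streams the audio in chunks.
        HttpURLConnection up =
                (HttpURLConnection) new URL(upUrl(base, pair, key, lang)).openConnection();
        up.setRequestMethod("POST");
        up.setDoOutput(true);
        up.setChunkedStreamingMode(0); // 0 = use the default chunk length
        up.setRequestProperty("Content-Type", "audio/x-flac; rate=" + sampleRate);
        try (OutputStream out = up.getOutputStream()) {
            copy(flac, out);
        }
        int status = up.getResponseCode(); // completes the upload request
        if (status != HttpURLConnection.HTTP_OK) {
            throw new IOException("Upload failed with HTTP " + status);
        }

        reader.join();
        if (failure[0] != null) {
            throw failure[0];
        }
        return new String(result.toByteArray(), StandardCharsets.UTF_8);
    }

    private static void copy(InputStream in, OutputStream out) throws IOException {
        byte[] buf = new byte[4096];
        int n;
        while ((n = in.read(buf)) != -1) {
            out.write(buf, 0, n);
        }
    }
}
```

On Android, `transcribe` would have to run off the main thread, and the returned string is the raw response body, which still needs to be parsed.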
The project I created has the basic principles needed to make it work and can be improved for specific situations. To make it work in a different scenario, you need to get a Google Speech API key, which is obtained by joining the Google Chromium-dev group. I left one key in that project just to show it working, but I'll remove it eventually. If someone needs more information about it, let me know, because I'm not able to put more than 2 links in this post.