So I cloned the Cognitive-Speech-TTS sample and tested the Android TTS, but it still does not work: I hear no result/sound. I have completed the necessary requirements, such as entering and configuring the API subscription key. Here is the Logcat output:
com.microsoft.sdksample W/ResourceType: Found multiple library tables, ignoring...
com.microsoft.sdksample W/ResourceType: Found multiple library tables, ignoring...
com.microsoft.sdksample W/ResourceType: Found multiple library tables, ignoring...
com.microsoft.sdksample D/Authentication: new Access Token: ******************
com.microsoft.sdksample D/OpenGLRenderer: Use EGL_SWAP_BEHAVIOR_PRESERVED: true
com.microsoft.sdksample D/Atlas: Validating map...
com.microsoft.sdksample I/Adreno-EGL: <qeglDrvAPI_eglInitialize:410>: EGL 1.4 QUALCOMM build: AU_LINUX_ANDROID_LA.BF.1.1.1_RB1.05.01.00.042.030_msm8226_LA.BF.1.1.1_RB1__release_AU ()
OpenGL ES Shader Compiler Version: E031.25.03.06
Build Date: 06/10/15 Wed
Local Branch:
Remote Branch: quic/LA.BF.1.1.1_rb1.24
Local Patches: NONE
Reconstruct Branch: AU_LINUX_ANDROID_LA.BF.1.1.1_RB1.05.01.00.042.030 + 6151be1 + NOTHING
com.microsoft.sdksample I/OpenGLRenderer: Initialized EGL, version 1.4
com.microsoft.sdksample D/OpenGLRenderer: Enabling debug mode 0
com.microsoft.sdksample I/Timeline: Timeline: Activity_idle id: android.os.BinderProxy@166d0b8f time:2397803
Best answer
I solved the problem by using an XmlDom class to build the SSML:
String body = XmlDom.createDom(deviceLanguage, genderName, voiceName, "Your text here");
byte[] xmlBytes = body.getBytes();
urlConnection.setRequestProperty("content-length", String.valueOf(xmlBytes.length));
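For context, here is a minimal, self-contained sketch of how such a body might be wired into the request. The endpoint URL, the X-Microsoft-OutputFormat value, and the placeholder token are assumptions based on the REST contract the old Cognitive-Speech-TTS sample used; the SSML string is hardcoded here instead of calling XmlDom, and no network I/O actually happens until connect() or getOutputStream() is invoked:

```java
import java.net.HttpURLConnection;
import java.net.URL;
import java.nio.charset.StandardCharsets;

public class TtsRequestSketch {
    public static void main(String[] args) throws Exception {
        // SSML body as XmlDom.createDom would produce it (hardcoded for brevity).
        String body = "<speak version='1.0' xml:lang='en-US'>"
                + "<voice xml:lang='en-US' xml:gender='Female' "
                + "name='Microsoft Server Speech Text to Speech Voice (en-US, ZiraRUS)'>"
                + "Hello world</voice></speak>";
        byte[] xmlBytes = body.getBytes(StandardCharsets.UTF_8);

        // Endpoint and header names are assumptions modeled on the sample's
        // REST contract; substitute your real endpoint and access token.
        URL url = new URL("https://speech.platform.bing.com/synthesize");
        HttpURLConnection urlConnection = (HttpURLConnection) url.openConnection();
        urlConnection.setRequestMethod("POST");
        urlConnection.setDoOutput(true);
        urlConnection.setRequestProperty("Content-Type", "application/ssml+xml");
        urlConnection.setRequestProperty("X-Microsoft-OutputFormat",
                "riff-16khz-16bit-mono-pcm");
        urlConnection.setRequestProperty("Authorization", "Bearer " + "<ACCESS_TOKEN>");
        urlConnection.setRequestProperty("content-length", String.valueOf(xmlBytes.length));

        // Nothing has been sent yet; just show what was configured.
        System.out.println("content-length=" + xmlBytes.length);
        System.out.println("content-type=" + urlConnection.getRequestProperty("Content-Type"));
    }
}
```

Note that HttpURLConnection treats Content-Length as a restricted header and manages it itself when the body is written, so the explicit setRequestProperty call is belt-and-braces rather than strictly required.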
import java.io.StringWriter;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.Element;

public class XmlDom {

    public static String createDom(String locale, String genderName, String voiceName, String textToSynthesize) {
        Document doc = null;
        Element speak, voice;
        try {
            DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
            DocumentBuilder builder = dbf.newDocumentBuilder();
            doc = builder.newDocument();
            if (doc != null) {
                // Root <speak> element with SSML version and document language.
                speak = doc.createElement("speak");
                speak.setAttribute("version", "1.0");
                speak.setAttribute("xml:lang", "en-us");
                // <voice> element carrying the locale, gender, and voice name.
                voice = doc.createElement("voice");
                voice.setAttribute("xml:lang", locale);
                voice.setAttribute("xml:gender", genderName);
                voice.setAttribute("name", voiceName);
                voice.appendChild(doc.createTextNode(textToSynthesize));
                speak.appendChild(voice);
                doc.appendChild(speak);
            }
        } catch (ParserConfigurationException e) {
            e.printStackTrace();
        }
        return transformDom(doc);
    }

    private static String transformDom(Document doc) {
        StringWriter writer = new StringWriter();
        try {
            TransformerFactory tf = TransformerFactory.newInstance();
            Transformer transformer = tf.newTransformer();
            // Drop the <?xml ...?> declaration so the body is pure SSML.
            transformer.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
            transformer.transform(new DOMSource(doc), new StreamResult(writer));
        } catch (TransformerException e) {
            e.printStackTrace();
        }
        // Strip any line breaks from the serialized SSML.
        return writer.getBuffer().toString().replaceAll("\n|\r", "");
    }
}
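As an aside, the OMIT_XML_DECLARATION output property in transformDom is what keeps the <?xml ...?> prolog out of the serialized SSML. A minimal standalone check of that behavior (class and element names here are illustrative only):

```java
import java.io.StringWriter;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.Element;

public class OmitDeclarationDemo {
    public static void main(String[] args) throws Exception {
        // Build a trivial <speak> document.
        Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder().newDocument();
        Element speak = doc.createElement("speak");
        speak.setAttribute("version", "1.0");
        doc.appendChild(speak);

        Transformer t = TransformerFactory.newInstance().newTransformer();
        // Without this property the serialized string would begin with
        // an <?xml ...?> declaration rather than the <speak> root.
        t.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
        StringWriter w = new StringWriter();
        t.transform(new DOMSource(doc), new StreamResult(w));
        System.out.println(w.toString());
    }
}
```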
Update:
After generating the SSML with the XmlDom class, I found that the SSML needs xml:lang='YOUR_LANGUAGE_HERE' specified on the voice tag. For example:
<speak version='1.0' xml:lang='en-US'><voice xml:lang='en-US' xml:gender='Female' name='Microsoft Server Speech Text to Speech Voice (en-US, ZiraRUS)'>This is a demo of Microsoft Cognitive Services Text to Speech API.</voice></speak>