So I cloned the Cognitive-Speech-TTS sample and tested the Android TTS, but it still does not work: I hear no result/sound. I have completed the necessary requirements, such as entering and configuring the API subscription key. Here is the Logcat output:
com.microsoft.sdksample W/ResourceType: Found multiple library tables, ignoring...
com.microsoft.sdksample W/ResourceType: Found multiple library tables, ignoring...
com.microsoft.sdksample W/ResourceType: Found multiple library tables, ignoring...
com.microsoft.sdksample D/Authentication: new Access Token: ******************
com.microsoft.sdksample D/OpenGLRenderer: Use EGL_SWAP_BEHAVIOR_PRESERVED: true
com.microsoft.sdksample D/Atlas: Validating map...
com.microsoft.sdksample I/Adreno-EGL: <qeglDrvAPI_eglInitialize:410>: EGL 1.4 QUALCOMM build: AU_LINUX_ANDROID_LA.BF.1.1.1_RB1.05.01.00.042.030_msm8226_LA.BF.1.1.1_RB1__release_AU ()
OpenGL ES Shader Compiler Version: E031.25.03.06
Build Date: 06/10/15 Wed
Local Branch:
Remote Branch: quic/LA.BF.1.1.1_rb1.24
Local Patches: NONE
Reconstruct Branch: AU_LINUX_ANDROID_LA.BF.1.1.1_RB1.05.01.00.042.030 + 6151be1 + NOTHING
com.microsoft.sdksample I/OpenGLRenderer: Initialized EGL, version 1.4
com.microsoft.sdksample D/OpenGLRenderer: Enabling debug mode 0
com.microsoft.sdksample I/Timeline: Timeline: Activity_idle id: android.os.BinderProxy@166d0b8f time:2397803
Best answer
I solved the problem by using an XmlDom class to build the SSML:
String body = XmlDom.createDom(deviceLanguage, genderName, voiceName, "Your text here");
byte[] xmlBytes = body.getBytes();
urlConnection.setRequestProperty("content-length", String.valueOf(xmlBytes.length));
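For context, here is a minimal, self-contained sketch of how such a body might be wired into the request. The endpoint URL, the X-Microsoft-OutputFormat value, and the placeholder token are assumptions based on the REST contract the old Cognitive-Speech-TTS sample used; the SSML string is hardcoded here instead of calling XmlDom, and no network I/O actually happens until connect() or getOutputStream() is invoked:

```java
import java.net.HttpURLConnection;
import java.net.URL;
import java.nio.charset.StandardCharsets;

public class TtsRequestSketch {
    public static void main(String[] args) throws Exception {
        // SSML body as XmlDom.createDom would produce it (hardcoded for brevity).
        String body = "<speak version='1.0' xml:lang='en-US'>"
                + "<voice xml:lang='en-US' xml:gender='Female' "
                + "name='Microsoft Server Speech Text to Speech Voice (en-US, ZiraRUS)'>"
                + "Hello world</voice></speak>";
        byte[] xmlBytes = body.getBytes(StandardCharsets.UTF_8);

        // Endpoint and header names are assumptions modeled on the sample's
        // REST contract; substitute your real endpoint and access token.
        URL url = new URL("https://speech.platform.bing.com/synthesize");
        HttpURLConnection urlConnection = (HttpURLConnection) url.openConnection();
        urlConnection.setRequestMethod("POST");
        urlConnection.setDoOutput(true);
        urlConnection.setRequestProperty("Content-Type", "application/ssml+xml");
        urlConnection.setRequestProperty("X-Microsoft-OutputFormat",
                "riff-16khz-16bit-mono-pcm");
        urlConnection.setRequestProperty("Authorization", "Bearer " + "<ACCESS_TOKEN>");
        urlConnection.setRequestProperty("content-length", String.valueOf(xmlBytes.length));

        // Nothing has been sent yet; just show what was configured.
        System.out.println("content-length=" + xmlBytes.length);
        System.out.println("content-type=" + urlConnection.getRequestProperty("Content-Type"));
    }
}
```

Note that HttpURLConnection treats Content-Length as a restricted header and manages it itself when the body is written, so the explicit setRequestProperty call is belt-and-braces rather than strictly required.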
import java.io.StringWriter;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.Element;

public class XmlDom {

    public static String createDom(String locale, String genderName, String voiceName, String textToSynthesize) {
        Document doc = null;
        Element speak, voice;
        try {
            DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
            DocumentBuilder builder = dbf.newDocumentBuilder();
            doc = builder.newDocument();
            if (doc != null) {
                // Root <speak> element with SSML version and document language.
                speak = doc.createElement("speak");
                speak.setAttribute("version", "1.0");
                speak.setAttribute("xml:lang", "en-us");
                // <voice> element carrying the locale, gender, and voice name.
                voice = doc.createElement("voice");
                voice.setAttribute("xml:lang", locale);
                voice.setAttribute("xml:gender", genderName);
                voice.setAttribute("name", voiceName);
                voice.appendChild(doc.createTextNode(textToSynthesize));
                speak.appendChild(voice);
                doc.appendChild(speak);
            }
        } catch (ParserConfigurationException e) {
            e.printStackTrace();
        }
        return transformDom(doc);
    }

    private static String transformDom(Document doc) {
        StringWriter writer = new StringWriter();
        try {
            TransformerFactory tf = TransformerFactory.newInstance();
            Transformer transformer = tf.newTransformer();
            // Drop the <?xml ...?> declaration so the body is pure SSML.
            transformer.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
            transformer.transform(new DOMSource(doc), new StreamResult(writer));
        } catch (TransformerException e) {
            e.printStackTrace();
        }
        // Strip any line breaks from the serialized SSML.
        return writer.getBuffer().toString().replaceAll("\n|\r", "");
    }
}
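As an aside, the OMIT_XML_DECLARATION output property in transformDom is what keeps the <?xml ...?> prolog out of the serialized SSML. A minimal standalone check of that behavior (class and element names here are illustrative only):

```java
import java.io.StringWriter;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.Element;

public class OmitDeclarationDemo {
    public static void main(String[] args) throws Exception {
        // Build a trivial <speak> document.
        Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder().newDocument();
        Element speak = doc.createElement("speak");
        speak.setAttribute("version", "1.0");
        doc.appendChild(speak);

        Transformer t = TransformerFactory.newInstance().newTransformer();
        // Without this property the serialized string would begin with
        // an <?xml ...?> declaration rather than the <speak> root.
        t.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
        StringWriter w = new StringWriter();
        t.transform(new DOMSource(doc), new StreamResult(w));
        System.out.println(w.toString());
    }
}
```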
Update:
After generating the SSML with the XmlDom class, I found that the SSML needs xml:lang='YOUR_LANGUAGE_HERE' specified on the voice tag. For example:
<speak version='1.0' xml:lang='en-US'><voice xml:lang='en-US' xml:gender='Female' name='Microsoft Server Speech Text to Speech Voice (en-US, ZiraRUS)'>This is a demo of Microsoft Cognitive Services Text to Speech API.</voice></speak>