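This tutorial walks through implementing text-to-speech and speech-to-text with Spring Boot and the Baidu AIP SDK, and exposing both features as HTTP endpoints.
Step 1: Create the Spring Boot project
Create a new Spring Boot project with Spring Initializr (https://start.spring.io/). When adding dependencies, select "Spring Web" and "Thymeleaf".
Step 2: Configure dependencies
In the project's pom.xml, add the Baidu AIP SDK dependency, which covers both speech synthesis and speech recognition: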
<dependencies>
    <!-- Spring Web -->
    <dependency>
        <groupId>org.springframework.boot</groupId>
        <artifactId>spring-boot-starter-web</artifactId>
    </dependency>
    <!-- Thymeleaf for HTML templates -->
    <dependency>
        <groupId>org.springframework.boot</groupId>
        <artifactId>spring-boot-starter-thymeleaf</artifactId>
    </dependency>
    <!-- Baidu AIP SDK -->
    <dependency>
        <groupId>com.baidu.aip</groupId>
        <artifactId>java-sdk</artifactId>
        <version>4.20.0</version>
    </dependency>
</dependencies>
Step 3: Configure Baidu AIP
In src/main/resources, create application.properties (or open it if it already exists) and add the credentials for Baidu speech synthesis and recognition:
baidu.app.id=YOUR_APP_ID
baidu.api.key=YOUR_API_KEY
baidu.secret.key=YOUR_SECRET_KEY
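A forgotten `YOUR_...` placeholder tends to surface only when the first request reaches Baidu. As a minimal sketch, the check below flags credentials that are missing, blank, or still placeholders. `BaiduConfigCheck` and its methods are hypothetical helpers, not part of the SDK or of this tutorial's code, and the `YOUR_` prefix test is an assumption based on the sample values above.

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.List;
import java.util.Properties;
import java.util.stream.Collectors;

/** Hypothetical helper: detects unset or placeholder Baidu credentials. */
final class BaiduConfigCheck {
    static final List<String> REQUIRED_KEYS =
            List.of("baidu.app.id", "baidu.api.key", "baidu.secret.key");

    /** Parses text written in application.properties format. */
    static Properties parse(String propertiesText) {
        Properties props = new Properties();
        try {
            props.load(new StringReader(propertiesText));
        } catch (IOException e) {
            throw new IllegalStateException("Could not parse properties", e);
        }
        return props;
    }

    /** Returns the required keys that are missing, blank, or still start with "YOUR_". */
    static List<String> unsetKeys(Properties props) {
        return REQUIRED_KEYS.stream()
                .filter(key -> {
                    String value = props.getProperty(key, "").trim();
                    return value.isEmpty() || value.startsWith("YOUR_");
                })
                .collect(Collectors.toList());
    }
}
```

Calling `unsetKeys` at startup and refusing to start when the list is non-empty is one way to fail fast; where to hook it in is left open here.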
Step 4: Create the Controller class
In src/main/java/com/example/demo, create a class named BaiduAipController in BaiduAipController.java:
package com.example.demo;

import com.baidu.aip.speech.AipSpeech;
import com.baidu.aip.speech.TtsResponse;
import org.json.JSONObject;
import org.springframework.beans.factory.annotation.Value;
import org.springframework.http.MediaType;
import org.springframework.web.bind.annotation.PostMapping;
import org.springframework.web.bind.annotation.RequestBody;
import org.springframework.web.bind.annotation.RequestMapping;
import org.springframework.web.bind.annotation.RestController;

import java.util.HashMap;

@RestController
@RequestMapping("/baidu-aip")
public class BaiduAipController {

    @Value("${baidu.app.id}")
    private String appId;

    @Value("${baidu.api.key}")
    private String apiKey;

    @Value("${baidu.secret.key}")
    private String secretKey;

    // Builds an SDK client from the credentials in application.properties.
    private AipSpeech initAipSpeechClient() {
        return new AipSpeech(appId, apiKey, secretKey);
    }

    // Accepts a JSON body and returns the synthesized audio as raw bytes.
    @PostMapping(value = "/text-to-speech", consumes = MediaType.APPLICATION_JSON_VALUE, produces = MediaType.APPLICATION_OCTET_STREAM_VALUE)
    public byte[] textToSpeech(@RequestBody TextToSpeechRequest request) {
        AipSpeech client = initAipSpeechClient();
        HashMap<String, Object> options = new HashMap<>();
        options.put("spd", request.getSpeed());
        options.put("pit", request.getPitch());
        options.put("vol", request.getVolume());
        options.put("per", request.getPersonality());
        TtsResponse response = client.synthesis(request.getText(), "zh", 1, options);
        // On success getData() holds the audio; on failure it is null and getResult() holds the error JSON.
        byte[] audio = response.getData();
        if (audio == null) {
            throw new RuntimeException("Failed to convert text to speech. Error: " + response.getResult());
        }
        return audio;
    }

    // Accepts raw WAV bytes (16 kHz) and returns the recognition result as a JSON string.
    @PostMapping(value = "/speech-to-text", consumes = MediaType.APPLICATION_OCTET_STREAM_VALUE)
    public String speechToText(@RequestBody byte[] audioData) {
        AipSpeech client = initAipSpeechClient();
        HashMap<String, Object> options = new HashMap<>();
        options.put("dev_pid", 1536); // Mandarin (with support for simple English)
        JSONObject result = client.asr(audioData, "wav", 16000, options);
        // Successful responses also carry "err_msg" ("success."), so test "err_no" instead: 0 means success.
        if (result.optInt("err_no", -1) != 0) {
            throw new RuntimeException("Failed to convert speech to text. Error: " + result);
        }
        return result.toString();
    }
}
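Because the controller passes "wav" and 16000 to `asr`, an upload in any other format or sample rate will be described incorrectly to Baidu. The sketch below checks the header of the uploaded bytes before they are sent. It assumes the canonical 44-byte PCM WAV layout (the "fmt " chunk directly after the RIFF header), and the 16-bit mono requirement is an assumption to confirm against Baidu's current documentation. `WavHeaderCheck` is a hypothetical helper, not part of the SDK.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.nio.charset.StandardCharsets;
import java.util.Optional;

/** Hypothetical helper: inspects the format fields of a canonical 44-byte PCM WAV header. */
final class WavHeaderCheck {

    /** Returns an empty Optional for 16 kHz, 16-bit, mono PCM WAV; otherwise a short reason. */
    static Optional<String> problem(byte[] audio) {
        if (audio == null || audio.length < 44) {
            return Optional.of("too short to contain a WAV header");
        }
        if (!tag(audio, 0).equals("RIFF") || !tag(audio, 8).equals("WAVE")) {
            return Optional.of("not a RIFF/WAVE file");
        }
        // Assumption: the "fmt " chunk starts at byte 12, as in the canonical layout.
        if (!tag(audio, 12).equals("fmt ")) {
            return Optional.of("unexpected header layout");
        }
        ByteBuffer header = ByteBuffer.wrap(audio).order(ByteOrder.LITTLE_ENDIAN);
        int audioFormat = header.getShort(20); // 1 = uncompressed PCM
        int channels = header.getShort(22);
        int sampleRate = header.getInt(24);
        int bitsPerSample = header.getShort(34);
        if (audioFormat != 1) {
            return Optional.of("not PCM (format " + audioFormat + ")");
        }
        if (channels != 1) {
            return Optional.of("expected mono, got " + channels + " channels");
        }
        if (sampleRate != 16000) {
            return Optional.of("expected 16000 Hz, got " + sampleRate);
        }
        if (bitsPerSample != 16) {
            return Optional.of("expected 16-bit samples, got " + bitsPerSample);
        }
        return Optional.empty();
    }

    /** Reads a four-character chunk tag at the given offset. */
    private static String tag(byte[] audio, int offset) {
        return new String(audio, offset, 4, StandardCharsets.US_ASCII);
    }
}
```

In `speechToText`, a non-empty result could be turned into a 400 response instead of forwarding the audio; files with extra chunks before "fmt " would need a real chunk parser.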
Step 5: Create the request model class
In src/main/java/com/example/demo, create a class named TextToSpeechRequest in TextToSpeechRequest.java:
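package com.example.demo;

public class TextToSpeechRequest {
    private String text;
    private String speed;
    private String pitch;
    private String volume;
    private String personality;
    // Getters and setters for every field (the controller reads the getters; Jackson binds the JSON body through the setters)
}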
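The accessors elided above can be written out as follows. This is a sketch, not the only valid shape: the default values are an assumption taken from the initial values of the form in Step 6, so that a JSON body containing only `text` still yields usable options.

```java
/**
 * Sketch of TextToSpeechRequest with the accessors written out.
 * Package-private only so the sketch fits in one file; the tutorial declares the class public.
 */
class TextToSpeechRequest {
    private String text;
    // Assumed defaults, mirroring the form's initial values; fields absent from the JSON keep them.
    private String speed = "5";
    private String pitch = "5";
    private String volume = "5";
    private String personality = "0";

    public String getText() { return text; }
    public void setText(String text) { this.text = text; }

    public String getSpeed() { return speed; }
    public void setSpeed(String speed) { this.speed = speed; }

    public String getPitch() { return pitch; }
    public void setPitch(String pitch) { this.pitch = pitch; }

    public String getVolume() { return volume; }
    public void setVolume(String volume) { this.volume = volume; }

    public String getPersonality() { return personality; }
    public void setPersonality(String personality) { this.personality = personality; }
}
```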
Step 6: Create the front-end page
In src/main/resources/templates, create an HTML file named index.html:
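<!DOCTYPE html>
<html lang="en" xmlns:th="http://www.thymeleaf.org">
<head>
    <meta charset="UTF-8">
    <meta name="viewport" content="width=device-width, initial-scale=1.0">
    <title>Baidu AIP Demo</title>
</head>
<body>
<h2>Text to Speech</h2>
<!-- HTML forms cannot submit JSON or raw bytes, so the script at the bottom sends both requests with fetch(). -->
<form id="tts-form">
    <label for="text">Text:</label>
    <input type="text" id="text" name="text" required>
    <br>
    <label for="speed">Speed (1-15):</label>
    <input type="number" id="speed" name="speed" min="1" max="15" value="5">
    <br>
    <label for="pitch">Pitch (1-15):</label>
    <input type="number" id="pitch" name="pitch" min="1" max="15" value="5">
    <br>
    <label for="volume">Volume (1-15):</label>
    <input type="number" id="volume" name="volume" min="1" max="15" value="5">
    <br>
    <label for="personality">Personality (0 for female, 1 for male):</label>
    <input type="number" id="personality" name="personality" min="0" max="1" value="0">
    <br>
    <button type="submit">Convert to Speech</button>
</form>
<audio id="player" controls></audio>
<hr>
<h2>Speech to Text</h2>
<form id="asr-form">
    <label for="audio">Upload Audio File:</label>
    <input type="file" id="audio" name="audio" accept=".wav" required>
    <br>
    <button type="submit">Convert to Text</button>
</form>
<pre id="asr-result"></pre>
<script>
    document.getElementById('tts-form').addEventListener('submit', async (event) => {
        event.preventDefault();
        // Collect the form fields into the JSON body expected by TextToSpeechRequest.
        const fields = Object.fromEntries(new FormData(event.target));
        const response = await fetch('/baidu-aip/text-to-speech', {
            method: 'POST',
            headers: {'Content-Type': 'application/json'},
            body: JSON.stringify(fields)
        });
        if (!response.ok) {
            alert('Text-to-speech failed: HTTP ' + response.status);
            return;
        }
        // The endpoint returns raw bytes; this assumes Baidu's default MP3 output.
        const bytes = await response.arrayBuffer();
        const player = document.getElementById('player');
        player.src = URL.createObjectURL(new Blob([bytes], {type: 'audio/mpeg'}));
        player.play();
    });

    document.getElementById('asr-form').addEventListener('submit', async (event) => {
        event.preventDefault();
        // Send the file's raw bytes; the endpoint consumes application/octet-stream.
        const file = document.getElementById('audio').files[0];
        const response = await fetch('/baidu-aip/speech-to-text', {
            method: 'POST',
            headers: {'Content-Type': 'application/octet-stream'},
            body: file
        });
        document.getElementById('asr-result').textContent = await response.text();
    });
</script>
</body>
</html>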
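Step 7: Run the application
From the project root, start the application with the command below. By default Spring Boot listens on port 8080 and serves the index template as the welcome page, so the demo is available at http://localhost:8080/: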
mvn spring-boot:run
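Once the application is running, the two endpoints can also be exercised without the browser. The sketch below uses the JDK's `java.net.http.HttpClient` (Java 11+). `BaiduAipSmokeTest`, the `localhost:8080` base URL, and the output file name `speech.mp3` (assuming Baidu's default MP3 output) are assumptions for illustration, not part of the tutorial's code.

```java
import java.io.IOException;
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.nio.file.Files;
import java.nio.file.Path;

/** Hypothetical smoke-test client for the two endpoints. */
final class BaiduAipSmokeTest {
    // Assumption: the app runs locally on Spring Boot's default port.
    static final String BASE_URL = "http://localhost:8080/baidu-aip";

    /** Builds the JSON body; only quotes and backslashes are escaped in this sketch. */
    static String ttsJson(String text) {
        String escaped = text.replace("\\", "\\\\").replace("\"", "\\\"");
        return "{\"text\":\"" + escaped + "\",\"speed\":\"5\",\"pitch\":\"5\","
                + "\"volume\":\"5\",\"personality\":\"0\"}";
    }

    /** POST /text-to-speech with a JSON body. */
    static HttpRequest ttsRequest(String text) {
        return HttpRequest.newBuilder(URI.create(BASE_URL + "/text-to-speech"))
                .header("Content-Type", "application/json")
                .POST(HttpRequest.BodyPublishers.ofString(ttsJson(text)))
                .build();
    }

    /** POST /speech-to-text with the raw WAV bytes as the body. */
    static HttpRequest asrRequest(byte[] wavBytes) {
        return HttpRequest.newBuilder(URI.create(BASE_URL + "/speech-to-text"))
                .header("Content-Type", "application/octet-stream")
                .POST(HttpRequest.BodyPublishers.ofByteArray(wavBytes))
                .build();
    }

    /** Calls both endpoints: saves the synthesized audio, then prints the recognition result. */
    static void run(Path wavFile) throws IOException, InterruptedException {
        HttpClient client = HttpClient.newHttpClient();
        HttpResponse<byte[]> audio =
                client.send(ttsRequest("hello"), HttpResponse.BodyHandlers.ofByteArray());
        if (audio.statusCode() == 200) {
            Files.write(Path.of("speech.mp3"), audio.body());
        } else {
            System.out.println("text-to-speech failed: HTTP " + audio.statusCode());
        }
        HttpResponse<String> text = client.send(
                asrRequest(Files.readAllBytes(wavFile)), HttpResponse.BodyHandlers.ofString());
        System.out.println(text.statusCode() + " " + text.body());
    }
}
```

Calling `BaiduAipSmokeTest.run(Path.of("sample.wav"))` with a 16 kHz WAV file of your own (the file name is a placeholder) sends one request to each endpoint.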