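I have a list of .wav files in binary form (they arrive over a websocket), and I want to join them into a single binary .wav file and then run speech recognition on it. I have been able to make it work with the following code: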
import io
import wave

audio = [binary_wav1, binary_wav2, ..., binary_wavN]  # a list of .wav binary files coming from a socket
audio = [io.BytesIO(x) for x in audio]
# Join wav files
with wave.open('/tmp/input.wav', 'wb') as temp_input:
    params_set = False
    for audio_file in audio:
        with wave.open(audio_file, 'rb') as w:
            if not params_set:
                # Copy channels, sample width and frame rate from the first file
                temp_input.setparams(w.getparams())
                params_set = True
            temp_input.writeframes(w.readframes(w.getnframes()))
# Do speech recognition
with open('/tmp/input.wav', 'rb') as f:
    binary_audio = f.read()
ASR(binary_audio)
The problem is that I don't want to write the file '/tmp/input.wav' to disk. Is there a way to do this without writing any file to disk? Thanks.
Best answer
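The general solution for having a file without ever putting it on disk is a stream. For this we use the io library, which is the standard library for in-memory streams. You are even using BytesIO already, earlier in your code.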
import io
import wave

audio = [binary_wav1, binary_wav2, ..., binary_wavN]  # a list of .wav binary files coming from a socket
audio = [io.BytesIO(x) for x in audio]
# Join wav files
params_set = False
temp_file = io.BytesIO()
with wave.open(temp_file, 'wb') as temp_input:
    for audio_file in audio:
        with wave.open(audio_file, 'rb') as w:
            if not params_set:
                # Copy channels, sample width and frame rate from the first file
                temp_input.setparams(w.getparams())
                params_set = True
            temp_input.writeframes(w.readframes(w.getnframes()))
# The writer is closed at this point, so the WAV header is final.
# Move the cursor back to the beginning of the "file"
temp_file.seek(0)
# Do speech recognition
binary_audio = temp_file.read()
ASR(binary_audio)
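Note that I don't have any .wav files to try this with, so the code is untested. It relies on the wave library correctly handling the difference between real files and buffered streams.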
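One way to check this without any real recordings is to generate a couple of tiny WAV files in memory and join them the same way. The following is only a sketch: `make_wav_bytes` and `join_wavs` are hypothetical helper names, the test audio is plain silence, and it assumes every chunk shares the same channel count, sample width, and frame rate.

```python
import io
import wave


def make_wav_bytes(n_frames, framerate=16000):
    """Build a mono 16-bit WAV of silence entirely in memory (test data only)."""
    buf = io.BytesIO()
    with wave.open(buf, "wb") as w:
        w.setnchannels(1)
        w.setsampwidth(2)  # 2 bytes per sample = 16-bit audio
        w.setframerate(framerate)
        w.writeframes(b"\x00\x00" * n_frames)
    return buf.getvalue()


def join_wavs(chunks):
    """Concatenate WAV byte strings in memory; assumes they share one format."""
    out = io.BytesIO()
    with wave.open(out, "wb") as joined:
        for i, chunk in enumerate(chunks):
            with wave.open(io.BytesIO(chunk), "rb") as w:
                if i == 0:
                    # Take the audio parameters from the first chunk
                    joined.setparams(w.getparams())
                joined.writeframes(w.readframes(w.getnframes()))
    # The writer is closed here, so the header reflects all written frames
    return out.getvalue()


joined_bytes = join_wavs([make_wav_bytes(100), make_wav_bytes(250)])

# Re-open the result from memory to confirm it parses as a valid WAV file
with wave.open(io.BytesIO(joined_bytes), "rb") as check:
    total_frames = check.getnframes()
```

Here `getvalue()` returns the whole buffer regardless of the cursor position, so it can stand in for the `seek(0)` followed by `read()` used above. If the incoming chunks might differ in format, comparing `w.getparams()` against the first chunk before writing would be a reasonable guard.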