How to split a WAV file into 10 ms segments
Question
I am trying to divide the data I retrieve from a WAV file into 10 ms segments for dynamic time warping.
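import contextlib
import wave

import numpy as np
from scipy.io import wavfile  # assumed source of `wavfile`; the original omits this import

# file_path is the path to the WAV file, defined elsewhere
data = np.zeros((1, 7000))
rate, wav_data = wavfile.read(file_path)
with contextlib.closing(wave.open(file_path, 'r')) as f:
    frames = f.getnframes()
    rate = f.getframerate()
    duration = frames / float(rate)  # length of the recording in seconds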
Is there an existing library that does this?
Thanks.
Recommended answer
If you're interested in post-processing the data, you'll probably be working with it as NumPy data.
>>> import wave
>>> import numpy as np
>>> f = wave.open('911.wav', 'r')
>>> data = f.readframes(f.getnframes())
>>> data[:10] # just to show it is a string of bytes
'"5AMj\x88\x97\xa6\xc0\xc9'
>>> numeric_data = np.frombuffer(data, dtype=np.uint8) # np.fromstring is deprecated for binary data
>>> numeric_data
array([ 34, 53, 65, ..., 128, 128, 128], dtype=uint8)
>>> 10e-3*f.getframerate() # how many frames per 10ms?
110.25
That's not an integer, so unless you're going to interpolate your data, you'll have to settle for segments of 110 frames (about 10 ms at this frame rate) and pad the end of the data with zeros so that its length is a multiple of 110.
>>> numeric_data.shape, f.getnframes() # there are just as many samples in the numpy array as there were frames
((186816,), 186816)
>>> padding_length = -numeric_data.shape[0] % 110 # 0 if the length is already a multiple of 110
>>> padded = np.hstack((numeric_data, np.zeros(padding_length)))
>>> segments = padded.reshape(-1, 110)
>>> segments
array([[ 34., 53., 65., ..., 216., 222., 228.],
[ 230., 227., 224., ..., 72., 61., 45.],
[ 34., 33., 32., ..., 147., 158., 176.],
...,
[ 128., 128., 128., ..., 128., 128., 128.],
[ 127., 128., 128., ..., 128., 129., 129.],
[ 129., 129., 128., ..., 0., 0., 0.]])
>>> segments.shape
(1699, 110)
So now every row of the segments array is about 10 ms long.
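The steps above can be collected into a single function. What follows is only a minimal sketch, not a definitive implementation: the name split_wav, the SAMPLE_DTYPES table and the segment_ms parameter are hypothetical and do not come from the answer. Unlike the session above, which treats the file as 8-bit mono, the sketch picks the dtype from the file's sample width and keeps the channels in a separate axis. It assumes uncompressed PCM with 1, 2 or 4 bytes per sample. The demo builds a silent in-memory WAV with the same frame count and frame rate as the example file, so it runs without any file on disk.

```python
import io
import wave

import numpy as np

# Assumed mapping from WAV sample width in bytes to a NumPy dtype:
# 8-bit PCM is unsigned, 16- and 32-bit PCM are signed little-endian.
SAMPLE_DTYPES = {1: np.uint8, 2: "<i2", 4: "<i4"}


def split_wav(source, segment_ms=10):
    """Split a WAV file (path or file-like object) into fixed-length segments.

    Returns (segments, frame_rate), where segments has shape
    (n_segments, frames_per_segment, n_channels).
    """
    with wave.open(source, "rb") as f:
        rate = f.getframerate()
        channels = f.getnchannels()
        dtype = SAMPLE_DTYPES[f.getsampwidth()]  # KeyError for e.g. 24-bit files
        raw = f.readframes(f.getnframes())

    # One row per frame, one column per channel.
    samples = np.frombuffer(raw, dtype=dtype).reshape(-1, channels)

    seg_len = int(rate * segment_ms / 1000)  # 110 frames at 11025 Hz
    pad = -len(samples) % seg_len            # 0 if already a multiple of seg_len

    # Zero-pad the end only; np.pad keeps the integer dtype.
    padded = np.pad(samples, ((0, pad), (0, 0)), mode="constant")
    return padded.reshape(-1, seg_len, channels), rate


# Demo: a silent 8-bit mono WAV with 186816 frames at 11025 Hz, built in memory.
buf = io.BytesIO()
with wave.open(buf, "wb") as w:
    w.setnchannels(1)
    w.setsampwidth(1)
    w.setframerate(11025)
    w.writeframes(bytes([128]) * 186816)
buf.seek(0)

segments, rate = split_wav(buf)
print(segments.shape)  # (1699, 110, 1)
```

In 8-bit unsigned audio the value 128 represents silence, so for such files padding with 128 instead of 0 (for example via the constant_values argument of np.pad) may be the better choice, since it avoids an artificial jump at the end of the last segment.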