How would I use pyaudio to detect a sudden tapping sound from a live microphone?

Best answer

One way I've done it:

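  • read a block of samples at a time, say 0.05 seconds' worth
  • compute the RMS amplitude of the block (the square root of the mean of the squares of the individual samples)
  • if the block's RMS amplitude is greater than a threshold, it's a "noisy block"; otherwise it's a "quiet block"
  • a sudden tap would be a quiet block, followed by a small number of noisy blocks, followed by a quiet block
  • if you never get a quiet block, your threshold is too low
  • if you never get a noisy block, your threshold is too high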
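
The steps above can be sketched without any audio hardware. This is a minimal illustration, not the answer's own code: `block_rms`, `find_taps` and the sample levels are made-up names and numbers, and the blocks are assumed to be already normalized to +/- 1.0.

```python
import math

def block_rms(samples):
    """RMS of one block of samples already normalized to +/- 1.0."""
    return math.sqrt(sum(s * s for s in samples) / len(samples))

def find_taps(rms_values, threshold, max_tap_blocks=3):
    """Return the index of the first noisy block of each tap.

    A tap is a run of 1..max_tap_blocks noisy blocks that starts after a
    quiet block and ends with a quiet block. Longer runs count as
    sustained noise, not taps.
    """
    taps = []
    # Start as if we were already inside a long noise, so a stream that
    # begins mid-noise is not reported as a tap.
    noisy_run = max_tap_blocks + 1
    for i, rms in enumerate(rms_values):
        if rms > threshold:
            noisy_run += 1
        else:
            if 1 <= noisy_run <= max_tap_blocks:
                taps.append(i - noisy_run)
            noisy_run = 0
    return taps

# Hypothetical RMS values, one per 0.05 s block.
levels = [0.002, 0.003, 0.30, 0.12, 0.004, 0.003,  # short burst: a tap
          0.20, 0.25, 0.22, 0.21, 0.19, 0.003]     # long burst: not a tap
print(find_taps(levels, threshold=0.010))
```

If the returned list is always empty, check whether every block is on the same side of the threshold, which is the "too low" or "too high" case from the last two bullets.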

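  • My application was recording "interesting" noises unattended, so it would record as long as there were noisy blocks. It would multiply the threshold by 1.1 if there was a 15-second noisy period ("covering its ears") and by 0.9 if there was a 15-minute quiet period ("listening harder"). Your application will have different needs.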
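
A rough sketch of that self-adjusting threshold, assuming 0.05 s blocks. The class name `AdaptiveThreshold` and its parameters are hypothetical, and resetting the run counter after each adjustment is my own choice so that the threshold moves once per period rather than on every block.

```python
class AdaptiveThreshold:
    """Raise the threshold after a long noisy run, lower it after a long quiet run."""

    def __init__(self, threshold, noisy_limit, quiet_limit):
        self.threshold = threshold
        self.noisy_limit = noisy_limit  # consecutive noisy blocks before raising
        self.quiet_limit = quiet_limit  # consecutive quiet blocks before lowering
        self.noisy_run = 0
        self.quiet_run = 0

    def update(self, rms):
        """Classify one block, adapt the threshold, return True if noisy."""
        noisy = rms > self.threshold
        if noisy:
            self.quiet_run = 0
            self.noisy_run += 1
            if self.noisy_run >= self.noisy_limit:
                self.threshold *= 1.1  # "covering its ears"
                self.noisy_run = 0
        else:
            self.noisy_run = 0
            self.quiet_run += 1
            if self.quiet_run >= self.quiet_limit:
                self.threshold *= 0.9  # "listening harder"
                self.quiet_run = 0
        return noisy

BLOCK_TIME = 0.05  # seconds per block
adaptive = AdaptiveThreshold(
    threshold=0.010,
    noisy_limit=round(15 / BLOCK_TIME),       # 15 seconds of noise
    quiet_limit=round(15 * 60 / BLOCK_TIME),  # 15 minutes of quiet
)
for _ in range(adaptive.noisy_limit):
    adaptive.update(0.05)  # a constant loud level, made up for the demo
print(adaptive.threshold)  # slightly higher than the initial 0.010
```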

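    Also, I just noticed some comments in my code about observed RMS values. On the built-in mic of a Macbook Pro, with audio data normalized to +/- 1.0 and the input volume set to max, some data points:
  • 0.003-0.006 (-50dB to -44dB): an obnoxiously loud central heating fan in my house
  • 0.010-0.40 (-40dB to -8dB): typing on the same laptop
  • 0.10 (-20dB): snapping fingers softly at 1' distance
  • 0.60 (-4.4dB): snapping fingers loudly at 1'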
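
The dB figures in that list are consistent with the usual amplitude conversion, 20·log10(rms), where an RMS of 1.0 is 0 dB. A small sketch (the helper names `rms_to_db` and `db_to_rms` are mine) for moving between the two scales when picking a threshold:

```python
import math

def rms_to_db(rms):
    """Level in dB relative to full scale (rms 1.0 -> 0 dB).

    rms must be > 0; math.log10 raises ValueError for 0.
    """
    return 20.0 * math.log10(rms)

def db_to_rms(db):
    """Inverse conversion: dB relative to full scale -> normalized RMS."""
    return 10.0 ** (db / 20.0)

# The RMS readings quoted in the list above.
for rms in (0.003, 0.006, 0.010, 0.10, 0.40, 0.60):
    print("%.3f -> %.1f dB" % (rms, rms_to_db(rms)))
```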

  • Update: here's a sample to get you started.
    #!/usr/bin/python
    
    # open a microphone in pyAudio and listen for taps
    
    import pyaudio
    import struct
    import math
    
    INITIAL_TAP_THRESHOLD = 0.010
    FORMAT = pyaudio.paInt16
    SHORT_NORMALIZE = (1.0/32768.0)
    CHANNELS = 2
    RATE = 44100
    INPUT_BLOCK_TIME = 0.05
    INPUT_FRAMES_PER_BLOCK = int(RATE*INPUT_BLOCK_TIME)
    # if we get this many noisy blocks in a row, increase the threshold
    OVERSENSITIVE = 15.0/INPUT_BLOCK_TIME
    # if we get this many quiet blocks in a row, decrease the threshold
    UNDERSENSITIVE = 120.0/INPUT_BLOCK_TIME
    # if the noise was longer than this many blocks, it's not a 'tap'
    MAX_TAP_BLOCKS = 0.15/INPUT_BLOCK_TIME
    
    def get_rms( block ):
        # RMS amplitude is defined as the square root of the
        # mean over time of the square of the amplitude.
        # so we need to convert this string of bytes into
        # a string of 16-bit samples...
    
        # we will get one short out for each
        # two chars in the string.
        # integer division keeps the sample count an int on Python 3
        count = len(block) // 2
        fmt = "%dh" % count
        shorts = struct.unpack( fmt, block )
    
        # iterate over the block.
        sum_squares = 0.0
        for sample in shorts:
            # sample is a signed short in +/- 32768.
            # normalize it to 1.0
            n = sample * SHORT_NORMALIZE
            sum_squares += n*n
    
        return math.sqrt( sum_squares / count )
    
    class TapTester(object):
        def __init__(self):
            self.pa = pyaudio.PyAudio()
            self.stream = self.open_mic_stream()
            self.tap_threshold = INITIAL_TAP_THRESHOLD
            self.noisycount = MAX_TAP_BLOCKS+1
            self.quietcount = 0
            self.errorcount = 0
    
        def stop(self):
            # stop and close the stream, then release PortAudio
            self.stream.stop_stream()
            self.stream.close()
            self.pa.terminate()
    
        def find_input_device(self):
            device_index = None
            for i in range( self.pa.get_device_count() ):
                devinfo = self.pa.get_device_info_by_index(i)
                print( "Device %d: %s"%(i,devinfo["name"]) )
    
                for keyword in ["mic","input"]:
                    if keyword in devinfo["name"].lower():
                        print( "Found an input: device %d - %s"%(i,devinfo["name"]) )
                        device_index = i
                        return device_index
    
            if device_index is None:
                print( "No preferred input found; using default input device." )
    
            return device_index
    
        def open_mic_stream( self ):
            device_index = self.find_input_device()
    
            stream = self.pa.open(   format = FORMAT,
                                     channels = CHANNELS,
                                     rate = RATE,
                                     input = True,
                                     input_device_index = device_index,
                                     frames_per_buffer = INPUT_FRAMES_PER_BLOCK)
    
            return stream
    
        def tapDetected(self):
            print("Tap!")
    
        def listen(self):
            try:
                block = self.stream.read(INPUT_FRAMES_PER_BLOCK)
            except IOError as e:
                # dammit.
                self.errorcount += 1
                print( "(%d) Error recording: %s"%(self.errorcount,e) )
                self.noisycount = 1
                return
    
            amplitude = get_rms( block )
            if amplitude > self.tap_threshold:
                # noisy block
                self.quietcount = 0
                self.noisycount += 1
                if self.noisycount > OVERSENSITIVE:
                    # turn down the sensitivity
                    self.tap_threshold *= 1.1
            else:
                # quiet block.
    
                if 1 <= self.noisycount <= MAX_TAP_BLOCKS:
                    self.tapDetected()
                self.noisycount = 0
                self.quietcount += 1
                if self.quietcount > UNDERSENSITIVE:
                    # turn up the sensitivity
                    self.tap_threshold *= 0.9
    
    if __name__ == "__main__":
        tt = TapTester()
    
        for i in range(1000):
            tt.listen()
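
To check the RMS calculation without a microphone, you can feed it a synthetic block. This is only a sanity-check sketch: `get_rms` here is a compact copy of the function above, and `sine_block` is a hypothetical helper that packs a mono sine wave the same way pyaudio delivers `paInt16` data (native-endian signed 16-bit). A sine of amplitude A should give an RMS close to A/sqrt(2).

```python
import math
import struct

SHORT_NORMALIZE = 1.0 / 32768.0

def get_rms(block):
    """RMS of a block of native-endian signed 16-bit samples, normalized to +/- 1.0."""
    count = len(block) // 2
    shorts = struct.unpack("%dh" % count, block)
    sum_squares = sum((s * SHORT_NORMALIZE) ** 2 for s in shorts)
    return math.sqrt(sum_squares / count)

def sine_block(amplitude, freq=440.0, rate=44100, frames=2205):
    """Pack `frames` samples of a sine wave (amplitude in 0..1) as 16-bit shorts."""
    samples = [int(amplitude * 32767 * math.sin(2 * math.pi * freq * n / rate))
               for n in range(frames)]
    return struct.pack("%dh" % frames, *samples)

loud = get_rms(sine_block(0.5))    # expect roughly 0.5 / sqrt(2)
silent = get_rms(sine_block(0.0))  # all-zero block
print(loud, silent)
```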
    

    Regarding "python - Detect tap with pyaudio from live mic", a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/4160175/
