Web Audio API - live stream "clicks" between chunks

Question

I am trying to stream audio through a websocket on a node.js (express) server to a web browser. The audio comes from an iOS device as 16-bit, mono wav files sampled at 4 kHz (4000 samples per second).

Here is my code:

Server code:

webSocketServer.on('connection', function connection(client) {
  client.on('message', function (message) {
    // Relay each incoming wav chunk to every connected client as binary
    webSocketServer.clients.forEach(function each(connection) {
      connection.send(message, { binary: true });
    });
  });
});

Client code:

// Assumed setup (not shown in the original snippet)
var audioContext = new AudioContext();
var time = 0; // start time of the next chunk, in seconds on the context clock

var webSocket = new WebSocket('ws://' + window.location.hostname + ':8080/');
webSocket.binaryType = 'arraybuffer';
webSocket.onmessage = function (message) {
  var arrayBuffer = message.data; // wav from server, as arraybuffer
  var source = audioContext.createBufferSource();
  audioContext.decodeAudioData(arrayBuffer, function (buffer) {
    source.buffer = buffer;
    source.connect(audioContext.destination);
    source.start(time); // queue this chunk right after the previous one
    time += source.buffer.duration;
  }, function () {
    console.log('error');
  });
};

decodeAudioData() appears to be working; however, the audio buffer it returns is half the length I was expecting (e.g. 4000 samples give me only 0.5 seconds of audio). I originally thought this was because the wav is 16-bit and not 32-bit, but switching to 32-bit caused decodeAudioData() to trigger its error callback.

I figured this workaround could be added to the success callback:

source.playbackRate.value = 0.5 // play at half speed
time += source.buffer.duration * 2 // double duration

This gets the timing to work perfectly, but I am left with one problem: there is an audible 'click' or 'pop' between audio chunks. After spacing the chunks one second apart (time += (source.buffer.duration * 2) + 1), I found that the click happens at the very beginning of each chunk.

So my two main head-scratchers are:

1) Why is the decoded audio playing at twice the speed I am expecting? Is my sampling rate of 4k too low for the Web Audio API? Why can't I decode 32-bit wavs?

2) I have some experience with digital audio workstations (Ableton, Logic), and I know that clicking sounds can arise if a wave 'jumps' from a sample value straight down to zero or vice versa (i.e. starting or ending a sine wave mid-cycle). Is that what's going on here? Is there a way to get around this? Crossfading each individual sample seems silly. Why doesn't each chunk pick up where the last one left off?

Recommended answer

1) The audio I was receiving was actually sampled at 2k by mistake, but the wav header still said 4k, hence the double-speed playback.
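A mismatch like this can be spotted by reading the header directly. The following is a minimal sketch, assuming a canonical 44-byte PCM wav header (a "fmt " chunk followed immediately by the "data" chunk, with no extra chunks); `readWavHeader` and `makeMono16Wav` are hypothetical helper names, not part of the original code.

```javascript
// Sketch: read the declared format from a canonical 44-byte PCM wav header.
// Assumption: "RIFF....WAVEfmt " is followed directly by the "data" chunk.
function readWavHeader(arrayBuffer) {
  const view = new DataView(arrayBuffer);
  const channels = view.getUint16(22, true);       // little-endian fields
  const sampleRate = view.getUint32(24, true);
  const bitsPerSample = view.getUint16(34, true);
  const dataBytes = view.getUint32(40, true);
  const frames = dataBytes / (channels * (bitsPerSample / 8));
  return { channels, sampleRate, bitsPerSample, frames,
           declaredDuration: frames / sampleRate };
}

// Hypothetical helper: build a silent 16-bit mono wav for demonstration.
function makeMono16Wav(sampleRate, frames) {
  const dataBytes = frames * 2;
  const buffer = new ArrayBuffer(44 + dataBytes);
  const view = new DataView(buffer);
  const writeTag = (offset, tag) => {
    for (let i = 0; i < tag.length; i++) {
      view.setUint8(offset + i, tag.charCodeAt(i));
    }
  };
  writeTag(0, 'RIFF');
  view.setUint32(4, 36 + dataBytes, true);
  writeTag(8, 'WAVE');
  writeTag(12, 'fmt ');
  view.setUint32(16, 16, true);             // fmt chunk size
  view.setUint16(20, 1, true);              // format tag 1 = PCM
  view.setUint16(22, 1, true);              // mono
  view.setUint32(24, sampleRate, true);
  view.setUint32(28, sampleRate * 2, true); // byte rate
  view.setUint16(32, 2, true);              // block align
  view.setUint16(34, 16, true);             // bits per sample
  writeTag(36, 'data');
  view.setUint32(40, dataBytes, true);
  return buffer;
}

// One real second recorded at 2000 Hz, but labelled 4000 Hz in the header.
const info = readWavHeader(makeMono16Wav(4000, 2000));
console.log(info.sampleRate, info.frames, info.declaredDuration); // 4000 2000 0.5
```

Comparing `declaredDuration` with the known capture time of each chunk (one second here) shows the factor-of-two discrepancy before the data ever reaches decodeAudioData().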

2) See the last paragraph of Chris Wilson's answer:

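Lastly - this will not work well if the sound stream does not match the default audio device's sample rate; there will always be clicks, because decodeAudioData will resample to the device rate, which will not have a perfect duration. It will work, but there will likely be artifacts such as clicks at the chunk boundaries. You need a feature that has not yet been specified or implemented - selectable AudioContext sample rates - to fix this.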
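The duration mismatch can be illustrated with plain arithmetic. This is only a sketch: `resampledFrames` is a hypothetical helper, and rounding to the nearest whole frame is an assumption, since how a given browser's decoder sizes the resampled buffer is implementation-dependent.

```javascript
// Sketch: frames needed after resampling a chunk to the device rate, and the
// timing error introduced by storing a whole number of frames.
// Assumption: the decoder rounds to the nearest frame (implementation-dependent).
function resampledFrames(sourceFrames, sourceRate, deviceRate) {
  const exact = sourceFrames * deviceRate / sourceRate;
  const rounded = Math.round(exact);
  return { exact, rounded, errorSeconds: (rounded - exact) / deviceRate };
}

// 4000 Hz to 48000 Hz is an integer ratio, so every chunk length maps exactly.
const clean = resampledFrames(1000, 4000, 48000);
console.log(clean.exact, clean.rounded);

// 4000 Hz to 44100 Hz is not: a 1001-frame chunk needs a fractional frame.
const drifting = resampledFrames(1001, 4000, 44100);
console.log(drifting.exact, drifting.rounded);
```

In the second case the buffer is a fraction of a frame shorter than the audio it represents, so consecutive chunks scheduled by `buffer.duration` no longer line up sample for sample at the boundary.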

Brion Vibber's AudioFeeder.js works great without any clicks, but requires raw 32-bit PCM data. Also be wary of upsampling artifacts!
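Since the source audio here is 16-bit, it has to be converted before it can be fed to a player that expects 32-bit samples. A minimal sketch of that conversion follows, assuming "32-bit" means floating-point samples in the range [-1, 1); `int16ToFloat32` is a hypothetical helper, and the hand-off to AudioFeeder.js itself is deliberately left out.

```javascript
// Sketch: convert signed 16-bit PCM samples to 32-bit floats in [-1, 1).
// Assumption: the player expects Float32 samples; check the library's docs.
function int16ToFloat32(int16Samples) {
  const out = new Float32Array(int16Samples.length);
  for (let i = 0; i < int16Samples.length; i++) {
    out[i] = int16Samples[i] / 32768; // 32768 = 2^15, the int16 magnitude limit
  }
  return out;
}

const floats = int16ToFloat32(new Int16Array([0, 16384, -32768, 32767]));
console.log(floats[0], floats[1], floats[2]); // 0 0.5 -1
```

With a canonical wav chunk, the Int16 samples would start after the 44-byte header; that offset is an assumption that holds only when the file carries no extra chunks.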

