Problem description
I'm trying to write a Python script for processing audio data stored on S3.
I have an S3 object which I fetch using:
def grabAudio(filename, directory):
    obj = s3client.get_object(Bucket=bucketname, Key=directory+'/'+filename)
    return obj['Body'].read()
Accessing the data with
print(obj['Body'].read())
yields the correct audio information, so it's accessing the data from the bucket just fine.
When I try to then use this data in my audio processing library (pydub), it fails:
audio = AudioSegment.from_wav(grabAudio(filename, bucketname))
Traceback (most recent call last):
  File "split_audio.py", line 38, in <module>
    audio = AudioSegment.from_wav(grabAudio(filename, bucketname))
  File "C:\Users\jmk_m\AppData\Local\Programs\Python\Python36-32\lib\site-packages\pydub\audio_segment.py", line 544, in from_wav
    return cls.from_file(file, 'wav', parameters)
  File "C:\Users\jmk_m\AppData\Local\Programs\Python\Python36-32\lib\site-packages\pydub\audio_segment.py", line 456, in from_file
    file.seek(0)
AttributeError: 'bytes' object has no attribute 'seek'
What is the format of the object coming in from S3? A byte array, I presume? If so, is there a way of parsing it as a .wav without having to save it to disk? I'm trying to avoid saving to disk.
Also open to alternative audio processing libraries.
Recommended answer
Thanks to Linas for linking a similar issue, and Jiaaro for the answer.
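import io
from pydub import AudioSegment

# In the linked answer, y['data'] appears to hold the raw audio bytes and x the export target.
s = io.BytesIO(y['data'])
AudioSegment.from_file(s).export(x, format='mp3')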
This lets me pull the object directly from the bucket into memory with:
obj = s3client.get_object(Bucket=bucketname, Key=customername+'/'+filename)
data = io.BytesIO(obj['Body'].read())
audio = AudioSegment.from_file(data)
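For anyone who wants to try the idea without S3 credentials or pydub installed, here is a minimal sketch of the same pattern using only the standard library. The original error came from passing raw bytes where a seekable file-like object was expected; wrapping the bytes in io.BytesIO supplies seek() and read(). Note the assumptions: FakeS3Client, make_wav_bytes, grab_audio, and the bucket and key names are all hypothetical stand-ins invented for this illustration, and the standard wave module is swapped in for pydub as the reader.

```python
import io
import wave


def make_wav_bytes(n_frames=8000, framerate=8000):
    """Build a short silent mono 16-bit WAV in memory (hypothetical stand-in for S3 data)."""
    buf = io.BytesIO()
    with wave.open(buf, "wb") as w:
        w.setnchannels(1)
        w.setsampwidth(2)          # 2 bytes per sample = 16-bit audio
        w.setframerate(framerate)
        w.writeframes(b"\x00\x00" * n_frames)
    return buf.getvalue()


class FakeS3Client:
    """Hypothetical stand-in for a boto3 S3 client; it only mimics get_object()."""

    def __init__(self, objects):
        self._objects = objects

    def get_object(self, Bucket, Key):
        # A real client returns a streaming body; anything with .read() works here.
        return {"Body": io.BytesIO(self._objects[(Bucket, Key)])}


def grab_audio(client, bucket, key):
    """Fetch an object and return its contents as a seekable in-memory file."""
    obj = client.get_object(Bucket=bucket, Key=key)
    return io.BytesIO(obj["Body"].read())


# Assumed bucket and key names, for illustration only.
client = FakeS3Client({("my-bucket", "customer/clip.wav"): make_wav_bytes()})
data = grab_audio(client, "my-bucket", "customer/clip.wav")

# The buffer can be handed to any reader that accepts a file-like object.
with wave.open(data, "rb") as w:
    n_channels = w.getnchannels()
    framerate = w.getframerate()
    duration_s = w.getnframes() / framerate

print(n_channels, framerate, duration_s)
```

With a real boto3 client, the same grab_audio helper should work unchanged, and the returned buffer can be passed to AudioSegment.from_file as in the snippet above.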