Problem description
I want to write a data frame column in .ann format to S3.
Right now I am using the following code to do that:
df['user_input'].to_csv(ann_file_path, header=None, index=None, sep=' ')
Here, ann_file_path is the full path of the .ann file on the server.
I am getting the following error:
[Errno 22] Invalid argument: 'https://s3-eu-west-1.amazonaws.com/bucket/sub_folder/somefile.ann'
Why am I getting that?
Also, do I need to use Boto3 for the write, or can I write the file directly to S3 with the full path?
I imagine some authorization might be required for that, but the error message does not look like an authorization problem.
Recommended answer
I've resolved it. The [Errno 22] most likely occurred because to_csv treated the https URL as a local file path instead of an S3 target. To write to S3 we need to authenticate with AWS using the access_key_id and secret_key.
Take the path starting from the bucket name (not the https://... prefix), i.e. strip off everything before it.
My URL: https://s3-eu-west-1.amazonaws.com/bucket/sub_folder/somefile.ann
was converted to: bucket/sub_folder/somefile.ann
The code to do this: ann_file_path = ann_file_path.split('.com/', 1)[1]
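A quick sketch of that conversion with the URL above (the variable names here are just for illustration):

url = 'https://s3-eu-west-1.amazonaws.com/bucket/sub_folder/somefile.ann'
ann_file_path = url.split('.com/', 1)[1]  # keep everything after ".com/"
# ann_file_path is now 'bucket/sub_folder/somefile.ann'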
Once I had ann_file_path, I used the s3fs Python library to upload the .ann file to S3.
import s3fs

# Serialize the column to CSV in memory and encode it to bytes.
bytes_to_write = df['user_input'].to_csv(header=None, index=None).encode()

# Authenticate against S3 and write the bytes to the key in ann_file_path.
fs = s3fs.S3FileSystem(key=settings.AWS_ACCESS_KEY_ID, secret=settings.AWS_SECRET_ACCESS_KEY)
with fs.open(ann_file_path, 'wb') as f:
    f.write(bytes_to_write)
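For reference, since the question asked about Boto3: below is a minimal sketch of two alternative routes, assuming the same bucket/key and the credentials from settings used above. The first relies on pandas' own S3 support (storage_options needs pandas >= 1.2 with s3fs installed); the second uses boto3 directly.

# Sketch 1 (assumes pandas >= 1.2 and s3fs): let pandas write straight to an s3:// URL.
df['user_input'].to_csv(
    's3://bucket/sub_folder/somefile.ann',
    header=None, index=None, sep=' ',
    storage_options={'key': settings.AWS_ACCESS_KEY_ID,
                     'secret': settings.AWS_SECRET_ACCESS_KEY},
)

# Sketch 2 (assumes boto3 is installed): upload the same bytes with put_object.
import boto3
s3 = boto3.client('s3',
                  aws_access_key_id=settings.AWS_ACCESS_KEY_ID,
                  aws_secret_access_key=settings.AWS_SECRET_ACCESS_KEY)
s3.put_object(Bucket='bucket', Key='sub_folder/somefile.ann', Body=bytes_to_write)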