This article explains how to store files delivered by Firehose under custom directory names in S3. It may be a useful reference for anyone dealing with the same problem.

Problem Description

We primarily do bulk transfer of incoming click stream data through the Kinesis Firehose service. Our system is a multi-tenant SaaS platform. The incoming click stream data are stored in S3 through Firehose. By default, all the files are stored under directories named per a given date format. I would like to specify, through the API, the directory path for the data files delivered by Firehose in order to segregate the customer data.

For example, the directory structure that I would like to have in S3 for customers A, B and C:

/A/2017/10/12/

/B/2017/10/12/

/C/2017/10/12/

How can I do this?

Recommended Answer

You can separate your directories by configuring the S3 prefix. In the console, this is done during setup when you set the S3 bucket name.

Using the CLI, you set the prefix in the --s3-destination-configuration as shown here:

http://docs.aws.amazon.com/cli/latest/reference/firehose/create-delivery-stream.html
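
As a minimal sketch of that call (the stream name, bucket ARN and IAM role ARN below are hypothetical placeholders, not values from the question), a delivery stream whose objects land under /A/2017/10/12/ could be created like this; Firehose appends the date-based YYYY/MM/DD/HH path after whatever prefix you configure:

# hypothetical names and ARNs, for illustration only
aws firehose create-delivery-stream \
    --delivery-stream-name clickstream-customer-A \
    --s3-destination-configuration '{
        "RoleARN": "arn:aws:iam::123456789012:role/firehose-s3-delivery-role",
        "BucketARN": "arn:aws:s3:::my-clickstream-bucket",
        "Prefix": "A/"
    }'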

Note, however, that you can only set one prefix per Firehose Delivery Stream, so if you are passing all of your clickstream data through one Firehose Delivery Stream you will not be able to send the records to different prefixes.
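
If per-customer prefixes are still required, one option (not part of the original answer, and sketched here with hypothetical names and ARNs) is to run a separate delivery stream per customer, each configured with its own prefix, and have the producer write every record to the stream that matches its customer:

# create one delivery stream per customer, each with its own S3 prefix
for customer in A B C; do
    aws firehose create-delivery-stream \
        --delivery-stream-name "clickstream-${customer}" \
        --s3-destination-configuration "{
            \"RoleARN\": \"arn:aws:iam::123456789012:role/firehose-s3-delivery-role\",
            \"BucketARN\": \"arn:aws:s3:::my-clickstream-bucket\",
            \"Prefix\": \"${customer}/\"
        }"
done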

This concludes the article on storing files delivered by Firehose under custom directory names in S3. We hope the recommended answer is helpful.
