This article describes how to store files delivered by Kinesis Firehose in S3 under custom directory names, which may be a useful reference for anyone facing the same problem.

Problem description

We primarily do bulk transfers of incoming clickstream data through the Kinesis Firehose service. Our system is a multi-tenant SaaS platform. The incoming clickstream data is stored in S3 through Firehose. By default, all the files are stored under directories named per a given date format. I would like to specify the directory path for the data files, either in the Firehose console or through the API, in order to segregate the customer data.

For example, this is the directory structure I would like to have in S3 for customers A, B and C:

/A/2017/10/12/

/B/2017/10/12/

/C/2017/10/12/

How can I achieve this?

Recommended answer

You can separate your directories by configuring the S3 prefix. In the console, this is done during setup, when you set the S3 bucket name.

Using the CLI, you set the prefix in the --s3-destination-configuration option of create-delivery-stream, as documented here:

http://docs.aws.amazon.com/cli/latest/reference/firehose/create-delivery-stream.html
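As a minimal sketch (the stream name, bucket ARN, and IAM role ARN below are placeholders, not values from the original question), a delivery stream whose objects land under the A/ prefix could be created like this:

# Create a delivery stream that writes under the A/ prefix (placeholder ARNs).
aws firehose create-delivery-stream \
    --delivery-stream-name clickstream-customer-A \
    --s3-destination-configuration '{
        "RoleARN": "arn:aws:iam::123456789012:role/firehose-s3-delivery-role",
        "BucketARN": "arn:aws:s3:::my-clickstream-bucket",
        "Prefix": "A/"
    }'

Firehose then appends its default UTC date path (YYYY/MM/DD/HH/) after the prefix, which produces object keys of the form A/2017/10/12/... matching the layout shown above.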

Note, however, that you can only set one prefix per Firehose delivery stream, so if you pass all of your clickstream data through a single delivery stream you will not be able to send the records to different prefixes.
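One possible workaround, sketched here under the assumption that each customer gets its own dedicated delivery stream (names and ARNs are again placeholders), is to create one stream per tenant, each with its own prefix:

# Create one delivery stream per customer, each writing under its own S3 prefix.
for customer in A B C; do
    aws firehose create-delivery-stream \
        --delivery-stream-name "clickstream-customer-${customer}" \
        --s3-destination-configuration "{
            \"RoleARN\": \"arn:aws:iam::123456789012:role/firehose-s3-delivery-role\",
            \"BucketARN\": \"arn:aws:s3:::my-clickstream-bucket\",
            \"Prefix\": \"${customer}/\"
        }"
done

The producer side then has to route each tenant's records to the matching stream (e.g. by delivery stream name) before calling PutRecord or PutRecordBatch.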

This concludes the article on storing Firehose-delivered files in S3 under custom directory names; hopefully the answer above is helpful.
