SSIS: sending OLE DB source data to an S3 bucket as a Parquet file

Question

My source is SQL Server and I am using SSIS to export data to S3 buckets, but my requirement now is to send the files in Parquet format.

Can you give me some pointers on how to achieve this?

Thanks, Ven

Recommended answer

For folks stumbling on this answer, Apache Parquet is a project that specifies a columnar file format used by Hadoop and other Apache projects.

Unless you find a custom component or write some .NET code to do it, you're not going to be able to export data from SQL Server directly to a Parquet file. KingswaySoft's SSIS Big Data Components might offer one such custom component, but I'm not familiar with it.

If you were exporting to Azure, you'd have two options:

  1. Use the Flexible File Destination component (part of the Azure Feature Pack), which exports to a Parquet file hosted in Azure Blob or Data Lake Gen2 storage.

  2. Leverage PolyBase, a SQL Server feature. It lets you export to a Parquet file via the external table feature. However, that file has to be hosted in one of the locations PolyBase supports. Unfortunately, S3 isn't an option.

If it were me, I'd move the data to S3 as a CSV file and then use Athena to convert the CSV file to Parquet. There is a nifty article that talks through the Athena piece:

https://www.cloudforecast.io/blog/Athena-to-transform-CSV-to-Parquet/
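To make the Athena step a bit more concrete, here is a minimal Python sketch of what the conversion could look like once SSIS has dropped the CSV files in S3. It is not taken from the linked article. It builds two statements: an external table over the CSV files, and a CTAS (CREATE TABLE AS SELECT) that rewrites the data as Parquet. The table names, column list, bucket paths, and helper function names are all hypothetical, and the DDL assumes plain comma-delimited files with a single header row and no quoted fields.

```python
def csv_table_ddl(table, columns, csv_location):
    """Build Athena DDL for an external table over CSV files already in S3.

    Assumes simple comma-delimited files with one header row and no
    quoted fields (hypothetical layout; adjust to the real export).
    """
    cols = ",\n  ".join(f"`{name}` {sql_type}" for name, sql_type in columns)
    return (
        f"CREATE EXTERNAL TABLE IF NOT EXISTS {table} (\n  {cols}\n)\n"
        "ROW FORMAT DELIMITED FIELDS TERMINATED BY ','\n"
        f"LOCATION '{csv_location}'\n"
        "TBLPROPERTIES ('skip.header.line.count' = '1')"
    )


def parquet_ctas(source_table, target_table, parquet_location):
    """Build an Athena CTAS statement that rewrites a table as Parquet.

    The external_location prefix must be empty when the query runs.
    """
    return (
        f"CREATE TABLE {target_table}\n"
        f"WITH (format = 'PARQUET', external_location = '{parquet_location}')\n"
        f"AS SELECT * FROM {source_table}"
    )


def run_in_athena(sql, database, results_location):
    """Submit one statement to Athena and return its query execution id.

    boto3 is imported lazily so the SQL builders above work without it.
    AWS credentials and region are assumed to be configured already.
    """
    import boto3

    client = boto3.client("athena")
    response = client.start_query_execution(
        QueryString=sql,
        QueryExecutionContext={"Database": database},
        ResultConfiguration={"OutputLocation": results_location},
    )
    return response["QueryExecutionId"]


if __name__ == "__main__":
    # Hypothetical names and paths, for illustration only.
    ddl = csv_table_ddl(
        "orders_csv",
        [("order_id", "int"), ("customer", "string"), ("amount", "double")],
        "s3://example-bucket/csv/orders/",
    )
    ctas = parquet_ctas(
        "orders_csv", "orders_parquet", "s3://example-bucket/parquet/orders/"
    )
    print(ddl)
    print(ctas)
```

Running the script only prints the two statements. To execute them you would pass each one to run_in_athena (or paste it into the Athena console) and wait for the first query to finish before submitting the second.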

Net-net, you'll need to spend a little money, get creative, switch to Azure, or do the conversion in AWS.


09-01 20:03