This article describes how to move data from a database to Azure Blob storage, and should be a useful reference for anyone facing the same problem.
Problem Description
I'm able to use dask.dataframe.read_sql_table to read the data, e.g. df = dd.read_sql_table(table='TABLE', uri=uri, index_col='field', npartitions=N).
What would be the next (best) steps to save it as a Parquet file in Azure Blob storage?
From my brief research, there are a couple of options:
- Save locally and use https://docs.microsoft.com/en-us/azure/storage/common/storage-use-azcopy-blobs?toc=/azure/storage/blobs/toc.json (not great for big data)
- I believe adlfs is for reading from blob storage (see the sketch after this list)
- Use dask.dataframe.to_parquet and work out how to point to the blob container
- The intake project (not sure where to start)
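For what it's worth, adlfs supports writing as well as reading: it registers the abfs:// (and az://) protocol with fsspec, so fsspec-aware libraries such as Dask can target Blob storage directly. A minimal connectivity check, where the container name 'my-container' and the ACCOUNT_NAME/ACCOUNT_KEY credentials are placeholders, might look like this:

import fsspec

# adlfs provides AzureBlobFileSystem behind the 'abfs' protocol;
# the account name, key and container below are placeholders.
fs = fsspec.filesystem(
    "abfs",
    account_name="ACCOUNT_NAME",
    account_key="ACCOUNT_KEY",
)

# List a container to confirm the credentials work, then write a tiny test file.
print(fs.ls("my-container"))
with fs.open("my-container/test.txt", "w") as f:
    f.write("hello from adlfs")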
Recommended Answer
$ pip install adlfs
import dask.dataframe as dd

# Write the Dask dataframe to Azure Blob storage through the 'abfs://'
# protocol registered by adlfs ({BLOB} is the container and {FILE_NAME}
# the output name; both are placeholders).
dd.to_parquet(
    df=df,
    path='abfs://{BLOB}/{FILE_NAME}.parquet',
    storage_options={'account_name': 'ACCOUNT_NAME',
                     'account_key': 'ACCOUNT_KEY'},
)
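As a sanity check, the same storage_options can be passed to dask.dataframe.read_parquet to read the dataset back from Blob storage; the container and file names below are the same placeholders used above:

import dask.dataframe as dd

# Read the Parquet dataset back from the container to verify the round trip.
df_check = dd.read_parquet(
    'abfs://{BLOB}/{FILE_NAME}.parquet',
    storage_options={'account_name': 'ACCOUNT_NAME',
                     'account_key': 'ACCOUNT_KEY'},
)
print(df_check.head())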
This concludes the article on moving data from a database to Azure Blob storage. Hopefully the recommended answer is helpful.