Problem description

From the documentation for AWS::Athena::NamedQuery, it is unclear how to attach Athena to an S3 bucket specified in the same stack.

If I had to guess from the example, I would imagine that you can write a template like this:

Resources:
  MyS3Bucket:
    Type: AWS::S3::Bucket
    # ... other params ...

  AthenaNamedQuery:
    Type: AWS::Athena::NamedQuery
    Properties:
      Database: "db_name"
      Name: "MostExpensiveWorkflow"
      QueryString: >
        CREATE EXTERNAL TABLE db_name.test_table
        (...) LOCATION 's3://.../path/to/folder/'

Would a template like the above work? Upon stack creation, will the table db_name.test_table be available to run queries on?

Recommended answer

It turns out that the way to connect S3 and Athena is to create a Glue table! How silly of me! Of course Glue is how you connect things!

Sarcasm aside, this is a template that worked for me, using AWS::Glue::Table and AWS::Glue::Database:

Resources:
  MyS3Bucket:
    Type: AWS::S3::Bucket

  MyGlueDatabase:
    Type: AWS::Glue::Database
    Properties:
      DatabaseInput:
        Name: my-glue-database
        Description: "Glue beats tape"
      CatalogId: !Ref AWS::AccountId

  MyGlueTable:
    Type: AWS::Glue::Table
    Properties:
      DatabaseName: !Ref MyGlueDatabase
      CatalogId: !Ref AWS::AccountId
      TableInput:
        Name: my-glue-table
        Parameters: { "classification" : "csv" }
        StorageDescriptor:
          Location:
            Fn::Sub: "s3://${MyS3Bucket}/"
          InputFormat: "org.apache.hadoop.mapred.TextInputFormat"
          OutputFormat: "org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat"
          SerdeInfo:
            Parameters: { "separatorChar" : "," }
            SerializationLibrary: "org.apache.hadoop.hive.serde2.OpenCSVSerde"
          StoredAsSubDirectories: false
          Columns:
            - Name: column0
              Type: string
            - Name: column1
              Type: string

After this, both the database and the table showed up in the AWS Athena console!
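
To tie this back to the original question: as far as I know, AWS::Athena::NamedQuery only stores a saved query and does not execute it when the stack is created, which is why the table itself has to come from Glue. If you still want a saved query in the same stack, a minimal sketch might look like the fragment below. The resource name SampleNamedQuery, the query name and the query text are hypothetical; the fragment assumes it is merged under the Resources section of the template above, and it relies on Ref returning the database name and table name for the two Glue resources.

```yaml
Resources:
  # Hypothetical addition: merge this resource into the Resources
  # section of the template above (it references MyGlueDatabase and
  # MyGlueTable, which are defined there).
  SampleNamedQuery:
    Type: AWS::Athena::NamedQuery
    Properties:
      # Ref on the Glue database resolves to its name (my-glue-database).
      Database: !Ref MyGlueDatabase
      Name: "PreviewMyGlueTable"
      Description: "Saved query that previews the Glue table"
      # Names containing hyphens are double-quoted in Athena SELECT statements.
      QueryString: !Sub 'SELECT column0, column1 FROM "${MyGlueDatabase}"."${MyGlueTable}" LIMIT 10'
```

The query is saved for later use in the Athena console; it still has to be run by hand or through the API, and Athena needs a query result location configured before it can run.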

09-26 22:14