This article covers how to specify the security-configuration field for an EmrCluster in AWS Data Pipeline.

Problem description

I created an AWS EMR cluster through the regular EMR cluster wizard in the AWS Management Console, and I was able to select a security configuration. When you export the equivalent CLI command, it appears as --security-configuration 'mySecurityConfigurationValue'.

I now need to create a similar EMR cluster through AWS Data Pipeline, but I don't see any option for specifying this security-configuration field.

The only similar fields I see are EmrManagedSlaveSecurityGroup, EmrManagedMasterSecurityGroup, AdditionalSlaveSecurityGroups, AdditionalMasterSecurityGroups, and SubnetId. I already have all of those filled out in my pipeline configuration; I just need to specify the security configuration as well. Any thoughts?

Recommended answer

Unfortunately, Data Pipeline does not support the Security Configurations feature (or other features introduced in the EMR 5.x releases, such as using a custom AMI).

One workaround is to:

  1. Replace the EmrCluster in your pipeline with an EC2 resource.
  2. Use a ShellCommandActivity on the EC2 resource to run the aws emr create-cluster CLI command.
  3. Use a bootstrap step to install TaskRunner on the cluster.
  4. Replace all the runsOn properties in your pipeline with workerGroup, so that the tasks run on the EMR cluster you created in step 2.
  5. Add a final ShellCommandActivity at the end of the pipeline to terminate the cluster using the CLI.
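
The CLI calls in steps 2 and 5 might look roughly like the following. This is a minimal sketch, not a tested deployment: it only builds the argument lists in Python so they can be inspected or pasted into a ShellCommandActivity command. The cluster name, release label, subnet ID, instance settings, and the S3 path of the bootstrap script that installs TaskRunner are all hypothetical placeholders; only the security configuration name comes from the question.

```python
import shlex


def build_create_cluster_command(name, release_label, security_configuration,
                                 subnet_id, bootstrap_script_s3_path,
                                 instance_type="m5.xlarge", instance_count=3):
    """Return the argument list for `aws emr create-cluster` (step 2)."""
    return [
        "aws", "emr", "create-cluster",
        "--name", name,
        "--release-label", release_label,
        "--use-default-roles",
        "--instance-type", instance_type,
        "--instance-count", str(instance_count),
        "--ec2-attributes", f"SubnetId={subnet_id}",
        # The option that the EmrCluster pipeline object does not expose.
        "--security-configuration", security_configuration,
        # Step 3: bootstrap action that installs TaskRunner (script is hypothetical).
        "--bootstrap-actions", f"Path={bootstrap_script_s3_path},Name=InstallTaskRunner",
        # Print only the new cluster ID so a later step can reuse it.
        "--query", "ClusterId", "--output", "text",
    ]


def build_terminate_command(cluster_id):
    """Return the argument list for `aws emr terminate-clusters` (step 5)."""
    return ["aws", "emr", "terminate-clusters", "--cluster-ids", cluster_id]


# Example with placeholder values; nothing is executed, the command is only printed.
example = build_create_cluster_command(
    name="pipeline-managed-cluster",                     # hypothetical
    release_label="emr-5.30.0",                          # illustrative release
    security_configuration="mySecurityConfigurationValue",
    subnet_id="subnet-0123456789abcdef0",                # hypothetical
    bootstrap_script_s3_path="s3://my-bucket/install-taskrunner.sh",  # hypothetical
)
print(shlex.join(example))
```

The bootstrap script itself is not shown. It would need to download the TaskRunner jar and start it with the same worker group name that the pipeline activities use.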

Because you are now spinning up the cluster with the CLI, you have access to features such as security configurations, custom AMIs, and instance fleets, and you can still orchestrate the tasks with Data Pipeline.
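
To illustrate how the pipeline objects fit together under this workaround, here is a hedged sketch of a pipeline definition fragment, built as a Python dictionary and serialized to JSON. It is an incomplete fragment for illustration only: the object IDs, the worker group name, the instance type, and the command strings are hypothetical, required objects such as the Default object and the schedule are omitted, and the way the cluster ID is passed from the create step to the terminate step is left as a placeholder.

```python
import json

WORKER_GROUP = "emr-worker-group"  # hypothetical; must match the TaskRunner started on the cluster

pipeline_definition = {
    "objects": [
        {
            # Step 1: a small EC2 resource replaces the EmrCluster object.
            "id": "LauncherInstance",
            "type": "Ec2Resource",
            "instanceType": "t2.micro",
            "terminateAfter": "2 Hours",
        },
        {
            # Step 2: create the EMR cluster from the EC2 resource via the CLI.
            "id": "CreateCluster",
            "type": "ShellCommandActivity",
            "runsOn": {"ref": "LauncherInstance"},
            "command": "aws emr create-cluster --security-configuration "
                       "'mySecurityConfigurationValue' <other options>",
        },
        {
            # Step 4: workerGroup instead of runsOn, so the TaskRunner
            # installed on the EMR cluster picks up this task.
            "id": "RunJob",
            "type": "ShellCommandActivity",
            "workerGroup": WORKER_GROUP,
            "dependsOn": {"ref": "CreateCluster"},
            "command": "<job command to run on the cluster>",
        },
        {
            # Step 5: terminate the cluster once the work is done.
            "id": "TerminateCluster",
            "type": "ShellCommandActivity",
            "runsOn": {"ref": "LauncherInstance"},
            "dependsOn": {"ref": "RunJob"},
            "command": "aws emr terminate-clusters --cluster-ids <CLUSTER_ID>",
        },
    ]
}

print(json.dumps(pipeline_definition, indent=2))
```

Only RunJob uses workerGroup; the create and terminate activities keep runsOn because they run on the EC2 resource that Data Pipeline itself manages.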


08-01 20:12