This article describes how to deal with the lack of SSE-KMS encryption support when writing from Spark/Hadoop to AWS S3.

Problem Description

I am trying to save an RDD on S3 with server-side encryption using a KMS key (SSE-KMS), but I am getting the following exception:

The following is the piece of my test code that writes an RDD to S3 using SSE-KMS for encryption:

import org.apache.spark.{SparkConf, SparkContext}

val sparkConf = new SparkConf().
  setMaster("local[*]").
  setAppName("aws-encryption")
val sc = new SparkContext(sparkConf)

// S3A credentials and server-side encryption settings
sc.hadoopConfiguration.set("fs.s3a.access.key", AWS_ACCESS_KEY)
sc.hadoopConfiguration.set("fs.s3a.secret.key", AWS_SECRET_KEY)
sc.hadoopConfiguration.setBoolean("fs.s3a.sse.enabled", true)
sc.hadoopConfiguration.set("fs.s3a.server-side-encryption-algorithm", "SSE-KMS")
sc.hadoopConfiguration.set("fs.s3a.sse.kms.keyId", KMS_ID)

// Force the S3A filesystem implementation
val s3a = new org.apache.hadoop.fs.s3a.S3AFileSystem
val s3aName = s3a.getClass.getName
sc.hadoopConfiguration.set("fs.s3a.impl", s3aName)

val rdd = sc.parallelize(Seq("one", "two", "three", "four"))
println("rdd is: " + rdd.collect().mkString(", "))
rdd.saveAsTextFile(s"s3a://$bucket/$objKey")

Nevertheless, I am able to write the RDD to S3 with AES256 encryption.

Does Spark/Hadoop use a different value for KMS key encryption instead of "SSE-KMS"?

Can anyone please suggest what I am missing here or doing wrong?

Environment details are as follows:

  • Spark: 1.6.1
  • Hadoop: 2.6.0
  • aws-java-sdk: 1.7.4

Thank you in advance.

Solution

Unfortunately, it seems that existing Hadoop versions (i.e., up to 2.8) do not support SSE-KMS.

Here are the observations:

  1. SSE-KMS is not supported up to and including Hadoop 2.8.1
  2. SSE-KMS support is supposed to be introduced in Hadoop 2.9
  3. In the Hadoop 3.0.0 alpha releases, SSE-KMS is supported.

The same observation applies to the AWS SDK for Java:

  1. SSE-KMS was introduced in aws-java-sdk 1.9.5
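As a sketch of the fix: after upgrading to a Hadoop release that ships SSE-KMS support (2.9+ or a 3.0.0 alpha, per the observations above), the S3A connector reads the encryption algorithm and key from its own configuration properties rather than the `fs.s3a.sse.*` names used in the question. The property names below follow the later S3A documentation, and the KMS key ARN is a placeholder, not a real key:

```xml
<!-- core-site.xml: S3A server-side encryption with SSE-KMS
     (property names from the Hadoop 2.9+/3.x S3A docs; the key ARN is a placeholder) -->
<property>
  <name>fs.s3a.server-side-encryption-algorithm</name>
  <value>SSE-KMS</value>
</property>
<property>
  <name>fs.s3a.server-side-encryption.key</name>
  <value>arn:aws:kms:us-east-1:123456789012:key/your-kms-key-id</value>
</property>
```

If the key property is left unset while the algorithm is SSE-KMS, S3 falls back to the account's default KMS key for the bucket's region. The same pairs can equally be set programmatically via `sc.hadoopConfiguration.set(...)`, as in the question's code.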


