Reading a .conf file from AWS S3 with Spark and Scala
Problem description
I am able to load a text file from AWS S3, but I run into a problem when reading a ".conf" file and get an error.
Scala code:
val configFile1 = ConfigFactory.load( "s3n://<bucket_name>/aws.conf" )
configFile1.getString("spark.lineage.key")
Recommended answer
Here is what I ended up doing: I created a wrapper utility, Config.scala.
import java.io.File

import com.amazonaws.auth.DefaultAWSCredentialsProviderChain
import com.amazonaws.services.s3.{AmazonS3Client, AmazonS3URI}
import com.typesafe.config.{ConfigFactory, Config => TConfig}

import scala.io.Source

object Config {

  // Fetch the object at an S3 location and return its content as a string.
  private def read(location: String): String = {
    val awsCredentials = new DefaultAWSCredentialsProviderChain()
    // This constructor is deprecated in later 1.x SDK releases in favor of AmazonS3ClientBuilder.
    val s3Client = new AmazonS3Client(awsCredentials)
    val s3Uri = new AmazonS3URI(location)
    val fullObject = s3Client.getObject(s3Uri.getBucket, s3Uri.getKey)
    // Close the object once the content is read so the underlying connection is released.
    try Source.fromInputStream(fullObject.getObjectContent).getLines().mkString("\n")
    finally fullObject.close()
  }

  // Parse the config from S3 when the location starts with "s3", otherwise from a local file.
  def apply(location: String): TConfig = {
    if (location.startsWith("s3")) {
      val content = read(location)
      ConfigFactory.parseString(content)
    } else {
      ConfigFactory.parseFile(new File(location))
    }
  }
}
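The wrapper picks its source by checking whether the location starts with "s3", which also matches a local relative path such as "s3-configs/app.conf". A stricter check can be sketched with only the standard library, as below. The names `ConfigLocation`, `S3Location`, `LocalLocation`, `classify`, and `readAll` are hypothetical and not part of the original answer. Accepting the "s3a" and "s3n" schemes as well is an assumption, and `scala.util.Using` requires Scala 2.13 or later.

```scala
import java.io.InputStream
import java.net.URI

import scala.io.Source
import scala.util.{Try, Using}

// Hypothetical helper types for illustration; not part of the original answer.
sealed trait ConfigLocation
final case class S3Location(bucket: String, key: String) extends ConfigLocation
final case class LocalLocation(path: String) extends ConfigLocation

object ConfigLocation {
  // Assumption: these URI schemes all denote an S3 bucket/key pair.
  private val s3Schemes = Set("s3", "s3a", "s3n")

  // Classify by URI scheme instead of a plain string prefix.
  // Anything that is not a parsable S3-style URI is treated as a local path.
  def classify(location: String): ConfigLocation =
    Try(new URI(location)).toOption match {
      case Some(uri) if s3Schemes.contains(uri.getScheme) && uri.getAuthority != null =>
        val key = Option(uri.getPath).getOrElse("").stripPrefix("/")
        S3Location(uri.getAuthority, key)
      case _ =>
        LocalLocation(location)
    }

  // Read the whole stream as UTF-8 text; Using.resource closes it even on failure.
  def readAll(in: InputStream): String =
    Using.resource(Source.fromInputStream(in, "UTF-8"))(_.getLines().mkString("\n"))
}
```

With this in place, `apply` could match on the result of `ConfigLocation.classify(location)` and pass the S3 object's content stream to `readAll` before handing the text to `ConfigFactory.parseString`.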
Use the wrapper like this:
val conf: TConfig = Config("s3://config/path")
You may use the provided scope for aws-java-sdk, since it will be available in the EMR cluster.
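In an sbt build, the provided scope mentioned above could be declared roughly as follows. This is a sketch that assumes sbt is the build tool, which the original answer does not state. The version string is a placeholder, not a value from the answer, and should match the SDK version on your EMR release.

```scala
// build.sbt (sketch): compile against the AWS SDK without bundling it into the job JAR.
// "<aws-sdk-version>" is a placeholder; substitute the version available on your cluster.
libraryDependencies += "com.amazonaws" % "aws-java-sdk" % "<aws-sdk-version>" % "provided"
```

A dependency marked provided is on the compile classpath but is left out of the assembled artifact, so the cluster's own copy is used at runtime.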