Reading a .conf file from AWS S3 with Spark and Scala

Problem description

I can load a text file from AWS S3, but I run into a problem when reading a ".conf" file: the code below fails with an error.

Scala code:

val configFile1 = ConfigFactory.load( "s3n://<bucket_name>/aws.conf" )
configFile1.getString("spark.lineage.key")

Recommended answer

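ConfigFactory.load(String) expects the base name of a classpath resource, not a URI, so it cannot fetch a file from S3. What I ended up doing was creating a wrapper utility, Config.scala, that downloads the file content itself and then parses it: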

import java.io.File

import com.amazonaws.auth.DefaultAWSCredentialsProviderChain
import com.amazonaws.services.s3.{AmazonS3Client, AmazonS3URI}
import com.typesafe.config.{ConfigFactory, Config => TConfig}

import scala.io.Source

object Config {

  // Download the S3 object at `location` and return its content as a string.
  private def read(location: String): String = {
    val awsCredentials = new DefaultAWSCredentialsProviderChain()
    val s3Client = new AmazonS3Client(awsCredentials)
    val s3Uri = new AmazonS3URI(location)

    val fullObject = s3Client.getObject(s3Uri.getBucket, s3Uri.getKey)

    // Close the object (and its underlying connection) once the content is read.
    try Source.fromInputStream(fullObject.getObjectContent, "UTF-8").getLines().mkString("\n")
    finally fullObject.close()
  }

  // Parse the config from S3 when `location` starts with "s3", otherwise from a local file.
  def apply(location: String): TConfig = {
    if (location.startsWith("s3")) {
      ConfigFactory.parseString(read(location))
    } else {
      ConfigFactory.parseFile(new File(location))
    }
  }
}

Using the wrapper:

val conf: TConfig = Config("s3://config/path")
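
Once the wrapper returns a Config, values are read by their dotted path, as in the question. The following is a minimal sketch of that parsing and lookup step on its own, without any S3 access. The inline HOCON text and the value "example-value" are hypothetical stand-ins for the real file content, and the object name ConfigUsageSketch is made up for illustration.

```scala
import com.typesafe.config.{Config => TConfig, ConfigFactory}

object ConfigUsageSketch {
  // Hypothetical stand-in for the text that would be downloaded from S3.
  val sampleContent: String =
    """spark {
      |  lineage {
      |    key = "example-value"
      |  }
      |}""".stripMargin

  // The same parsing step the wrapper applies to the downloaded content.
  val conf: TConfig = ConfigFactory.parseString(sampleContent)

  // Read a value by its dotted path, as in the question.
  val lineageKey: String = conf.getString("spark.lineage.key")

  // hasPath allows checking for a key before reading it.
  val hasMissingKey: Boolean = conf.hasPath("spark.missing")

  def main(args: Array[String]): Unit = {
    println(lineageKey)    // prints "example-value"
    println(hasMissingKey) // prints "false"
  }
}
```

Note that parseString does not resolve `${...}` substitutions by itself; if the .conf file uses them, calling `.resolve()` on the parsed config would likely be needed before reading values.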

You can use the provided scope for aws-java-sdk, since it is already available on the EMR cluster.
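
As an illustration of the provided scope, here is a sketch of the dependency declaration, assuming the project is built with sbt (the original answer does not say which build tool is used; "provided scope" is Maven terminology, and the sbt equivalent is the Provided configuration). The version is a placeholder and should be replaced with the SDK version that matches the cluster.

```scala
// build.sbt (sketch; the version below is a placeholder, not taken from the original answer)
libraryDependencies ++= Seq(
  // Compile against the SDK but leave it out of the packaged jar,
  // relying on the copy already installed on the EMR cluster.
  "com.amazonaws" % "aws-java-sdk" % "<sdk-version>" % Provided
)
```

With Maven, the same effect would come from `<scope>provided</scope>` on the aws-java-sdk dependency.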
