Question
I want to explore my data in Redshift using a Zeppelin notebook. A small EMR cluster with Spark is running behind it. I am loading Databricks' spark-redshift library:
%dep
z.reset()
z.load("com.databricks:spark-redshift_2.10:0.6.0")
and then
import org.apache.spark.sql.DataFrame
val query = "..."
val url = "..."
val port=5439
val table = "..."
val database = "..."
val user = "..."
val password = "..."
val df: DataFrame = sqlContext.read
.format("com.databricks.spark.redshift")
.option("url", s"jdbc:redshift://${url}:$port/$database?user=$user&password=$password")
.option("query",query)
.option("tempdir", "s3n://.../tmp/data")
.load()
df.show
but I get the error
java.lang.ClassNotFoundException: Could not load an Amazon Redshift JDBC driver; see the README for instructions on downloading and configuring the official Amazon driver
I added the option
option("jdbcdriver", "com.amazon.redshift.jdbc41.Driver")
but it did not help. I think I need to specify Redshift's JDBC driver somewhere, as I would by passing --driver-class-path to spark-shell, but how do I do that with Zeppelin?
Recommended answer
You can add external jars with dependencies, such as the JDBC driver, using either Zeppelin's dependency-loading mechanism or, in the case of Spark, the %dep dynamic dependency loader, which lets you:
- Load libraries recursively from a Maven repository
- Load libraries from the local filesystem
- Add an additional Maven repository
- Automatically add libraries to the Spark cluster (this can be turned off)
The latter looks like:
%dep
// loads with all transitive dependencies from Maven repo
z.load("groupId:artifactId:version")
// or add artifact from filesystem
z.load("/path/to.jar")
and, by convention, these calls have to be in the first paragraph of the note.
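Applied to the setup in the question, a minimal sketch might look like the following. It assumes the Redshift JDBC jar has already been downloaded to the machine running Zeppelin; the path `/path/to/RedshiftJDBC41.jar` is a hypothetical placeholder and does not come from the original post. The %dep paragraph needs to run before the Spark interpreter has started, so the interpreter may have to be restarted first.

```scala
%dep
// Clear any previously loaded artifacts and repositories
z.reset()
// spark-redshift, resolved from Maven with its transitive dependencies (as in the question)
z.load("com.databricks:spark-redshift_2.10:0.6.0")
// Redshift JDBC driver jar from the local filesystem
// (hypothetical path; point it at wherever the driver was downloaded)
z.load("/path/to/RedshiftJDBC41.jar")
```

With the jar loaded this way, the `jdbcdriver` option from the question (`com.amazon.redshift.jdbc41.Driver`) should be resolvable, provided the downloaded jar is the JDBC 4.1 build that contains that class.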