Problem Description
I am trying to use the Spark Cassandra Connector in Spark 1.1.0.

I have successfully built the jar file from the master branch on GitHub and have gotten the included demos to work. However, when I try to load the jar file into the spark-shell, I can't import any of the classes from the com.datastax.spark.connector package.
I have tried using the --jars option on spark-shell and adding the directory with the jar file to Java's CLASSPATH. Neither of these options works. In fact, when I use the --jars option, the logging output shows that the Datastax jar is getting loaded, but I still cannot import anything from com.datastax.
I have been able to load the Tuplejump Calliope Cassandra connector into the spark-shell using --jars, so I know that works. It's just the Datastax connector which is failing for me.
Recommended Answer
I got it working. Below is what I did:
$ git clone https://github.com/datastax/spark-cassandra-connector.git
$ cd spark-cassandra-connector
$ sbt/sbt assembly
$ $SPARK_HOME/bin/spark-shell --jars ~/spark-cassandra-connector/spark-cassandra-connector/target/scala-2.10/connector-assembly-1.2.0-SNAPSHOT.jar
At the Scala prompt:
scala> sc.stop
scala> import com.datastax.spark.connector._
scala> import org.apache.spark.SparkContext
scala> import org.apache.spark.SparkContext._
scala> import org.apache.spark.SparkConf
scala> val conf = new SparkConf(true).set("spark.cassandra.connection.host", "my cassandra host")
scala> val sc = new SparkContext("spark://spark host:7077", "test", conf)
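With the new context in place, a quick way to confirm the connector is actually usable is to read a table back through it. This is a minimal sketch assuming a hypothetical keyspace test with a table kv already exists on the cluster configured above; substitute your own keyspace and table names:

```scala
// Hypothetical keyspace/table names; assumes the Cassandra host set in
// the SparkConf above is reachable.
scala> val rdd = sc.cassandraTable("test", "kv")  // CassandraRDD of CassandraRow
scala> rdd.count                                  // triggers a scan of the table
```

If the import of com.datastax.spark.connector._ succeeded, the cassandraTable method is available on the SparkContext via the connector's implicit conversion, and the count will run against the live cluster.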