Problem description
I'm trying to automatically include jars in my PySpark classpath. Right now I can type the following command and it works:
$ pyspark --jars /path/to/my.jar
I'd like to have that jar included by default so that I only need to type pyspark, and so that it is also available in IPython Notebook.
I've read that I can include the argument by setting PYSPARK_SUBMIT_ARGS in the environment:
export PYSPARK_SUBMIT_ARGS="--jars /path/to/my.jar"
Unfortunately, the above doesn't work: I get the runtime error Failed to load class for data source.
Running Spark 1.3.1.
Edit
My workaround when using IPython Notebook is the following:
$ IPYTHON_OPTS="notebook" pyspark --jars /path/to/my.jar
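If you'd rather not retype that whole line each time, one option (a sketch, not part of the original workaround) is to hide it behind a bash alias; the alias name ipyspark and the jar path are placeholders you would adjust:
$ alias ipyspark='IPYTHON_OPTS="notebook" pyspark --jars /path/to/my.jar'
$ ipyspark
Putting the alias in your shell profile (e.g. ~/.bashrc) makes it available in every session.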
You can add the jar files in the spark-defaults.conf file (located in the conf folder of your Spark installation). If there is more than one entry in the jar list, use : as the separator.
spark.driver.extraClassPath /path/to/my.jar
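For example, with two jars the line would look like the following, joined with : (the second path, /path/to/other.jar, is only an illustrative placeholder):
spark.driver.extraClassPath /path/to/my.jar:/path/to/other.jar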
This property is documented at https://spark.apache.org/docs/1.3.1/configuration.html#runtime-environment.