'JavaPackage' object is not callable error in PySpark
Problem description
Operations like dataframe.show() and sqlContext.read.json work fine, but most functions raise "'JavaPackage' object is not callable". For example, when I call
dataFrame.withColumn(field_name, monotonically_increasing_id())
I get this error:
File "/tmp/spark-cd423f35-9572-45ee-b159-1b2732afa2a6/userFiles-3a6e1729-95f4-468b-914c-c706369bf2a6/Transformations.py", line 64, in add_id_column
self.dataFrame = self.dataFrame.withColumn(field_name, monotonically_increasing_id())
File "/home/himaprasoon/apps/spark-1.6.0-bin-hadoop2.6/python/pyspark/sql/functions.py", line 347, in monotonically_increasing_id
return Column(sc._jvm.functions.monotonically_increasing_id())
TypeError: 'JavaPackage' object is not callable
I am using the Apache Zeppelin interpreter and have added py4j to the Python path.
When I run
import py4j
print(dir(py4j))
the import succeeds:
['__builtins__', '__cached__', '__doc__', '__file__', '__loader__', '__name__', '__package__', '__path__', '__spec__', 'compat', 'finalizer', 'java_collections', 'java_gateway', 'protocol', 'version']
When I try
print(sc._jvm.functions)
in the pyspark shell, it prints
<py4j.java_gateway.JavaClass object at 0x7fdaf9727ba8>
But when I try the same thing in my interpreter, it prints
<py4j.java_gateway.JavaPackage object at 0x7f07cc3f77f0>
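The difference between the two printouts can be checked programmatically. The following is a minimal sketch, not part of the original question: the helper name `jvm_name_kind` is hypothetical, and it assumes only that py4j represents a resolved name as a `JavaClass` and an unresolved one as a `JavaPackage`, as the two outputs above show.

```python
def jvm_name_kind(jvm, name):
    """Report how a py4j JVM view resolves `name`.

    Hypothetical helper: it inspects only the Python type name of the
    returned proxy, so it does not need to import py4j itself.
    """
    proxy = getattr(jvm, name)
    type_name = type(proxy).__name__
    if type_name == "JavaClass":
        return "class"    # name resolved to a JVM class; calls should work
    if type_name == "JavaPackage":
        return "package"  # name not resolved; calling it raises TypeError
    return "other"


# Intended use inside a live PySpark session (assumes `sc` exists):
#   print(jvm_name_kind(sc._jvm, "functions"))
```

If this reports "package" for `functions`, the JVM view has no import that maps the short name to `org.apache.spark.sql.functions`.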
Recommended answer
In the Zeppelin interpreter code, the line
java_import(gateway.jvm, "org.apache.spark.sql.*")
was not being executed. Adding it to the imports fixed the issue.
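As a rough illustration of the fix, the import can be re-applied from Python when the name is still unresolved. This is a sketch under assumptions: the function name `ensure_spark_sql_imports` is hypothetical, `jvm` is expected to be a py4j JVM view such as `sc._jvm`, and the only library call used is `py4j.java_gateway.java_import`, the same call quoted in the answer.

```python
def ensure_spark_sql_imports(jvm):
    """Re-run the Spark SQL wildcard import if `functions` is unresolved.

    Hypothetical helper. Returns True if the import was applied and
    False if `functions` already resolved to something other than a
    JavaPackage placeholder.
    """
    if type(jvm.functions).__name__ != "JavaPackage":
        return False  # already resolved; nothing to do

    # Imported lazily so the check above works even without py4j installed.
    from py4j.java_gateway import java_import

    # Same import the answer found missing in the interpreter setup.
    java_import(jvm, "org.apache.spark.sql.*")
    return True


# Intended use inside a live PySpark session (assumes `sc` exists):
#   ensure_spark_sql_imports(sc._jvm)
```

Doing the import where the gateway is created, as the answer describes, is the more robust place for it; a per-session call like this is only a workaround when the interpreter code cannot be changed.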