pyspark sql: AttributeError: 'NoneType' object has no attribute 'join'
Problem description
def main(inputs, output):
    sdf = spark.read.csv(inputs, schema=observation_schema)
    sdf.registerTempTable('filtertable')
    result = spark.sql("""
    SELECT * FROM filtertable WHERE qflag IS NULL
    """).show()
    temp_max = spark.sql(""" SELECT date, station, value FROM filtertable WHERE (observation = 'TMAX')""").show()
    temp_min = spark.sql(""" SELECT date, station, value FROM filtertable WHERE (observation = 'TMIN')""").show()
    result = temp_max.join(temp_min, condition1).select(temp_max('date'), temp_max('station'), ((temp_max('TMAX')-temp_min('TMIN')/10)).alias('Range'))
Error:
Traceback (most recent call last):
  File "/Users/syedikram/Documents/temp_range_sql.py", line 96, in <module>
    main(inputs, output)
  File "/Users/syedikram/Documents/temp_range_sql.py", line 52, in main
    result = temp_max.join(temp_min, condition1).select(temp_max('date'), temp_max('station'), ((temp_max('TMAX')-temp_min('TMIN')/10)).alias('Range'))
AttributeError: 'NoneType' object has no attribute 'join'
Performing the join operation gives me a NoneType object error. Searching online didn't help, as there is little documentation for pyspark sql. What am I doing wrong here?
Recommended answer
Remove the .show() call from temp_max and temp_min. show() only prints the rows and returns None, so both variables hold None instead of a DataFrame (hence you get AttributeError: 'NoneType' object has no attribute 'join').