Problem description
The Apache Livy documentation is sparse: is it possible to return Spark SQL query result sets over REST using Apache Livy? The calling application is a mobile app and cannot connect over ODBC/JDBC, so the Spark Thrift server is not an option.
Recommended answer
Yes, it is possible to submit Spark SQL queries through Livy. However, there is [currently] no support for submitting a bare SQL query on its own; the query needs to be wrapped in Python or Scala code.
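As a rough sketch of what "wrapped" means here, a small helper can turn a SQL string into the Scala snippet that goes into the statement payload. The helper name wrap_sql_as_scala is hypothetical, and reusing json.dumps to build the Scala string literal is an assumption that holds for ordinary queries, not a complete escaping scheme.

```python
import json


def wrap_sql_as_scala(sql):
    """Wrap a SQL string in Scala code for a Livy statement (hypothetical helper)."""
    # json.dumps produces a double-quoted literal with \" and \\ escapes.
    # Assumption: Scala accepts the same escapes for ordinary queries.
    literal = json.dumps(sql)
    code = "\n".join([
        "val df = spark.sql(" + literal + ")",
        "val rows = df.collect",
        "%json rows",
    ])
    # Livy expects the code under the "code" key of the statement body
    return {"code": code}


payload = wrap_sql_as_scala("SELECT COUNT(DISTINCT food_item) FROM food_item_tbl")
print(payload["code"])
```

The returned dict can be serialized with json.dumps and posted to the session's /statements endpoint, which keeps the mobile client from having to know any Scala.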
Here are two examples of executing Spark SQL queries. Both use Python with the requests library to talk to Livy, and both pass Scala code as a string to be executed "in Spark":
1) Using the %json magic in Livy (https://github.com/apache/incubator-livy/blob/412ccc8fcf96854fedbe76af8e5a6fec2c542d25/repl/src/test/scala/org/apache/livy/repl/PythonInterpreterSpec.scala#L91)
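import json
import textwrap
import requests

host = "http://localhost:8998"  # assumed Livy endpoint; adjust for your deployment
headers = {"Content-Type": "application/json"}

# Assumes session 1 already exists and accepts Scala code
session_url = host + "/sessions/1"
statements_url = session_url + "/statements"
data = {
    "code": textwrap.dedent("""\
        val d = spark.sql("SELECT COUNT(DISTINCT food_item) FROM food_item_tbl")
        val e = d.collect
        %json e
        """)}
r = requests.post(statements_url, data=json.dumps(data), headers=headers)
print(r.json())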
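The POST only queues the statement, so the response usually shows a state of waiting or running rather than the result. The result can then be read by polling GET /sessions/{sessionId}/statements/{statementId}, where the statement id comes from the "id" field of the POST response. The following is a minimal sketch of that loop; wait_for_output and fetch_statement are hypothetical names, and the simulated responses (including the [[42]] payload) are made up so the example runs without a Livy server.

```python
import time


def wait_for_output(fetch_statement, poll_seconds=1.0, max_polls=60):
    """Poll until the statement leaves waiting/running, then return output data.

    fetch_statement is a caller-supplied function returning the statement
    JSON as a dict, e.g. lambda: requests.get(statement_url, headers=headers).json()
    """
    for _ in range(max_polls):
        statement = fetch_statement()
        if statement["state"] not in ("waiting", "running"):
            break
        time.sleep(poll_seconds)
    else:
        raise TimeoutError("statement did not finish in time")
    if statement["state"] != "available":
        raise RuntimeError("statement ended in state " + statement["state"])
    output = statement["output"]
    if output["status"] != "ok":
        # Error outputs carry ename/evalue fields instead of data
        raise RuntimeError(output.get("evalue", "statement failed"))
    return output["data"]


# Simulated responses standing in for two successive GET requests to Livy
responses = iter([
    {"id": 0, "state": "running", "output": None},
    {"id": 0, "state": "available",
     "output": {"status": "ok", "execution_count": 0,
                "data": {"application/json": [[42]]}}},
])
data = wait_for_output(lambda: next(responses), poll_seconds=0)
print(data["application/json"])
```

With the %json magic the result sits under the "application/json" key of the output data; its exact shape depends on the value passed to the magic.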
2) Using the %table magic in Livy (https://github.com/apache/incubator-livy/blob/412ccc8fcf96854fedbe76af8e5a6fec2c542d25/repl/src/test/scala/org/apache/livy/repl/PythonInterpreterSpec.scala#L105)
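import json
import textwrap
import requests

host = "http://localhost:8998"  # assumed Livy endpoint; adjust for your deployment
headers = {"Content-Type": "application/json"}

# Assumes session 21 already exists and accepts Scala code
session_url = host + "/sessions/21"
statements_url = session_url + "/statements"
data = {
    "code": textwrap.dedent("""\
        val x = List((1, "a", 0.12), (3, "b", 0.63))
        %table x
        """)}
r = requests.post(statements_url, data=json.dumps(data), headers=headers)
print(r.json())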
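For a mobile client it is often handier to receive rows as objects keyed by column name. The sketch below converts a %table payload into a list of dicts. The payload shape (a "headers" list with "type" and "name" entries plus a "data" list of rows under the application/vnd.livy.table.v1+json key) follows the linked test; the sample values and the column names "0", "1", "2" are assumptions for illustration, and table_to_records is a hypothetical helper.

```python
TABLE_MIME = "application/vnd.livy.table.v1+json"


def table_to_records(output_data):
    """Turn a %table payload into a list of dicts keyed by column name."""
    table = output_data[TABLE_MIME]
    names = [header["name"] for header in table["headers"]]
    # Pair each row's values with the column names, in order
    return [dict(zip(names, row)) for row in table["data"]]


# Sample payload modelled on the linked test; not captured from a live server
sample = {
    TABLE_MIME: {
        "headers": [
            {"type": "INT_TYPE", "name": "0"},
            {"type": "STRING_TYPE", "name": "1"},
            {"type": "DOUBLE_TYPE", "name": "2"},
        ],
        "data": [[1, "a", 0.12], [3, "b", 0.63]],
    }
}
records = table_to_records(sample)
print(records)
```

The input to table_to_records is the "data" object found under "output" in the statement JSON once the statement has finished.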