Question
How do you bind a variable in Apache Spark SQL? For example:
val sqlContext = new org.apache.spark.sql.hive.HiveContext(sc)
sqlContext.sql("SELECT * FROM src WHERE col1 = ${VAL1}").collect().foreach(println)
Recommended Answer
Spark SQL (as of the 1.6 release) does not support bind variables.
P.S. What Ashrith is suggesting is not a bind variable: you are constructing a new SQL string every time, so Spark will parse the query, create an execution plan, and so on, on every call. The purpose of bind variables (in RDBMS systems, for example) is to cut the time spent creating execution plans, which can be costly when there are a lot of joins. Spark would need a special API to "parse" a query once and then "bind" variables to it; Spark does not have this functionality (as of today, the Spark 1.6 release).
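For illustration, here is a minimal sketch of that string-interpolation workaround (val1 is a hypothetical value; this is plain Scala string interpolation, not a bind variable, so Spark re-parses the query on every call):

val val1 = "foo"  // hypothetical value to substitute into the query text
// Each call produces a different SQL string; Spark parses it and
// builds a fresh execution plan every time.
sqlContext.sql(s"SELECT * FROM src WHERE col1 = '$val1'").collect().foreach(println)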
Update 8/2018: as of Spark 2.3 there are (still) no bind variables in Spark.
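For reference, the same interpolation pattern is the usual substitute in Spark 2.x as well, assuming a SparkSession named spark (still not a true bind variable):

val spark = org.apache.spark.sql.SparkSession.builder().enableHiveSupport().getOrCreate()
val val1 = "foo"  // hypothetical value to substitute into the query text
spark.sql(s"SELECT * FROM src WHERE col1 = '$val1'").collect().foreach(println)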