hadoop - Apache Spark: Is SQL in Spark SQL vulnerable to SQL injection?

This question already has an answer here:

Spark SQL security considerations

(1 answer)

Closed 3 years ago.

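Scenario:

Suppose there is a table in Hive that is queried from Apache Spark with the Spark SQL below, where the table name is passed in as an argument and concatenated into the query string.

For non-distributed systems I have a basic understanding of SQL injection vulnerabilities, and in the JDBC context I understand how createStatement / preparedStatement are used in that situation.

But when Spark SQL is used, is this code vulnerable? Any insights?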
def main(args: Array[String]) {
  val sconf = new SparkConf().setAppName("TestApp")
  val sparkContext = new SparkContext(sconf)
  val hiveSqlContext = new org.apache.spark.sql.hive.HiveContext(sparkContext)
  val tableName = args(0) // passed as an argument
  val tableData = hiveSqlContext
    .sql("select IdNUm, Name from hiveSchemaName." + tableName + " where IdNum <> '' ")
    .map(x => (x.getString(0), x.getString(1)))
    .collectAsMap()
  ................
  ...............
}
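
To make the concern concrete, here is a small sketch that isolates the string concatenation from the code above so the resulting query text can be inspected without a Spark cluster. The object name `QueryConcatDemo`, the helper `buildQuery`, and the sample argument values (including `otherSchema.secrets`) are hypothetical and only for illustration; whether a given crafted string actually parses and runs depends on the Spark/Hive version and on the tables involved.

```scala
object QueryConcatDemo {
  // Same concatenation as in the question, pulled out into a pure function.
  def buildQuery(tableName: String): String =
    "select IdNUm, Name from hiveSchemaName." + tableName + " where IdNum <> '' "

  def main(args: Array[String]): Unit = {
    // An ordinary table name yields the intended query.
    println(buildQuery("customers"))

    // Hypothetical crafted argument: the extra text becomes part of the
    // SQL statement itself instead of being treated as a table name.
    val crafted = "customers union all select IdNum, Name from otherSchema.secrets"
    println(buildQuery(crafted))
  }
}
```

The point of the sketch is only that whatever arrives in `args(0)` ends up as SQL text, so the caller of the job controls part of the statement's structure.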

Best answer

You can try the following in Spark 2.0:
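import org.apache.spark.SparkConf
import org.apache.spark.sql.SparkSession

def main(args: Array[String]) {
  val conf = new SparkConf()
  val sparkSession = SparkSession
    .builder()
    .appName("TestApp")
    .config(conf)
    .enableHiveSupport()
    .getOrCreate()
  import sparkSession.implicits._ // enables the $"column" syntax

  val tableName = args(0) // passed as an argument
  val tableData = sparkSession
    .table(tableName) // the argument is used as a table name, not spliced into SQL text
    .select($"IdNum", $"Name")
    .filter($"IdNum" =!= "")
    .rdd // collectAsMap is defined on pair RDDs, not on Datasets
    .map(x => (x.getString(0), x.getString(1)))
    .collectAsMap()
  ................
  ...............
}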
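
If you also want to check the argument before it reaches Spark at all, one option is to accept only plain identifiers. The following is a minimal sketch, not part of the original answer: it assumes that valid table names consist of letters, digits, and underscores only, and the names `TableNameCheck` and `validate` are made up for this example.

```scala
object TableNameCheck {
  // Assumption: a legal table name is a plain unquoted identifier
  // (letters, digits, underscores; not starting with a digit).
  private val SafeIdentifier = "^[A-Za-z_][A-Za-z0-9_]*$".r

  // Returns Right(name) when the name matches the pattern,
  // otherwise Left with a short rejection message.
  def validate(name: String): Either[String, String] =
    if (SafeIdentifier.pattern.matcher(name).matches()) Right(name)
    else Left("Rejected table name: " + name)

  def main(args: Array[String]): Unit = {
    println(validate("customers"))
    println(validate("customers union all select 1"))
  }
}
```

In the job itself, the result of `validate(args(0))` could be matched on, stopping early on `Left` and passing the `Right` value to `sparkSession.table`. A stricter variant would compare the argument against a fixed list of known table names instead of a pattern.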