Generate a Spark StructType/Schema from a case class


Question

If I wanted to create a StructType (i.e. a DataFrame.schema) out of a case class, is there a way to do it without creating a DataFrame? I can easily do:

case class TestCase(id: Long)
val schema = Seq[TestCase]().toDF.schema

But it seems overkill to actually create a DataFrame when all I want is the schema.

(If you are curious, the reason behind the question is that I am defining a UserDefinedAggregateFunction, and to do so you override a couple of methods that return StructTypes, and I use case classes.)
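To make the motivation concrete, here is a minimal sketch of a Spark 2.x-style UserDefinedAggregateFunction whose input and buffer schemas are passed in as StructTypes (the `Input`, `Buffer`, and `SumUdaf` names are hypothetical, not from the original question). The schemas could be derived from case classes using the `Seq[T]().toDF.schema` trick shown above:

```scala
import org.apache.spark.sql.Row
import org.apache.spark.sql.expressions.{MutableAggregationBuffer, UserDefinedAggregateFunction}
import org.apache.spark.sql.types._

// Hypothetical case classes describing the UDAF's input and buffer rows.
case class Input(value: Long)
case class Buffer(sum: Long)

// A minimal sum-style UDAF (Spark 2.x API, deprecated in 3.x) whose
// schema-returning methods are fed StructTypes derived from case classes.
class SumUdaf(inputType: StructType, bufferType: StructType)
    extends UserDefinedAggregateFunction {
  def inputSchema: StructType = inputType
  def bufferSchema: StructType = bufferType
  def dataType: DataType = LongType
  def deterministic: Boolean = true
  def initialize(buffer: MutableAggregationBuffer): Unit = buffer(0) = 0L
  def update(buffer: MutableAggregationBuffer, input: Row): Unit =
    buffer(0) = buffer.getLong(0) + input.getLong(0)
  def merge(b1: MutableAggregationBuffer, b2: Row): Unit =
    b1(0) = b1.getLong(0) + b2.getLong(0)
  def evaluate(buffer: Row): Any = buffer.getLong(0)
}
```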

Answer

You can do it the same way SQLContext.createDataFrame does it:

import org.apache.spark.sql.catalyst.ScalaReflection
import org.apache.spark.sql.types.StructType

val schema = ScalaReflection.schemaFor[TestCase].dataType.asInstanceOf[StructType]
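As a side note, `ScalaReflection` lives in the internal `catalyst` package. On Spark 2.0+ the public encoder API gives the same result without touching internals; this is a sketch assuming a Spark 2.x or later dependency:

```scala
import org.apache.spark.sql.Encoders
import org.apache.spark.sql.types.StructType

case class TestCase(id: Long)

// Encoders.product derives an encoder for any Product (case class);
// its schema field is the corresponding StructType.
val schema: StructType = Encoders.product[TestCase].schema
```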

