Problem description
I am reading a parquet file in Azure Databricks, once with SparkR's read.parquet() and once with sparklyr's spark_read_parquet(). The two resulting dataframes are different. Is there any way to convert a SparkR dataframe into a sparklyr dataframe, and vice versa?
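For reference, a minimal sketch of the two reads described above; the path and table name are placeholders, not from the original question:

```r
library(SparkR)
library(sparklyr)

# sparklyr needs an explicit connection; on Databricks this attaches
# to the cluster's existing Spark session.
sc <- spark_connect(method = "databricks")

# SparkR: returns a SparkDataFrame (an S4 object backed by a logical plan).
# On Databricks the SparkR session already exists; elsewhere call sparkR.session().
df_sparkr <- SparkR::read.parquet("/mnt/data/example.parquet")

# sparklyr: returns a tbl_spark (a lazy dplyr/Spark SQL reference)
df_sparklyr <- spark_read_parquet(sc, name = "example",
                                  path = "/mnt/data/example.parquet")

class(df_sparkr)    # "SparkDataFrame"
class(df_sparklyr)  # "tbl_spark" "tbl_sql" "tbl_lazy" "tbl"
```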
Recommended answer
sparklyr creates a tbl_spark. This is essentially just a lazy query written in Spark SQL. SparkR creates a SparkDataFrame, which is more of a collection of data organized using a plan.
In the same way that you can't use a tbl as a normal data.frame, you can't use a tbl_spark the same way as a SparkDataFrame, as the sketch below illustrates.
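To make that concrete, a hedged illustration using the sketch objects from above (the commented-out lines would error):

```r
# dplyr verbs are compiled to Spark SQL for a tbl_spark...
dplyr::count(df_sparklyr)        # works: lazy Spark SQL query
# dplyr::count(df_sparkr)        # errors: no dplyr method for SparkDataFrame

# ...while SparkR's S4 generics only dispatch on SparkDataFrame.
SparkR::count(df_sparkr)         # works: returns the row count
# SparkR::count(df_sparklyr)     # errors: no SparkR method for tbl_spark
```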
The only ways I can think of to convert one to the other are to write it out to your data lake/data warehouse and read it back with the other API, or to read it into local R first.
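A sketch of both workarounds, assuming the placeholder objects and paths from the snippets above; note that the collect-based route is only viable when the data fits in driver memory:

```r
# Workaround 1: round-trip through storage (path is a placeholder).
spark_write_parquet(df_sparklyr, path = "/mnt/data/handoff.parquet",
                    mode = "overwrite")
df_from_sparklyr <- SparkR::read.parquet("/mnt/data/handoff.parquet")

# Workaround 2: pull the data into a local R data.frame, then hand it
# to the other API.
local_df <- dplyr::collect(df_sparklyr)          # tbl_spark -> local data.frame
df_sparkr2 <- SparkR::as.DataFrame(local_df)     # data.frame -> SparkDataFrame

# And the reverse direction:
local_df2 <- SparkR::collect(df_sparkr)          # SparkDataFrame -> data.frame
tbl_back <- sparklyr::copy_to(sc, local_df2, "tbl_back", overwrite = TRUE)
```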