SparklyR: removing a table from the Spark context

Question

I would like to remove a single data table from the Spark context ('sc'). I know that a single cached table can be un-cached, but as far as I can tell that is not the same as removing the object from sc: in the example below the table is still listed after tbl_uncache().

library(sparklyr)
library(dplyr)
library(titanic)
library(Lahman)

spark_install(version = "2.0.0")
sc <- spark_connect(master = "local")

batting_tbl <- copy_to(sc, Lahman::Batting, "batting")
titanic_tbl <- copy_to(sc, titanic_train, "titanic", overwrite = TRUE)
src_tbls(sc)
# [1] "batting" "titanic"

tbl_cache(sc, "batting") # Speeds up computations -- loaded into memory
src_tbls(sc)
# [1] "batting" "titanic"

tbl_uncache(sc, "batting")
src_tbls(sc)
# [1] "batting" "titanic"

To disconnect sc entirely I would use spark_disconnect(sc), but in this example that would destroy both the "titanic" and the "batting" tables stored inside sc.

Instead, I would like to delete only "batting", with something like spark_disconnect(sc, tableToRemove = "batting"), but this does not seem to be possible.

Recommended answer

dplyr::db_drop_table(sc, "batting")

I tried this function and it seems to work.
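
One way to check the claim is to compare src_tbls(sc) before and after the drop. The sketch below is only an illustration under some assumptions: a local Spark installation is available, the built-in mtcars and iris datasets stand in for the Lahman and titanic data, and the table names "cars" and "flowers" are hypothetical. Whether db_drop_table() behaves the same way across dplyr and sparklyr versions is not confirmed by the answer above.

```r
library(sparklyr)
library(dplyr)

# Assumption: Spark is already installed locally (e.g. via spark_install()).
sc <- spark_connect(master = "local")

# Hypothetical stand-ins for the question's "batting" and "titanic" tables.
cars_tbl    <- copy_to(sc, mtcars, "cars", overwrite = TRUE)
flowers_tbl <- copy_to(sc, iris, "flowers", overwrite = TRUE)

# Record which tables the connection knows about before the drop.
tables_before <- src_tbls(sc)

# Drop only "cars"; the connection itself stays open.
dplyr::db_drop_table(sc, "cars")

# Record the tables again so the two listings can be compared.
tables_after <- src_tbls(sc)

print(tables_before)
print(tables_after)

spark_disconnect(sc)
```

If the drop worked, "cars" should be missing from the second listing while "flowers" is still present. Note that cars_tbl remains as an R object after the drop, but it now points at a table that no longer exists in Spark.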

09-01 21:14