我正在尝试在rsparkling session 中通过H2o使用某些机器学习功能(使用库sparklyr)。我正在运行hadoop集群。

考虑以下示例:

library(dplyr)
library(sparklyr)
library(rsparkling)
library(h2o)

#configure the spark session and connect
sc = spark_connect(master = 'yarn-client',
                   spark_home = '/usr/hdp/current/spark-client',
                   app_name = 'sparklyr',
                   config = list(
                     "sparklyr.shell.executor-memory" = "1G",
                     "sparklyr.shell.driver-memory"   = "4G",
                     "spark.driver.maxResultSize"     = "2G" # may need to transfer a lot of data into R
                   )
)

mtcars_tbl <- copy_to(sc, mtcars, "mtcars")

mtcars_hf <- as_h2o_frame(sc=sc,x=mtcars_tbl,name='h_cars')

我收到以下错误:
Error: java.lang.IllegalArgumentException: Unsupported argument: (spark.dynamicAllocation.enabled,true)
        at org.apache.spark.h2o.backends.internal.InternalBackendUtils$$anonfun$checkUnsupportedSparkOptions$1.apply(InternalBackendUtils.scala:48)
        at org.apache.spark.h2o.backends.internal.InternalBackendUtils$$anonfun$checkUnsupportedSparkOptions$1.apply(InternalBackendUtils.scala:40)
        at scala.collection.immutable.List.foreach(List.scala:318)
        at org.apache.spark.h2o.backends.internal.InternalBackendUtils$class.checkUnsupportedSparkOptions(InternalBackendUtils.scala:40)
        at org.apache.spark.h2o.backends.internal.InternalH2OBackend.checkUnsupportedSparkOptions(InternalH2OBackend.scala:31)
        at org.apache.spark.h2o.backends.internal.InternalH2OBackend.checkAndUpdateConf(InternalH2OBackend.scala:61)
        at org.apache.spark.h2o.H2OContext.<init>(H2OContext.scala:96)
        at org.apache.spark.h2o.H2OContext$.getOrCreate(H2OContext.scala:294)
        at org.apache.spark.h2o.H2OContext$.getOrCreate(H2OContext.scala:316)
        at org.apache.spark.h2o.H2OContext.getOrCreate(H2OContext.scala)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:606)
        at sparklyr.Invoke$.invoke(invoke.scala:94)
        at sparklyr.StreamHandler$.handleMethodCall(stream.scala:89)
        at sparklyr.StreamHandler$.read(stream.scala:55)
        at sparklyr.BackendHandler.channelRead0(handler.scala:49)
        at sparklyr.BackendHandler.channelRead0(handler.scala:14)
        at io.netty.channel.SimpleChannelInboundHandler.channelRead(SimpleChannelInboundHandler.java:105)
        at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:308)
        at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:294)
        at io.netty.handler.codec.MessageToMessageDecoder.channelRead(MessageToMessageDecoder.java:103)
        at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:308)
        at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:294)
        at io.netty.handler.codec.ByteToMessageDecoder.channelRead(ByteToMessageDecoder.java:244)
        at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:308)
        at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:294)
        at io.netty.channel.DefaultChannelPipeline.fireChannelRead(DefaultChannelPipeline.java:846)
        at io.netty.channel.nio.AbstractNioByteChannel$NioByteUnsafe.read(AbstractNioByteChannel.java:131)
        at io.netty.channel.nio.NioEventLoop.processSelectedKey(NioEventLoop.java:511)
        at io.netty.channel.nio.NioEventLoop.processSelectedKeysOptimized(NioEventLoop.java:468)
        at io.netty.channel.nio.NioEventLoop.processSelectedKeys(NioEventLoop.java:382)
        at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:354)
        at io.netty.util.concurrent.SingleThreadEventExecutor$2.run(SingleThreadEventExecutor.java:111)
        at io.netty.util.concurrent.DefaultThreadFactory$DefaultRunnableDecorator.run(DefaultThreadFactory.java:137)
        at java.lang.Thread.run(Thread.java:745)

有什么想法吗?

最佳答案

目前,Sparkling Water / RSparkling不支持动态Spark集群。所以你只需要禁用它:
config = list("spark.dynamicAllocation.enabled" = "false")

08-25 07:24