Problem description
Spark architecture revolves entirely around the concept of executors and cores. I would like to see, in practice, how many executors and cores are running for my Spark application in a cluster.
I was trying to use the snippet below in my application, but with no luck.
val conf = new SparkConf().setAppName("ExecutorTestJob")
val sc = new SparkContext(conf)
conf.get("spark.executor.instances")
conf.get("spark.executor.cores")
Is there any way to get those values using the SparkContext object or the SparkConf object, etc.?
Scala (programmatic way):
getExecutorStorageStatus and getExecutorMemoryStatus both return the executors, including the driver, like the example snippet below.
/** Method that just returns the current active/registered executors
* excluding the driver.
* @param sc The spark context to retrieve registered executors.
* @return a list of executors each in the form of host:port.
*/
def currentActiveExecutors(sc: SparkContext): Seq[String] = {
val allExecutors = sc.getExecutorMemoryStatus.map(_._1)
val driverHost: String = sc.getConf.get("spark.driver.host")
allExecutors.filter(! _.split(":")(0).equals(driverHost)).toList
}
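A minimal usage sketch of the helper above (the app name here is only an illustrative assumption):
import org.apache.spark.{SparkConf, SparkContext}
// Hypothetical driver code calling the helper defined above.
val conf = new SparkConf().setAppName("ExecutorCountExample") // illustrative app name
val sc = new SparkContext(conf)
val executors = currentActiveExecutors(sc)
println(s"Active executors (excluding the driver): ${executors.size}")
executors.foreach(println) // each entry is in host:port form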
sc.getConf.getInt("spark.executor.instances", 1)
Similarly, get all the properties and print them as below; you can get the cores information there as well:
sc.getConf.getAll.mkString("\n")
OR
sc.getConf.toDebugString
Mostly, spark.executor.cores is for the executors, and spark.driver.cores is the value the driver should have.
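As a rough sketch, both core settings can be read back from the active configuration; the fallback defaults of 1 below are only illustrative, and the real values depend on how the job was submitted:
// Read the cores settings from the running application's SparkConf.
val executorCores = sc.getConf.getInt("spark.executor.cores", 1) // illustrative fallback
val driverCores = sc.getConf.getInt("spark.driver.cores", 1) // illustrative fallback
println(s"spark.executor.cores = $executorCores, spark.driver.cores = $driverCores")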
Python:
EDIT: This is not exposed directly in the Python API, but it can be accessed using the Py4J bindings exposed from the SparkSession:
sc._jsc.sc().getExecutorMemoryStatus()