Problem Description
Where are the Dataproc Spark job logs located? I know there are logs from the driver under the "Logging" section, but what about the executor nodes? Also, where are the detailed steps that Spark is executing logged (I know I can see them in the Application Master)? I am trying to debug a script that seems to hang, and Spark appears to freeze.
Recommended Answer
The task logs are stored on each worker node under /tmp.
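For example, you can SSH into a worker and look for the per-container log directories there (a minimal sketch; the cluster and worker names, the zone, and the exact directory layout under /tmp are assumptions, since they vary by Dataproc image version):

    # SSH into the first worker node (Dataproc names workers CLUSTER-w-N;
    # "my-cluster" and the zone are placeholders for illustration).
    gcloud compute ssh my-cluster-w-0 --zone=us-central1-a

    # On the node, YARN keeps one log directory per application/container;
    # search /tmp for them (the pattern is an assumption, adjust as needed).
    sudo find /tmp -type d -name "application_*"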
It is possible to collect them in one place via YARN log aggregation. Set these properties at cluster creation time (via --properties with the yarn: prefix), as shown in the sketch after this list:
- yarn.log-aggregation-enable=true
- yarn.nodemanager.remote-app-log-dir=gs://${LOG_BUCKET}/logs
- yarn.log-aggregation.retain-seconds=-1
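Putting this together, a cluster-creation command might look like the following (a minimal sketch; the cluster name and region are placeholders, and LOG_BUCKET is assumed to be set in your environment, while the --properties flag and yarn: prefix are as described above):

    # Create a Dataproc cluster with YARN log aggregation writing to GCS.
    gcloud dataproc clusters create my-cluster --region=us-central1 \
        --properties="yarn:yarn.log-aggregation-enable=true,yarn:yarn.nodemanager.remote-app-log-dir=gs://${LOG_BUCKET}/logs,yarn:yarn.log-aggregation.retain-seconds=-1"

Once aggregation is enabled, the collected logs for a given job can be fetched with the standard YARN CLI (the application ID below is a placeholder; running IDs can be listed with yarn application -list):

    # Print all aggregated container logs for one application.
    yarn logs -applicationId application_1234567890123_0001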
Here's an article that discusses log management:
https://hortonworks.com/Blog/simplifying-user-logs-management-and-access-in-yarn/