Question
I'm running a pipeline in Azure Data Factory, and I'm using a Custom activity to run an Azure Batch job.
The Azure Batch job I run is really big, and I would like to monitor which stage of the job I'm in. On a remote VM, I typically do this using the logging module in Python.
I can get the status of the job (i.e., all the logging information) after it has finished, but I would like to obtain it while the job is running.
How can I do this?
Answer
Batch automatically captures stdout/stderr into stdout.txt and stderr.txt in the task directory. Make sure you periodically flush your streams, if needed. You have two options here:
- Implement logic within your program (executed as a Batch task) to periodically egress those files to some other place where you can view them (for example, to an Azure Storage blob).
- Implement logic on your client to periodically call GetFile and retrieve new offsets (via the ocp-range header) of either stdout.txt or stderr.txt. The various language SDKs have convenience APIs for this if you use them instead of REST.
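As a sketch of the second option, the helpers below implement the offset bookkeeping behind the ocp-range header; the polling loop shown in comments assumes the azure-batch Python SDK (file.get_from_task, FileGetFromTaskOptions) and an authenticated client, none of which are spelled out in the original answer:

```python
def ocp_range(start: int, chunk_size: int = 4096) -> str:
    """Build the ocp-range header value for the next chunk,
    e.g. start=0, chunk_size=4096 -> 'bytes=0-4095'."""
    return f"bytes={start}-{start + chunk_size - 1}"

def next_offset(start: int, data: bytes) -> int:
    """Advance the read offset by however many bytes were actually returned."""
    return start + len(data)

# Illustrative client-side polling loop (assumes an authenticated
# azure-batch client; job_id/task_id are placeholders):
#
# import time
# from azure.batch.models import FileGetFromTaskOptions
# offset = 0
# while not task_done:
#     opts = FileGetFromTaskOptions(ocp_range=ocp_range(offset))
#     stream = batch_client.file.get_from_task(
#         job_id, task_id, "stdout.txt", file_get_from_task_options=opts)
#     data = b"".join(stream)
#     print(data.decode(), end="")
#     offset = next_offset(offset, data)
#     time.sleep(5)
```

Tracking the offset means each poll fetches only the bytes written since the last call, rather than re-downloading the whole file.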