I added the benchmark directive to some rules in my snakemake workflow, and the resulting files have the following header:
s h:m:s max_rss max_vms max_uss max_pss io_in io_out mean_load
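The only documentation I've found mentions a "benchmark txt file (which will contain a tab-separated table of run times and memory usage in MiB)".
I can guess that columns 1 and 2 are two different ways of displaying the time it took to execute the rule (in seconds, and converted to hours, minutes and seconds).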
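io_in and io_out probably relate to disk read and write activity, but in what units are they measured? What about the other columns? Is this documented somewhere?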
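The guess about the first two columns can be checked with a short sketch. It assumes that h:m:s is simply the whole-second runtime rendered the way Python's datetime.timedelta prints it; that is an assumption, not something the documentation quoted above states, and to_hms is a hypothetical helper name.

```python
from datetime import timedelta


def to_hms(seconds):
    """Render a runtime in seconds as h:m:s, dropping the fractional part.

    Assumption: the h:m:s column is just the s column reformatted.
    """
    return str(timedelta(seconds=int(seconds)))


print(to_hms(3725.4))  # 1:02:05
```

If the two columns of a real benchmark file agree under this conversion, they carry the same information in two formats.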
Edit: Looking at the source code
I found the following code in /snakemake/benchmark.py, which is most likely where the benchmark data comes from:

    def _update_record(self):
        """Perform the actual measurement"""
        # Memory measurements
        rss, vms, uss, pss = 0, 0, 0, 0
        # I/O measurements
        io_in, io_out = 0, 0
        # CPU seconds
        cpu_seconds = 0
        # Iterate over process and all children
        try:
            main = psutil.Process(self.pid)
            this_time = time.time()
            for proc in chain((main,), main.children(recursive=True)):
                meminfo = proc.memory_full_info()
                rss += meminfo.rss
                vms += meminfo.vms
                uss += meminfo.uss
                pss += meminfo.pss
                ioinfo = proc.io_counters()
                io_in += ioinfo.read_bytes
                io_out += ioinfo.write_bytes
                if self.bench_record.prev_time:
                    cpu_seconds += proc.cpu_percent() / 100 * (
                        this_time - self.bench_record.prev_time)
            self.bench_record.prev_time = this_time
            if not self.bench_record.first_time:
                self.bench_record.prev_time = this_time
            rss /= 1024 * 1024
            vms /= 1024 * 1024
            uss /= 1024 * 1024
            pss /= 1024 * 1024
            io_in /= 1024 * 1024
            io_out /= 1024 * 1024
        except psutil.Error as e:
            return
        # Update benchmark record's RSS and VMS
        self.bench_record.max_rss = max(self.bench_record.max_rss or 0, rss)
        self.bench_record.max_vms = max(self.bench_record.max_vms or 0, vms)
        self.bench_record.max_uss = max(self.bench_record.max_uss or 0, uss)
        self.bench_record.max_pss = max(self.bench_record.max_pss or 0, pss)
        self.bench_record.io_in = io_in
        self.bench_record.io_out = io_out
        self.bench_record.cpu_seconds += cpu_seconds

So this appears to come from functionality provided by psutil.

Best answer
Benchmarking in snakemake could certainly be better documented, but psutil itself is documented. From its documentation:
get_memory_info()
Return a tuple representing RSS (Resident Set Size) and VMS (Virtual Memory Size) in bytes.
On UNIX RSS and VMS are the same values shown by ps.
On Windows RSS and VMS refer to "Mem Usage" and "VM Size" columns of taskmgr.exe.
psutil.disk_io_counters(perdisk=False)
Return system disk I/O statistics as a namedtuple including the following attributes:
read_count: number of reads
write_count: number of writes
read_bytes: number of bytes read
write_bytes: number of bytes written
read_time: time spent reading from disk (in milliseconds)
write_time: time spent writing to disk (in milliseconds)
The code you found confirms that all memory usage and I/O counts are reported in MiB (= bytes / (1024 * 1024)).
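As a rough illustration of what that unit means in practice, here is a minimal sketch that reads a benchmark table and converts the size columns back to bytes. It assumes the file is tab-separated with exactly the header shown in the question; the data row in SAMPLE is made up, and read_benchmark and the *_bytes keys are hypothetical names, not part of snakemake.

```python
import csv
import io

MIB = 1024 * 1024  # bytes per MiB, the divisor used in _update_record

# Hypothetical file contents: the header is from the question, the values are invented.
SAMPLE = (
    "s\th:m:s\tmax_rss\tmax_vms\tmax_uss\tmax_pss\tio_in\tio_out\tmean_load\n"
    "12.5\t0:00:12\t100.0\t250.5\t90.0\t95.0\t2.0\t0.5\t0.0\n"
)

# Columns that _update_record divides by 1024 * 1024 before storing.
SIZE_COLUMNS = ("max_rss", "max_vms", "max_uss", "max_pss", "io_in", "io_out")


def read_benchmark(handle):
    """Yield one dict per row, adding a <column>_bytes entry for each MiB column."""
    for row in csv.DictReader(handle, delimiter="\t"):
        record = dict(row)
        for name in SIZE_COLUMNS:
            record[name + "_bytes"] = int(float(row[name]) * MIB)
        yield record


rows = list(read_benchmark(io.StringIO(SAMPLE)))
print(rows[0]["max_rss_bytes"])  # 104857600
```

To use it on a real file, pass an open file handle instead of the io.StringIO wrapper.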
Regarding "psutil - Meaning of the benchmark variables in Snakemake", a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/46813371/