I have added a benchmark directive to some of the rules in my Snakemake workflow, and the resulting files have the following header:

s   h:m:s   max_rss max_vms max_uss max_pss io_in   io_out  mean_load

The only documentation I've found mentions a "benchmark txt file (which will contain a tab-separated table of run time and memory usage in MiB)".

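I can guess that columns 1 and 2 are two different ways of displaying the time taken to execute the rule (in seconds, and converted to hours, minutes and seconds).
io_in and io_out probably relate to disk read and write activity, but in what units are they measured?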

What are the others? Is this documented somewhere?

Edit: looking at the source code

I found the following code in /snakemake/benchmark.py, which is most likely where the benchmark data comes from:

def _update_record(self):
    """Perform the actual measurement"""
    # Memory measurements
    rss, vms, uss, pss = 0, 0, 0, 0
    # I/O measurements
    io_in, io_out = 0, 0
    # CPU seconds
    cpu_seconds = 0
    # Iterate over process and all children
    try:
        main = psutil.Process(self.pid)
        this_time = time.time()
        for proc in chain((main,), main.children(recursive=True)):
            meminfo = proc.memory_full_info()
            rss += meminfo.rss
            vms += meminfo.vms
            uss += meminfo.uss
            pss += meminfo.pss
            ioinfo = proc.io_counters()
            io_in += ioinfo.read_bytes
            io_out += ioinfo.write_bytes
            if self.bench_record.prev_time:
                cpu_seconds += proc.cpu_percent() / 100 * (
                    this_time - self.bench_record.prev_time)
        self.bench_record.prev_time = this_time
        if not self.bench_record.first_time:
            self.bench_record.prev_time = this_time
        rss /= 1024 * 1024
        vms /= 1024 * 1024
        uss /= 1024 * 1024
        pss /= 1024 * 1024
        io_in /= 1024 * 1024
        io_out /= 1024 * 1024
    except psutil.Error as e:
        return
    # Update benchmark record's RSS and VMS
    self.bench_record.max_rss = max(self.bench_record.max_rss or 0, rss)
    self.bench_record.max_vms = max(self.bench_record.max_vms or 0, vms)
    self.bench_record.max_uss = max(self.bench_record.max_uss or 0, uss)
    self.bench_record.max_pss = max(self.bench_record.max_pss or 0, pss)
    self.bench_record.io_in = io_in
    self.bench_record.io_out = io_out
    self.bench_record.cpu_seconds += cpu_seconds

So this appears to come from functionality provided by psutil.
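To make the unit conversion in the snippet above concrete, here is a minimal sketch of the same arithmetic. `bytes_to_mib` is a hypothetical helper introduced only for illustration (it is not part of Snakemake or psutil), and the byte counts are made-up values.

```python
def bytes_to_mib(n_bytes):
    """Convert a byte count to mebibytes, mirroring `x /= 1024 * 1024` above."""
    return n_bytes / (1024 * 1024)


# Hypothetical readings, expressed in bytes as in the snippet above.
rss_bytes = 512 * 1024 * 1024
read_bytes = 3 * 1024 * 1024 // 2

print(bytes_to_mib(rss_bytes))   # 512.0
print(bytes_to_mib(read_bytes))  # 1.5
```

Dividing by 1024 * 1024 gives MiB (mebibytes), which matches the "MiB" wording in the documentation quoted earlier.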

Best answer

Benchmarking in Snakemake could definitely be better documented, but psutil itself is documented. From the psutil documentation:

get_memory_info()
Return a tuple representing RSS (Resident Set Size) and VMS (Virtual Memory Size) in bytes.
On UNIX RSS and VMS are the same values shown by ps.
On Windows RSS and VMS refer to "Mem Usage" and "VM Size" columns of taskmgr.exe.

psutil.disk_io_counters(perdisk=False)

Return system disk I/O statistics as a namedtuple including the following attributes:
    read_count: number of reads
    write_count: number of writes
    read_bytes: number of bytes read
    write_bytes: number of bytes written
    read_time: time spent reading from disk (in milliseconds)
    write_time: time spent writing to disk (in milliseconds)

The code you found confirms that all memory usage and I/O counts are reported in MiB (= bytes / (1024 * 1024)).
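As a rough illustration of how these columns can be read back, the sketch below parses benchmark text with the header from the question. The data row in `SAMPLE` is invented for the example, and `read_benchmark` is a hypothetical helper rather than a Snakemake function. It assumes every column except `h:m:s` holds a plain number; a real file with placeholder values would need extra handling.

```python
import csv
import io

# Made-up benchmark contents using the header from the question;
# the numeric values are illustrative only.
SAMPLE = (
    "s\th:m:s\tmax_rss\tmax_vms\tmax_uss\tmax_pss\tio_in\tio_out\tmean_load\n"
    "83.5\t0:01:23\t512.25\t1024.5\t500.0\t505.75\t12.5\t3.0\t97.2\n"
)


def read_benchmark(text):
    """Parse tab-separated benchmark text into a list of dicts."""
    rows = []
    for row in csv.DictReader(io.StringIO(text), delimiter="\t"):
        parsed = {}
        for key, value in row.items():
            # "h:m:s" is a formatted duration, so it stays a string;
            # all other columns are assumed to be numeric.
            parsed[key] = value if key == "h:m:s" else float(value)
        rows.append(parsed)
    return rows


rows = read_benchmark(SAMPLE)
first = rows[0]
print(f"wall time: {first['s']} s ({first['h:m:s']})")
print(f"peak RSS: {first['max_rss']} MiB")
print(f"I/O: {first['io_in']} MiB read, {first['io_out']} MiB written")
```

For a file on disk, the same function could be fed the result of reading the file as text.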

Regarding "psutil - Meaning of the benchmark variables in Snakemake", the original question can be found on Stack Overflow: https://stackoverflow.com/questions/46813371/
