Multiple pipes in subprocess


Question

I am trying to use Sailfish, which takes multiple FASTQ files as arguments, in a ruffus pipeline. I run Sailfish with Python's subprocess module, but the <() process substitution in the subprocess call does not work even when I set shell=True.

This is the command I want to execute from Python:

sailfish quant [options] -1 <(cat sample1a.fastq sample1b.fastq) -2 <(cat sample2a.fastq sample2b.fastq) -o [output_file]

Or (preferably):

sailfish quant [options] -1 <(gunzip -c sample1a.fastq.gz sample1b.fastq.gz) -2 <(gunzip -c sample2a.fastq.gz sample2b.fastq.gz) -o [output_file]

In general:

someprogram <(someprocess) <(someprocess)

How would I go about doing this in Python? Is subprocess the right approach?

Recommended answer

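To emulate bash process substitution, run the command through bash explicitly. With shell=True alone, subprocess uses /bin/sh, which may not support <():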

#!/usr/bin/env python
from subprocess import check_call

check_call('someprogram <(someprocess) <(anotherprocess)',
           shell=True, executable='/bin/bash')
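When the real command contains file names (as in the Sailfish call from the question), the command string has to be quoted correctly before it is handed to bash. The following is a minimal sketch of one way to do that; build_command and run_with_process_substitution are hypothetical helper names, not part of subprocess, and the sketch assumes bash is installed at /bin/bash.

```python
import shlex
from subprocess import check_output


def build_command(program, producers):
    """Build 'program <(producer1) <(producer2) ...' with every argument quoted.

    program and each producer are argument lists, e.g. ['cat', 'a.fastq'].
    """
    def quote(args):
        return ' '.join(shlex.quote(arg) for arg in args)

    substitutions = ['<(%s)' % quote(cmd) for cmd in producers]
    return ' '.join([quote(program)] + substitutions)


def run_with_process_substitution(program, producers):
    """Run the command through bash (assumed at /bin/bash) and return its stdout."""
    command = build_command(program, producers)
    return check_output(command, shell=True, executable='/bin/bash')
```

For example, build_command(['cat'], [['echo', 'a'], ['echo', 'b']]) produces the string cat <(echo a) <(echo b), which is then passed to bash unchanged.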

In pure Python, you could use named pipes:

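#!/usr/bin/env python
from subprocess import Popen

# named_pipes() is the context manager defined below
with named_pipes(n=2) as paths:
    # start the consumer first so that it opens the FIFOs for reading
    someprogram = Popen(['someprogram'] + paths)
    processes = []
    for path, command in zip(paths, ['someprocess', 'anotherprocess']):
        # opening a FIFO for writing blocks until the reader opens it
        with open(path, 'wb', 0) as pipe:
            processes.append(Popen(command, stdout=pipe, close_fds=True))
    for p in [someprogram] + processes:
        p.wait()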

where named_pipes(n) is:

import os
import shutil
import tempfile
from contextlib import contextmanager

@contextmanager
def named_pipes(n=1):
    dirname = tempfile.mkdtemp()
    try:
        paths = [os.path.join(dirname, 'named_pipe' + str(i)) for i in range(n)]
        for path in paths:
            os.mkfifo(path)
        yield paths
    finally:
        shutil.rmtree(dirname)
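To try the named-pipe approach without Sailfish, the two pieces can be combined into a single script. This is a sketch only: it uses cat as a stand-in for someprogram, takes the producer commands as argument lists, and captures the consumer's output; concat_via_fifos is a hypothetical name chosen for illustration. It assumes a POSIX system where os.mkfifo and cat are available.

```python
import os
import shutil
import tempfile
from contextlib import contextmanager
from subprocess import PIPE, Popen


@contextmanager
def named_pipes(n=1):
    """Create n FIFOs in a temporary directory and remove them afterwards."""
    dirname = tempfile.mkdtemp()
    try:
        paths = [os.path.join(dirname, 'named_pipe' + str(i)) for i in range(n)]
        for path in paths:
            os.mkfifo(path)
        yield paths
    finally:
        shutil.rmtree(dirname)


def concat_via_fifos(producers):
    """Feed each producer's stdout to `cat` through its own FIFO; return cat's output."""
    with named_pipes(len(producers)) as paths:
        # cat stands in for someprogram; it reads the FIFOs in order
        consumer = Popen(['cat'] + paths, stdout=PIPE)
        writers = []
        for path, command in zip(paths, producers):
            # open() blocks here until cat opens this FIFO for reading
            with open(path, 'wb', 0) as pipe:
                writers.append(Popen(command, stdout=pipe))
        output = consumer.communicate()[0]
        for writer in writers:
            writer.wait()
    return output
```

Calling concat_via_fifos([['echo', 'first'], ['echo', 'second']]) mirrors cat <(echo first) <(echo second) in bash.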

Another, preferable way to implement bash process substitution (no need to create a named entry on disk) is to use /dev/fd/N filenames, if they are available, as suggested by @Dunes. On FreeBSD, fdescfs(5) (/dev/fd/#) creates entries for all file descriptors opened by the process. To test availability, run:

$ test -r /dev/fd/3 3</dev/null && echo /dev/fd is available

If it fails, try to symlink /dev/fd to proc(5), as is done on some Linux systems:

$ ln -s /proc/self/fd /dev/fd
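The same availability check can be made from Python before choosing between /dev/fd and named pipes. This is a rough sketch that mirrors the shell test above; dev_fd_available is a hypothetical helper name introduced here for illustration.

```python
import os


def dev_fd_available():
    """Return True if an open descriptor is readable through /dev/fd/N."""
    fd = os.open(os.devnull, os.O_RDONLY)  # any open descriptor will do
    try:
        return os.access('/dev/fd/%d' % fd, os.R_OK)
    finally:
        os.close(fd)
```

A script could call dev_fd_available() once at startup and fall back to the named-pipe version when it returns False.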

Here is a /dev/fd-based implementation of the bash command someprogram <(someprocess) <(anotherprocess):

#!/usr/bin/env python3
from contextlib import ExitStack
from subprocess import CalledProcessError, Popen, PIPE

def kill(process):
    if process.poll() is None: # still running
        process.kill()

with ExitStack() as stack: # for proper cleanup
    processes = []
    for command in [['someprocess'], ['anotherprocess']]:  # start child processes
        processes.append(stack.enter_context(Popen(command, stdout=PIPE)))
        stack.callback(kill, processes[-1]) # kill on someprogram exit

    fds = [p.stdout.fileno() for p in processes]
    someprogram = stack.enter_context(
        Popen(['someprogram'] + ['/dev/fd/%d' % fd for fd in fds], pass_fds=fds))
    for p in processes: # close pipes in the parent
        p.stdout.close()
# exit stack: wait for processes
if someprogram.returncode != 0: # errors shouldn't go unnoticed
    raise CalledProcessError(someprogram.returncode, someprogram.args)

Note: on my Ubuntu machine, the subprocess code works only in Python 3.4+, despite pass_fds being available since Python 3.2.
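The /dev/fd approach can be tried with standard tools before wiring it into a real pipeline. The sketch below is a simplified variant that uses cat in place of someprogram and echo-style commands as producers, and additionally captures the consumer's output; concat_via_dev_fd is a hypothetical name. It assumes Python 3 and a system where /dev/fd is available.

```python
from contextlib import ExitStack
from subprocess import PIPE, Popen


def concat_via_dev_fd(producers):
    """Run `cat /dev/fd/N ...` over the producers' stdout pipes.

    Returns (returncode, output) of the cat process.
    """
    with ExitStack() as stack:
        # start the producers; each one writes into its own pipe
        processes = [stack.enter_context(Popen(command, stdout=PIPE))
                     for command in producers]
        fds = [p.stdout.fileno() for p in processes]
        # pass_fds keeps the read ends open in the consumer
        consumer = stack.enter_context(
            Popen(['cat'] + ['/dev/fd/%d' % fd for fd in fds],
                  stdout=PIPE, pass_fds=fds))
        for p in processes:  # the parent no longer needs the read ends
            p.stdout.close()
        output = consumer.stdout.read()
    # leaving the stack has waited for every process
    return consumer.returncode, output
```

Calling concat_via_dev_fd([['echo', 'first'], ['echo', 'second']]) corresponds to cat <(echo first) <(echo second) in bash.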
