Problem description
I am trying to use Sailfish, which takes multiple fastq files as arguments, in a ruffus pipeline. I execute Sailfish using the subprocess module in Python, but <() in the subprocess call does not work even when I set shell=True.
This is the command I want to execute using Python:
sailfish quant [options] -1 <(cat sample1a.fastq sample1b.fastq) -2 <(cat sample2a.fastq sample2b.fastq) -o [output_file]
or (preferably):
sailfish quant [options] -1 <(gunzip -c sample1a.fastq.gz sample1b.fastq.gz) -2 <(gunzip -c sample2a.fastq.gz sample2b.fastq.gz) -o [output_file]
In general:
someprogram <(someprocess) <(someprocess)
How would I go about doing this in Python? Is subprocess the right approach?
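Recommended answer
To emulate the bash process substitution, run the command under bash explicitly (with shell=True alone, subprocess uses /bin/sh, which may not support <()):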
#!/usr/bin/env python
from subprocess import check_call

check_call('someprogram <(someprocess) <(anotherprocess)',
           shell=True, executable='/bin/bash')
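For the Sailfish command from the question, the bash command line could be assembled from argument lists so that file names are quoted safely. This is only a sketch: build_command and run_in_bash are hypothetical helper names, the [options] placeholder from the question is left out, and the file names are the ones used in the question.

```python
import shlex
from subprocess import check_call


def build_command(program, substitutions, trailing=()):
    """Build a bash command line that feeds each producer to `program` via <(...).

    program       -- list of arguments, e.g. ['sailfish', 'quant']
    substitutions -- list of (flag, producer_args) pairs
    trailing      -- arguments appended after the substitutions
    """
    parts = [shlex.quote(arg) for arg in program]
    for flag, producer in substitutions:
        inner = ' '.join(shlex.quote(arg) for arg in producer)
        parts.append('%s <(%s)' % (shlex.quote(flag), inner))
    parts.extend(shlex.quote(arg) for arg in trailing)
    return ' '.join(parts)


def run_in_bash(command):
    """Run the command line under bash, which understands <(...)."""
    check_call(command, shell=True, executable='/bin/bash')


if __name__ == '__main__':
    command = build_command(
        ['sailfish', 'quant'],
        [('-1', ['gunzip', '-c', 'sample1a.fastq.gz', 'sample1b.fastq.gz']),
         ('-2', ['gunzip', '-c', 'sample2a.fastq.gz', 'sample2b.fastq.gz'])],
        trailing=['-o', 'output_dir'],  # hypothetical output location
    )
    print(command)  # pass the string to run_in_bash() to execute it
```

Quoting each argument with shlex.quote keeps file names with spaces or shell metacharacters from being reinterpreted by bash.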
In Python, you could use named pipes:
#!/usr/bin/env python
from subprocess import Popen

with named_pipes(n=2) as paths:
    someprogram = Popen(['someprogram'] + paths)
    processes = []
    for path, command in zip(paths, ['someprocess', 'anotherprocess']):
        with open(path, 'wb', 0) as pipe:
            processes.append(Popen(command, stdout=pipe, close_fds=True))
    for p in [someprogram] + processes:
        p.wait()
where named_pipes(n) is:
import os
import shutil
import tempfile
from contextlib import contextmanager

@contextmanager
def named_pipes(n=1):
    dirname = tempfile.mkdtemp()
    try:
        paths = [os.path.join(dirname, 'named_pipe' + str(i)) for i in range(n)]
        for path in paths:
            os.mkfifo(path)
        yield paths
    finally:
        shutil.rmtree(dirname)
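To try the named-pipe approach end to end, the two pieces above can be combined into one script. The sketch below assumes a POSIX system with cat on the PATH; cat stands in for someprogram, the producers are small Python one-liners, and run_with_named_pipes is a hypothetical helper name.

```python
import os
import shutil
import sys
import tempfile
from contextlib import contextmanager
from subprocess import PIPE, Popen


@contextmanager
def named_pipes(n=1):
    """Create n FIFOs in a temporary directory and remove them on exit."""
    dirname = tempfile.mkdtemp()
    try:
        paths = [os.path.join(dirname, 'named_pipe' + str(i)) for i in range(n)]
        for path in paths:
            os.mkfifo(path)
        yield paths
    finally:
        shutil.rmtree(dirname)


def run_with_named_pipes(consumer, producers):
    """Run `consumer` with one FIFO path per producer; return its stdout."""
    with named_pipes(len(producers)) as paths:
        consumer_proc = Popen(consumer + paths, stdout=PIPE)
        procs = []
        for path, command in zip(paths, producers):
            # open() blocks here until the consumer opens this FIFO for reading
            with open(path, 'wb', 0) as pipe:
                procs.append(Popen(command, stdout=pipe, close_fds=True))
        output, _ = consumer_proc.communicate()
        for p in procs:
            p.wait()
    return output


if __name__ == '__main__':
    producers = [
        [sys.executable, '-c', 'print("first")'],
        [sys.executable, '-c', 'print("second")'],
    ]
    print(run_with_named_pipes(['cat'], producers))
```

One caveat of this design: if the consumer exits without opening one of the pipes, the open() call in the parent blocks indefinitely, so this pattern suits programs that are known to read every input file.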
Another and more preferable way (no need to create a named entry on disk) to implement the bash process substitution is to use /dev/fd/N filenames (if they are available), as suggested by @Dunes. On FreeBSD, fdescfs(5) (/dev/fd/#) creates entries for all file descriptors opened by the process. To test availability, run:
$ test -r /dev/fd/3 3</dev/null && echo /dev/fd is available
If it fails, try to symlink /dev/fd to proc(5), as is done on some Linux systems:
$ ln -s /proc/self/fd /dev/fd
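The same availability check can be made from Python before choosing between /dev/fd and named pipes. This is a minimal sketch that mirrors the shell test above; dev_fd_available is a hypothetical helper name.

```python
import os


def dev_fd_available():
    """Return True if an open descriptor shows up as a readable /dev/fd/N entry."""
    fd = os.open(os.devnull, os.O_RDONLY)
    try:
        return os.access('/dev/fd/%d' % fd, os.R_OK)
    finally:
        os.close(fd)


if __name__ == '__main__':
    print('/dev/fd is available' if dev_fd_available()
          else '/dev/fd is not available')
```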
Here's a /dev/fd-based implementation of the someprogram <(someprocess) <(anotherprocess) bash command:
#!/usr/bin/env python3
from contextlib import ExitStack
from subprocess import CalledProcessError, Popen, PIPE

def kill(process):
    if process.poll() is None:  # still running
        process.kill()

with ExitStack() as stack:  # for proper cleanup
    processes = []
    for command in [['someprocess'], ['anotherprocess']]:  # start child processes
        processes.append(stack.enter_context(Popen(command, stdout=PIPE)))
        stack.callback(kill, processes[-1])  # kill on someprogram exit

    fds = [p.stdout.fileno() for p in processes]
    someprogram = stack.enter_context(
        Popen(['someprogram'] + ['/dev/fd/%d' % fd for fd in fds], pass_fds=fds))
    for p in processes:  # close pipes in the parent
        p.stdout.close()
# exit stack: wait for processes
if someprogram.returncode != 0:  # errors shouldn't go unnoticed
    raise CalledProcessError(someprogram.returncode, someprogram.args)
Note: on my Ubuntu machine, the subprocess code works only in Python 3.4+, despite pass_fds being available since Python 3.2.
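Since Sailfish expects each input after a flag (-1, -2), the /dev/fd approach could be wrapped in a function that takes (flag, producer) pairs. The following is a sketch under the assumption that /dev/fd is available; run_with_dev_fd and kill_if_running are hypothetical names, and a small Python reader stands in for Sailfish so that the script runs without it.

```python
import sys
from contextlib import ExitStack
from subprocess import PIPE, Popen


def kill_if_running(process):
    if process.poll() is None:  # still running
        process.kill()


def run_with_dev_fd(program, inputs):
    """Run `program` with a `flag /dev/fd/N` pair appended per (flag, producer).

    Returns (returncode, captured stdout bytes) of `program`.
    """
    with ExitStack() as stack:
        args = list(program)
        producers, fds = [], []
        for flag, command in inputs:
            producer = stack.enter_context(Popen(command, stdout=PIPE))
            stack.callback(kill_if_running, producer)  # kill on consumer exit
            producers.append(producer)
            fds.append(producer.stdout.fileno())
            args += [flag, '/dev/fd/%d' % fds[-1]]
        consumer = stack.enter_context(Popen(args, stdout=PIPE, pass_fds=fds))
        for producer in producers:
            producer.stdout.close()  # the parent no longer needs the read ends
        output = consumer.stdout.read()
    # leaving the stack waits for the consumer, then cleans up the producers
    return consumer.returncode, output


# Stand-in consumer: prints the contents of every path that follows a flag.
READER = (
    'import sys\n'
    'for path in sys.argv[2::2]:\n'
    '    with open(path) as f:\n'
    '        sys.stdout.write(f.read())\n'
)

if __name__ == '__main__':
    returncode, output = run_with_dev_fd(
        [sys.executable, '-c', READER],
        [('-1', [sys.executable, '-c', 'print("first")']),
         ('-2', [sys.executable, '-c', 'print("second")'])],
    )
    print(returncode, output)
```

For the command in the question, program would be the sailfish quant argument list and each producer something like ['cat', 'sample1a.fastq', 'sample1b.fastq']; whether Sailfish reads such inputs in a single sequential pass is not something this sketch verifies.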