Question
Hi, I'm trying to call the following command from Python:
comm -3 <(awk '{print $1}' File1.txt | sort | uniq) <(awk '{print $1}' File2.txt | sort | uniq) | grep -v "#" | sed "s/\t//g"
How can I make this call when the inputs to the comm command are themselves piped?
Is there an easy and straightforward way to do it?
I tried the subprocess module:
subprocess.call("comm -3 <(awk '{print $1}' File1.txt | sort | uniq) <(awk '{print $1}' File2.txt | sort | uniq) | grep -v '#' | sed 's/\t//g'")
Without success; it says: OSError: [Errno 2] No such file or directory
Or do I have to create the individual calls separately and then connect them using PIPE, as described in the subprocess documentation:
p1 = Popen(["dmesg"], stdout=PIPE)
p2 = Popen(["grep", "hda"], stdin=p1.stdout, stdout=PIPE)
p1.stdout.close() # Allow p1 to receive a SIGPIPE if p2 exits.
output = p2.communicate()[0]
Recommended answer
Process substitution (<()) is not part of POSIX sh; it is a bash feature (ksh and zsh also provide it). Thus, you need a shell, but it can't be just any shell (like /bin/sh, as used by shell=True on non-Windows platforms): it needs to be bash. The OSError in the question arises because a command string passed without shell=True is treated as the name of a single program to execute.
subprocess.call(['bash', '-c', "comm -3 <(awk '{print $1}' File1.txt | sort | uniq) <(awk '{print $1}' File2.txt | sort | uniq) | grep -v '#' | sed 's/\t//g'"])
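As an alternative way of getting bash rather than /bin/sh, the shell=True route can be kept by pointing the executable argument at bash. The following is only a minimal sketch: it assumes Python 3.5+ on a POSIX system with bash installed at /bin/bash, and the helper name run_in_bash is made up for illustration.

```python
import subprocess


def run_in_bash(command, bash_path="/bin/bash"):
    """Run a command string under bash and return its stdout as text.

    bash_path is an assumption; adjust it if bash lives elsewhere.
    """
    completed = subprocess.run(
        command,
        shell=True,
        executable=bash_path,      # replaces the default /bin/sh
        stdout=subprocess.PIPE,
        universal_newlines=True,   # decode stdout to str
        check=True,                # raise CalledProcessError on non-zero exit
    )
    return completed.stdout


if __name__ == "__main__":
    # Process substitution works here because the shell really is bash.
    print(run_in_bash("cat <(echo one) <(echo two)"), end="")
```

The same caveat applies as with bash -c: the command is still a single string parsed by a shell, so untrusted values should not be interpolated into it.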
By the way, if you're going this route with arbitrary filenames, pass them out-of-band (as below: passing _ as $0, File1.txt as $1, and File2.txt as $2):
subprocess.call(['bash', '-c',
                 '''comm -3 <(awk '{print $1}' "$1" | sort | uniq) '''
                 ''' <(awk '{print $1}' "$2" | sort | uniq) '''
                 ''' | grep -v '#' | tr -d "\t"''',
                 '_', "File1.txt", "File2.txt"])
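To illustrate why passing names out-of-band matters, here is a hedged sketch that wraps the same pipeline in a function. The names BASH_SCRIPT and first_column_diff are hypothetical, sort -u stands in for sort | uniq, and bash is assumed to be on the PATH. Because the filenames arrive as "$1" and "$2", they are never parsed as shell code, so spaces or quotes in them are harmless.

```python
import subprocess

# The script text is fixed; only the positional parameters vary.
# A raw string keeps '\t' intact so that tr (not Python) interprets it.
BASH_SCRIPT = r'''
comm -3 <(awk '{print $1}' "$1" | sort -u) \
        <(awk '{print $1}' "$2" | sort -u) |
  grep -v '#' | tr -d '\t'
'''


def first_column_diff(path1, path2):
    """Return first-column values that appear in only one of the two files."""
    out = subprocess.check_output(
        ['bash', '-c', BASH_SCRIPT,
         '_',            # becomes $0 (a placeholder name)
         path1, path2])  # become $1 and $2
    return out.decode().splitlines()
```

A call such as first_column_diff("my file.txt", "other 'file'.txt") would work without any quoting on the Python side.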
That said, the best-practices approach is indeed to set up the chain yourself. The code below was tested with Python 3.6 (note the need for the pass_fds argument to subprocess.Popen to make the file descriptors referred to via /dev/fd/## links available to comm):
import subprocess

# Print the first field of each line without '#', once per distinct value.
awk_filter = '''! /#/ && !seen[$1]++ { print $1 }'''

# File1.txt: awk | sort -u
p1 = subprocess.Popen(['awk', awk_filter],
                      stdin=open('File1.txt', 'r'),
                      stdout=subprocess.PIPE)
p2 = subprocess.Popen(['sort', '-u'],
                      stdin=p1.stdout,
                      stdout=subprocess.PIPE)

# File2.txt: awk | sort -u
p3 = subprocess.Popen(['awk', awk_filter],
                      stdin=open('File2.txt', 'r'),
                      stdout=subprocess.PIPE)
p4 = subprocess.Popen(['sort', '-u'],
                      stdin=p3.stdout,
                      stdout=subprocess.PIPE)

# comm reads both sorted streams through /dev/fd paths.
p5 = subprocess.Popen(['comm', '-3',
                       ('/dev/fd/%d' % (p2.stdout.fileno(),)),
                       ('/dev/fd/%d' % (p4.stdout.fileno(),))],
                      pass_fds=(p2.stdout.fileno(), p4.stdout.fileno()),
                      stdout=subprocess.PIPE)
p6 = subprocess.Popen(['tr', '-d', '\t'],
                      stdin=p5.stdout,
                      stdout=subprocess.PIPE)
result = p6.communicate()  # (stdout_bytes, None)
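The pass_fds mechanism is easier to see in isolation. The sketch below is not part of the pipeline above; it assumes a system that provides /dev/fd (Linux and macOS do) and uses a made-up helper name, read_via_dev_fd. It hands the read end of a pipe to cat, which opens it by path.

```python
import os
import subprocess


def read_via_dev_fd(data):
    """Send bytes through a pipe that the child opens via its /dev/fd path."""
    read_fd, write_fd = os.pipe()
    try:
        # Small payloads only: this write must fit in the pipe buffer.
        os.write(write_fd, data)
    finally:
        # Closing the write end lets the child see end-of-file.
        os.close(write_fd)
    try:
        return subprocess.check_output(
            ['cat', '/dev/fd/%d' % read_fd],
            # Python 3 closes extra descriptors in the child by default;
            # pass_fds keeps this one open under the same number.
            pass_fds=(read_fd,),
        )
    finally:
        os.close(read_fd)
```

Dropping pass_fds from this call would typically make cat fail, because the descriptor number named in the path would no longer be open in the child.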
This is a lot more code, but (assuming that the filenames are parameterized in the real world) it's also safer code: you aren't vulnerable to bugs like Shellshock that are triggered by the simple act of starting a shell, and you don't need to worry about passing variables out-of-band to avoid injection attacks (except in the context of arguments to commands, like awk, that are scripting language interpreters themselves).
That said, another thing to think about is just implementing the whole thing in native Python.
lines_1 = set(line.split()[0] for line in open('File1.txt', 'r') if '#' not in line)
lines_2 = set(line.split()[0] for line in open('File2.txt', 'r') if '#' not in line)
not_common = (lines_1 - lines_2) | (lines_2 - lines_1)
for line in sorted(not_common):
    print(line)
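The native-Python idea can also be packaged as a reusable function. This is one possible sketch, with hypothetical names first_fields and not_common. It skips blank lines (an assumption on my part, since line.split()[0] would raise IndexError on them), closes the files explicitly, and uses the set operator ^ for the symmetric difference.

```python
def first_fields(path):
    """Return the set of first whitespace-separated fields in a file,
    skipping blank lines and lines that contain '#'."""
    fields = set()
    with open(path) as handle:
        for line in handle:
            parts = line.split()
            if parts and '#' not in line:
                fields.add(parts[0])
    return fields


def not_common(path1, path2):
    """Return the sorted first fields that occur in exactly one file."""
    return sorted(first_fields(path1) ^ first_fields(path2))
```

Note that sorted() here orders by Unicode code point, which may differ from the locale-dependent order that sort and comm produce.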