Sorting rps-blast results by position of the hit

Question

I'm just starting out with Biopython and have a question about parsing results. I followed a tutorial to get going, and this is the code I used:

from Bio.Blast import NCBIXML

# Parse the rps-blast XML output and print every HSP of every hit.
with open("/Users/jcastrof/blast/pruebarpsb.xml") as handle:
    for record in NCBIXML.parse(handle):
        if record.alignments:
            print("Query: %s..." % record.query[:60])
            for align in record.alignments:
                for hsp in align.hsps:
                    print(" %s HSP,e=%f, from position %i to %i"
                          % (align.hit_id, hsp.expect, hsp.query_start, hsp.query_end))

Part of the output is:

 gnl|CDD|225858 HSP,e=0.000000, from position 32 to 1118
 gnl|CDD|225858 HSP,e=0.000000, from position 1775 to 2671
 gnl|CDD|214836 HSP,e=0.000000, from position 37 to 458
 gnl|CDD|214836 HSP,e=0.000000, from position 1775 to 2192
 gnl|CDD|214838 HSP,e=0.000000, from position 567 to 850

What I want to do is sort that output by the position of the hit (Hsp_hit-from), like this:

 gnl|CDD|225858 HSP,e=0.000000, from position 32 to 1118
 gnl|CDD|214836 HSP,e=0.000000, from position 37 to 458
 gnl|CDD|214838 HSP,e=0.000000, from position 567 to 850
 gnl|CDD|225858 HSP,e=0.000000, from position 1775 to 2671
 gnl|CDD|214836 HSP,e=0.000000, from position 1775 to 2192

My input file for rps-blast is a *.xml file. Any suggestions on how to proceed?

Thanks!

Recommended answer

The list of HSPs is an ordinary Python list, so it can be sorted in the usual way. Try:

align.hsps.sort(key = lambda hsp: hsp.query_start)
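As a quick check of what that call does, here is a minimal sketch that runs without Biopython or a BLAST file. The `SimpleNamespace` object is a hypothetical stand-in that only mimics the `hit_id`, `hsps`, `query_start` and `query_end` attributes used above; the real objects come from `NCBIXML.parse`.

```python
from types import SimpleNamespace

# Hypothetical stand-in for one alignment; real ones come from NCBIXML.parse().
align = SimpleNamespace(
    hit_id="gnl|CDD|225858",
    hsps=[
        SimpleNamespace(query_start=1775, query_end=2671),
        SimpleNamespace(query_start=32, query_end=1118),
    ],
)

# Sort this one alignment's HSPs in place by their start position on the query.
align.hsps.sort(key=lambda hsp: hsp.query_start)

starts = [hsp.query_start for hsp in align.hsps]
print(starts)  # [32, 1775]
```

Note that this only reorders the HSPs inside a single alignment; the alignments themselves stay in their original order.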

However, you are dealing with a nested list (each hit has its own list of HSPs), and you want to sort across all of them. Building your own flat list is probably the best approach here, something like this:

for record in ...:
    print("Query: %s..." % record.query[:60])
    # Flatten every HSP of every alignment into one list of tuples, then sort it.
    # The outer loop (alignments) must come first in the generator expression.
    hits = sorted((hsp.query_start, hsp.query_end, hsp.expect, align.hit_id)
                  for align in record.alignments for hsp in align.hsps)
    for q_start, q_end, expect, hit_id in hits:
        print(" %s HSP,e=%f, from position %i to %i"
              % (hit_id, expect, q_start, q_end))

Peter
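The flattening idea can be tried on the five HSPs from the question without a BLAST file. This is only a sketch under stated assumptions: `make_alignment` and `hsps_by_query_start` are hypothetical helpers, and the `SimpleNamespace` objects mimic just the attributes used in this thread (`hit_id`, `hsps`, `query_start`, `query_end`, `expect`). It sorts with a key on `query_start` rather than sorting plain tuples.

```python
from types import SimpleNamespace


def make_alignment(hit_id, spans):
    """Build a hypothetical stand-in exposing only hit_id and hsps."""
    hsps = [SimpleNamespace(query_start=start, query_end=end, expect=0.0)
            for start, end in spans]
    return SimpleNamespace(hit_id=hit_id, hsps=hsps)


def hsps_by_query_start(alignments):
    """Flatten all (alignment, hsp) pairs and sort them by query start."""
    pairs = [(align, hsp) for align in alignments for hsp in align.hsps]
    return sorted(pairs, key=lambda pair: pair[1].query_start)


# Same hits and positions as in the question's output, in file order.
alignments = [
    make_alignment("gnl|CDD|225858", [(32, 1118), (1775, 2671)]),
    make_alignment("gnl|CDD|214836", [(37, 458), (1775, 2192)]),
    make_alignment("gnl|CDD|214838", [(567, 850)]),
]

ordered = hsps_by_query_start(alignments)
for align, hsp in ordered:
    print(" %s HSP,e=%f, from position %i to %i"
          % (align.hit_id, hsp.expect, hsp.query_start, hsp.query_end))
```

Because `sorted` is stable, the two HSPs that both start at 1775 keep their original relative order, which reproduces the ordering requested in the question. Sorting plain tuples, by contrast, breaks ties on the next tuple field (here the end position).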

