
Question

I was wondering if there is a way to get protein sequences from UniProt protein IDs. I checked a few online tools, but they only return one sequence at a time, and I have 5536 values. Is there a package in Biopython that can do this?

Recommended answer

All UniProt sequences can be accessed at "http://www.uniprot.org/uniprot/" + UniprotID + ".fasta". You can obtain any sequence with:

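import requests as r
from Bio import SeqIO
from io import StringIO

cID = 'P04637'  # UniProt accession to download

baseUrl = "http://www.uniprot.org/uniprot/"
currentUrl = baseUrl + cID + ".fasta"
# Downloading is a read, so use GET rather than POST.
response = r.get(currentUrl)
response.raise_for_status()  # stop here on an HTTP error
cData = response.text  # FASTA record as one string

# Parse the FASTA text into a list of SeqRecord objects.
Seq = StringIO(cData)
pSeq = list(SeqIO.parse(Seq, 'fasta'))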

cID can be a single entry, or you can loop over a list of IDs and run the same request for each one. If you loop through a big list, add a delay between downloads so that you do not saturate the server. Hope it helps.
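For a list of several thousand IDs, the loop with a delay could look roughly like the sketch below. It is only an illustration under a few assumptions: the function names (`fetch_fasta`, `download_all`), the one-second default delay, and the 30-second timeout are made up for this example, and it uses the standard-library `urllib` instead of `requests` so that it runs without extra packages. The URL pattern is the one from the answer; whether the server redirects or rate-limits it is not checked here.

```python
import time
import urllib.request

# Same URL pattern as in the answer above.
BASE_URL = "http://www.uniprot.org/uniprot/"


def fetch_fasta(uniprot_id, timeout=30):
    """Download one entry as FASTA text with a plain GET request."""
    url = BASE_URL + uniprot_id + ".fasta"
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return response.read().decode("utf-8")


def download_all(ids, fetch=fetch_fasta, delay=1.0):
    """Fetch every ID in turn, pausing `delay` seconds between requests.

    Returns (records, failed): a dict mapping ID -> FASTA text, and a
    list of IDs whose download failed or did not return a FASTA record.
    """
    records, failed = {}, []
    for i, uniprot_id in enumerate(ids):
        if i > 0:
            time.sleep(delay)  # pause so the server is not saturated
        try:
            text = fetch(uniprot_id)
        except OSError:
            # urllib's URLError/HTTPError and socket timeouts are OSErrors.
            failed.append(uniprot_id)
            continue
        if text.startswith(">"):
            records[uniprot_id] = text
        else:
            failed.append(uniprot_id)
    return records, failed
```

Calling `records, failed = download_all(my_ids)` with a list of accession strings (here `my_ids` is a placeholder) keeps the failed IDs separate so they can be retried later. The downloaded texts can then be joined into one string and parsed with `SeqIO.parse(StringIO(...), 'fasta')` as in the answer.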

