Problem description
I've been trying to adapt my Python function to count groups of letters instead of single letters, and I'm having a bit of trouble. Here's the code I have for counting individual letters:
my_seq = "CTAAAGTCAACCTTCGGTTGACCTTGAAAGGGCCTTGGGAACCTTCGGTTGACCTTGAGGGTTCCCTAAGGGTT"
def count_letters(str):
    counts = {}
    for c in str:
        if c in counts:
            counts[c] += 1
        else:
            counts[c] = 1
    return counts

counts = count_letters(my_seq)
print(counts)
The function currently spits out counts for each individual letter. Right now it prints this:
{'C': 23, 'T': 30, 'G': 30, 'A': 20}
Ideally, I'd like it to print something like this:
{'CTA': 2, 'TAG': 3, 'CGC': 1, 'GAG': 2 ... }
I'm very new to Python and this is proving to be difficult.
This can be done pretty quickly using collections.Counter.
from collections import Counter
s = "CTAACAAC"
def chunk_string(s, n):
    return [s[i:i+n] for i in range(len(s)-n+1)]
counter = Counter(chunk_string(s, 3))
# Counter({'AAC': 2, 'ACA': 1, 'CAA': 1, 'CTA': 1, 'TAA': 1})
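If you would rather stay close to the original count_letters and end up with a plain dict, the same sliding-window idea can be sketched without Counter. The name count_groups is made up for this illustration and is not part of the question or of any library:

```python
def count_groups(seq, n):
    """Count every overlapping run of n characters in seq.

    count_groups is a hypothetical helper name used only for this sketch.
    """
    counts = {}
    # Slide a window of width n across the sequence, one position at a time.
    for i in range(len(seq) - n + 1):
        group = seq[i:i + n]
        # dict.get returns 0 the first time a group is seen.
        counts[group] = counts.get(group, 0) + 1
    return counts


print(count_groups("CTAACAAC", 3))
# {'CTA': 1, 'TAA': 1, 'AAC': 2, 'ACA': 1, 'CAA': 1}
```

Calling count_groups(my_seq, 3) on the sequence from the question should give the three-letter counts in the same dict form that count_letters produces for single letters.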
Edit: To elaborate on chunk_string: it takes a string s and a chunk size n as arguments. Each s[i:i+n] is a slice of the string that is n characters long. The loop iterates over the valid start indices where the string can be sliced (0 to len(s)-n, inclusive). The list comprehension collects all of these slices into a list. An equivalent method is:
def chunk_string(s, n):
    chunks = []
    last_index = len(s) - n
    for i in range(0, last_index + 1):
        chunks.append(s[i:i+n])
    return chunks
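To make the index range concrete, here is a small trace of the slicing for the example string. It only reuses chunk_string as defined above; the variable names chunks and short are chosen for this illustration:

```python
def chunk_string(s, n):
    # Same definition as in the answer: every overlapping slice of length n.
    return [s[i:i + n] for i in range(len(s) - n + 1)]


s = "CTAACAAC"
n = 3
last_index = len(s) - n  # 8 - 3 = 5, the last start index that still fits

# Print each start index next to the slice it produces.
for i in range(last_index + 1):
    print(i, s[i:i + n])
# 0 CTA
# 1 TAA
# 2 AAC
# 3 ACA
# 4 CAA
# 5 AAC

chunks = chunk_string(s, n)

# When the string is shorter than n, range() is empty and no chunks come back.
short = chunk_string("AC", 3)
print(short)
# []
```

The windows overlap, which is why AAC shows up twice (at indices 2 and 5) and is counted twice by Counter.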