Problem description
In Python, how can I iterate through a text file and count the number of occurrences of each letter? I realise I could just use a 'for x in file' statement to go through it and then set up 26 or so if/elif statements, but surely there is a better way to do it?
Thanks.
Solution
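from collections import Counter

# `file` is the path of the text file to read
with open(file) as f:
    c = Counter()
    for line in f:
        # Counter(line) counts every character in the line,
        # including spaces, punctuation and the trailing newline
        c += Counter(line)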
If the file is not too large, you can read all of it into memory as a string and convert it into a Counter object in one line of code:

    c = Counter(f.read())
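Both versions above count every character, not just letters. If only letters are wanted, with upper and lower case treated as the same, one possible sketch is shown here. The helper name `count_letters` and the path `sample.txt` are made up for illustration and are not part of the original answer. Note that `str.isalpha()` also accepts non-ASCII letters.

```python
from collections import Counter


def count_letters(lines):
    """Count alphabetic characters in an iterable of strings, ignoring case."""
    counts = Counter()
    for line in lines:
        # Fold to lowercase and keep only alphabetic characters.
        counts.update(ch for ch in line.lower() if ch.isalpha())
    return counts


# Usage with a file (the path "sample.txt" is a placeholder):
# with open("sample.txt", encoding="utf-8") as f:
#     letter_counts = count_letters(f)

letter_counts = count_letters(["Hello, World!\n", "abc ABC\n"])
print(letter_counts["l"])  # 3
print(letter_counts["a"])  # 2
```

Because the function accepts any iterable of strings, an open file object can be passed directly and is still read line by line.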
Example:
>>> c = Counter()
>>> c += Counter('aaabbbcccddd eee fff ggg')
>>> c
Counter({'a': 3, ' ': 3, 'c': 3, 'b': 3, 'e': 3, 'd': 3, 'g': 3, 'f': 3})
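>>> c += Counter('aaabbbccc')
>>> c
Counter({'a': 6, 'c': 6, 'b': 6, ' ': 3, 'e': 3, 'd': 3, 'g': 3, 'f': 3})
(The order in which entries with equal counts are displayed may differ between Python versions.)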
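Once the counts are in a Counter, turning them into relative frequencies or picking the most common letters takes only a few more lines. The following is a small sketch on an in-memory string. The sample text and variable names are chosen for illustration only.

```python
from collections import Counter

text = "hello world"
# Count letters only, skipping the space.
counts = Counter(ch for ch in text if ch.isalpha())

total = sum(counts.values())  # 10 letters in total
# Relative frequency of each letter (count divided by total letters).
frequencies = {ch: n / total for ch, n in counts.items()}

print(counts.most_common(2))  # [('l', 3), ('o', 2)]
print(frequencies["l"])       # 0.3
```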
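Or use the count() method of strings:

from string import ascii_lowercase  # ascii_lowercase == 'abcdefghijklmnopqrstuvwxyz'

# `file` is the path of the text file to read
with open(file) as f:
    text = f.read().strip()

dic = {}
for x in ascii_lowercase:
    # Only lowercase letters are counted; call text.lower() first to include uppercase
    dic[x] = text.count(x)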
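The count() approach can be wrapped in a small function that also folds case. This is a sketch and not part of the original answer. The name `count_ascii_letters` is hypothetical.

```python
from string import ascii_lowercase


def count_ascii_letters(text):
    """Return a dict mapping each of a-z to its number of occurrences in text."""
    lowered = text.lower()  # fold case so 'A' and 'a' are counted together
    # One str.count() call per letter, so the text is scanned 26 times.
    return {letter: lowered.count(letter) for letter in ascii_lowercase}


result = count_ascii_letters("Banana bread")
print(result["a"])  # 4
print(result["z"])  # 0
```

Unlike a Counter built from the text, the resulting dict always has all 26 keys, including letters that never occur. The trade-off is 26 passes over the text instead of one.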