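I ran the following code in Python to generate a list of words and their counts from a text file. How can I filter the words that have a count of only 1 out of the "frequency_list" variable?
Also, how can I export the print statement loop at the bottom to a CSV file?
Thanks in advance for any help.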
import re
import string

frequency = {}
# Read the whole file and lowercase it; the with block closes the file afterwards.
with open('Words.txt', 'r') as document_text:
    text_string = document_text.read().lower()
# Match words of 3 to 15 lowercase letters.
match_pattern = re.findall(r'\b[a-z]{3,15}\b', text_string)
for word in match_pattern:
    count = frequency.get(word, 0)
    frequency[word] = count + 1
frequency_list = frequency.keys()
for words in frequency_list:
    print(words, frequency[words])
Best answer
To filter out those words, one option is:
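# Keep only entries whose count is greater than 1.
# (Python 3 lambdas cannot unpack a tuple argument, so index the pair instead.)
frequency = dict(filter(lambda kv: kv[1] > 1, frequency.items()))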
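As an alternative sketch, the counting and the filtering can both be written more compactly with collections.Counter and a dict comprehension. The sample_text string and the name repeated are made up for illustration and stand in for the contents of Words.txt.

```python
import re
from collections import Counter

# Hypothetical sample text standing in for the contents of Words.txt.
sample_text = "The cat sat on the mat. The dog saw the cat."

# Same pattern as in the question: words of 3 to 15 lowercase letters.
words = re.findall(r'\b[a-z]{3,15}\b', sample_text.lower())

# Counter builds the word -> count mapping in one step.
frequency = Counter(words)

# Keep only the words that occur more than once.
repeated = {word: count for word, count in frequency.items() if count > 1}

print(repeated)  # {'the': 4, 'cat': 2}
```

Note that "on" is dropped by the regular expression itself, since it is shorter than three letters.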
To export the print statement loop at the bottom to CSV, you can do the following:
import csv

frequency_list = ['word1', 'word2', 'word3']  # example
# newline='' keeps the csv module from writing blank lines between rows on Windows.
with open('output.csv', 'w', newline='') as csvfile:
    writer = csv.writer(csvfile, delimiter=",")
    writer.writerow(frequency_list)
This creates an "output.csv" file with the words from your frequency_list on a single row.
To put each word on its own row, try this:
with open('output.csv', 'w', newline='') as csvfile:
    writer = csv.writer(csvfile, delimiter=",")
    # Each list passed to writerows becomes one row in the file.
    writer.writerows([i.strip() for i in l.split(',')] for l in frequency_list)
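When the entries are plain words with no embedded commas, the split step is not needed, and each word can simply be wrapped in a one-element list. The sketch below assumes that case and writes to an in-memory buffer (io.StringIO) instead of a file so the result can be inspected directly; csv_text is a name introduced only for this example.

```python
import csv
import io

frequency_list = ['word1', 'word2', 'word3']  # example data, as in the answer

# Write to an in-memory buffer so the output can be checked without a file.
buffer = io.StringIO()
writer = csv.writer(buffer)

# writerows expects an iterable of rows, so wrap each word in a one-element list.
writer.writerows([word] for word in frequency_list)

csv_text = buffer.getvalue()
print(csv_text)
```

Passing the bare strings to writerows instead would treat every word as a sequence of characters and put each letter in its own column.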
Update
To get a CSV that includes the counts, keep your original dictionary and do the following:
frequency = {"one": 1, "two": 2, "three": 3}  # example
with open('output.csv', 'w', newline='') as csvfile:
    writer = csv.writer(csvfile)
    # Write one row per word: the word, then its count.
    for key, value in frequency.items():
        writer.writerow([key, value])
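Putting the pieces together, a minimal end-to-end sketch might look like the following. The function name write_repeated_words, its min_count parameter, the header row, and the sorting by count are all additions for illustration and are not part of the original answer; the sample text stands in for Words.txt, and the file is written to the system temporary directory.

```python
import csv
import os
import re
import tempfile
from collections import Counter


def write_repeated_words(text, csv_path, min_count=2):
    """Count words in text and write those seen at least min_count times to csv_path."""
    words = re.findall(r'\b[a-z]{3,15}\b', text.lower())
    frequency = Counter(words)
    # most_common() yields (word, count) pairs, highest count first.
    rows = [(word, count) for word, count in frequency.most_common()
            if count >= min_count]
    with open(csv_path, 'w', newline='') as csvfile:
        writer = csv.writer(csvfile)
        writer.writerow(['word', 'count'])  # header row (an addition)
        writer.writerows(rows)
    return rows


# Hypothetical sample text; in the question this would be read from Words.txt.
sample = "red green red blue green red"
out_path = os.path.join(tempfile.gettempdir(), 'output.csv')
rows = write_repeated_words(sample, out_path)
print(rows)  # [('red', 3), ('green', 2)]
```

Filtering before writing keeps the CSV free of single-occurrence words, which covers both parts of the question in one pass.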
About python - changing dictionary/keys in Python: a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/39692480/