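A hapax is a word that occurs only once in a string. My code only partly works. At first it found the first hapax; then I changed the input string, and now it finds the first and the last hapax but not the second one. This is my current code: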
def hapax(stringz):
    w = ''
    l = stringz.split()
    for x in l:
        w = ''
        l.remove(x)
        for y in l:
            w += y
        if w.find(x) == -1:
            print(x)

hapax('yo i went jogging then yo i went joggin tuesday wednesday')
I only get:
then
wednesday
Best answer
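Before the fix, it may help to see why the loop in the question misses words. Two things appear to go wrong: `l.remove(x)` shrinks the list while the `for` loop is still walking it, so some words (here `tuesday`) are never examined, and `w.find(x)` is a substring test on the concatenated words, so `joggin` is found inside `jogging`. The sketch below reproduces both effects; `visited_words` is a hypothetical helper name, not part of the question's code.

```python
def visited_words(text):
    """Return the words the question's outer loop actually examines."""
    words = text.split()
    seen = []
    for x in words:
        seen.append(x)
        # Removing from the list being iterated shifts later items left,
        # so the loop's next index jumps over a word.
        words.remove(x)
    return seen

print(visited_words('yo i went jogging then yo i went joggin tuesday wednesday'))

# Substring test: a result other than -1 means 'joggin' counts as "found".
print('jogging'.find('joggin'))
```

Counting whole words in a dictionary, as in the code further down, avoids both problems.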
The string module:
Use the string module to get the list of punctuation characters, then replace them with an ordinary for loop.
>>> import string
>>> string.punctuation
'!"#$%&\'()*+,-./:;<=>?@[\\]^_`{|}~'
>>>
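As a minimal sketch of the for-loop approach just described (the function name `strip_punctuation` is made up for illustration), each punctuation character is replaced with an empty string in turn:

```python
import string

def strip_punctuation(text):
    """Remove every character listed in string.punctuation from text."""
    for ch in string.punctuation:
        text = text.replace(ch, '')
    return text

print(strip_punctuation('some punctuation ? I and & '))
```

The spaces around a removed character stay in place, which is harmless here because the text is later split on whitespace.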
More Pythonic:
how to replace punctuation in a string python?
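One translation-table variant, sketched here for Python 3, where `str.maketrans` accepts a third argument listing the characters to delete. `PUNCT_TABLE` and `strip_punctuation_translate` are illustrative names, not taken from the linked question.

```python
import string

# Third argument to str.maketrans: characters mapped to None, i.e. deleted.
PUNCT_TABLE = str.maketrans('', '', string.punctuation)

def strip_punctuation_translate(text):
    """Delete all string.punctuation characters in a single pass."""
    return text.translate(PUNCT_TABLE)

print(strip_punctuation_translate("it's ok? yes & no!"))
```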
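Algorithm:
Remove punctuation from the input text using the string module.
Convert the text to lower case.
Split the input text and update a dictionary of word counts.
Iterate over the items in the dictionary and collect the hapax words.
Code: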
import string
import collections


def hapax(text):
    # Note: this is Python 2 code (print statements, string.maketrans).
    # Remove punctuation from the input text.
    text = text.translate(string.maketrans("", ""), string.punctuation)
    print "Debug 1- After remove Punctuation:", text

    # Ignore case: treat lower, upper and mixed case alike.
    text = text.lower()
    print "Debug 2- After converted to Lower case:", text

    # Create a default dictionary. Key is the word, value is its count.
    word_count = collections.defaultdict(int)
    print "Debug 3- Collection Default Dictionary:", word_count

    # Split the text and update the word counts.
    for word in text.split():
        if word:  # Guard against empty strings.
            word_count[word] += 1
    print "Debug 4- Word and its count:", word_count

    # Collect the words whose count is 1.
    hapax_words = list()
    for word, value in word_count.items():
        if value == 1:
            hapax_words.append(word)
    print "Debug 5- Final Hapax words:", hapax_words


hapax('yo i went jogging then yo i went jogging tuesday wednesday some punctuation ? I and & ')
Output:
$ python 2.py
Debug 1- After remove Punctuation: yo i went jogging then yo i went jogging tuesday wednesday some punctuation I and
Debug 2- After converted to Lower case: yo i went jogging then yo i went jogging tuesday wednesday some punctuation i and
Debug 3- Collection Default Dictionary: defaultdict(<type 'int'>, {})
Debug 4- Word and its count: defaultdict(<type 'int'>, {'and': 1, 'then': 1, 'yo': 2, 'i': 3, 'tuesday': 1, 'punctuation': 1, 'some': 1, 'wednesday': 1, 'jogging': 2, 'went': 2})
Debug 5- Final Hapax words: ['and', 'then', 'tuesday', 'punctuation', 'some', 'wednesday']
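The code above targets Python 2 (`print` statements, `string.maketrans`). For Python 3, the same four steps might be sketched as follows. Using `collections.Counter` and returning the list instead of printing it are choices made for this sketch, not part of the original answer.

```python
import string
from collections import Counter

def hapax(text):
    """Return the words that occur exactly once in text."""
    # Step 1: remove punctuation.
    text = text.translate(str.maketrans('', '', string.punctuation))
    # Step 2: ignore case.
    text = text.lower()
    # Step 3: split on whitespace and count each word.
    word_count = Counter(text.split())
    # Step 4: keep the words whose count is 1.
    return [word for word, count in word_count.items() if count == 1]

print(hapax('yo i went jogging then yo i went jogging tuesday wednesday some punctuation ? I and & '))
```

On Python 3.7 and later, where dictionaries preserve insertion order, the words come back in the order they first appear: ['then', 'tuesday', 'wednesday', 'some', 'punctuation', 'and'].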