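A hapax is a word that occurs only once in a string. My code only partly works. At first it found the first hapax; then I changed the input string, and it found the last hapax and the first one, but not the second. This is my current code: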

def hapax(stringz):
    w = ''
    l = stringz.split()
    for x in l:
        w = ''
        l.remove(x)
        for y in l:
            w += y
        if w.find(x) == -1:
            print(x)


hapax('yo i went jogging then yo i went joggin tuesday wednesday')


I only get:

then
wednesday

Best answer

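The string module:

Use the string module to get the list of punctuation characters, then remove each of them from the text with an ordinary for loop.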

>>> import string
>>> string.punctuation
'!"#$%&\'()*+,-./:;<=>?@[\\]^_`{|}~'
>>>
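
As a minimal sketch of the for-loop approach just described (the helper name `strip_punctuation` is made up for illustration and is not part of the original answer), each character in `string.punctuation` can be replaced with an empty string in turn:

```python
import string


def strip_punctuation(text):
    """Return text with every ASCII punctuation character removed."""
    # Hypothetical helper: walk the punctuation list and delete one mark at a time.
    for mark in string.punctuation:
        text = text.replace(mark, "")
    return text


print(strip_punctuation("yo, i went jogging!"))  # yo i went jogging
```

Note that removing a mark leaves the surrounding spaces in place, so "a & b" becomes "a  b" with two spaces. That is harmless here because `split()` with no arguments treats any run of whitespace as one separator.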


More Pythonic:
how to replace punctuation in a string python?
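
One commonly used approach to that question is `str.translate`. The sketch below assumes Python 3, where the third argument of `str.maketrans` lists the characters to delete; the names `_DELETE_PUNCTUATION` and `strip_punctuation` are illustrative only.

```python
import string

# Python 3 only: build the translation table once.
# The third argument names the characters that translate() should drop.
_DELETE_PUNCTUATION = str.maketrans("", "", string.punctuation)


def strip_punctuation(text):
    """Return text with every ASCII punctuation character removed."""
    return text.translate(_DELETE_PUNCTUATION)


print(strip_punctuation("some punctuation ? I and & "))
```

This makes a single pass over the string instead of one `replace` call per punctuation character.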



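Algorithm:


1. Remove punctuation from the input text using the string module.
2. Convert the text to lowercase.
3. Split the input text and update a dictionary of word counts.
4. Iterate over the items in the dictionary and collect the hapax words.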
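
The steps above can be sketched compactly in Python 3. This is a variant under stated assumptions, not the answer's own code: it swaps the `defaultdict(int)` used by the answer for `collections.Counter`, uses the Python 3 form of `str.translate`, and returns the result instead of printing it.

```python
import collections
import string


def hapax(text):
    """Return the words that occur exactly once, in order of first appearance."""
    # Step 1: remove punctuation (Python 3 form of str.translate).
    text = text.translate(str.maketrans("", "", string.punctuation))
    # Step 2: lowercase so that 'I' and 'i' count as the same word.
    text = text.lower()
    # Step 3: split on whitespace and count every word.
    word_count = collections.Counter(text.split())
    # Step 4: keep the words seen once. Counter is a dict subclass, so in
    # Python 3.7+ the items come back in first-appearance order.
    return [word for word, count in word_count.items() if count == 1]


print(hapax('yo i went jogging then yo i went jogging tuesday wednesday'))
# ['then', 'tuesday', 'wednesday']
```

Counting whole words in a dictionary avoids substring matches: a check such as `'yoijoggin'.find('jogging')` works on concatenated text, whereas a dictionary lookup only matches complete words.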


Code:

import string
import collections

def hapax(text):
    # Remove punctuation from the input text (Python 2 two-argument str.translate).
    text = text.translate(string.maketrans("",""), string.punctuation)
    print "Debug 1- After remove Punctuation:", text

    # Normalize case so that 'I' and 'i' count as the same word.
    text = text.lower()
    print "Debug 2- After converted to Lower case:", text

    # Create a default dictionary: the key is a word, the value is its count.
    word_count = collections.defaultdict(int)
    print "Debug 3- Collection Default Dictionary:", word_count

    # Split the text on whitespace and count each word.
    for word in text.split():
        if word:  # Always true: split() with no arguments never yields empty strings.
            word_count[word] += 1

    print "Debug 4- Word and its count:", word_count

    # Collect the words whose count is exactly 1.
    hapax_words = list()
    for word, value in word_count.items():
        if value==1:
            hapax_words.append(word)

    print "Debug 5- Final Hapax words:", hapax_words


hapax('yo i went jogging then yo i went jogging tuesday wednesday some punctuation ? I and & ')


Output:

$ python 2.py
Debug 1- After remove Punctuation: yo i went jogging then yo i went jogging tuesday wednesday some punctuation  I and
Debug 2- After converted to Lower case: yo i went jogging then yo i went jogging tuesday wednesday some punctuation  i and
Debug 3- Collection Default Dictionary: defaultdict(<type 'int'>, {})
Debug 4- Word and its count: defaultdict(<type 'int'>, {'and': 1, 'then': 1, 'yo': 2, 'i': 3, 'tuesday': 1, 'punctuation': 1, 'some': 1, 'wednesday': 1, 'jogging': 2, 'went': 2})
Debug 5- Final Hapax words: ['and', 'then', 'tuesday', 'punctuation', 'some', 'wednesday']

10-07 18:24