python - Creating a censor function from a list of bad words

I am trying to create a function that censors words in a string. It sort of works, but with a few quirks.

Here is my code:

def censor(sentence):
    badwords = 'apple orange banana'.split()
    sentence = sentence.split()

    for i in badwords:
        for words in sentence:
            if i in words:
                pos = sentence.index(words)
                sentence.remove(words)
                sentence.insert(pos, '*' * len(i))

    print " ".join(sentence)

sentence = "you are an appletini and apple. new sentence: an orange is a banana. orange test."

censor(sentence)

It outputs:

you are an ***** and ***** new sentence: an ****** is a ****** ****** test.

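Some of the punctuation disappears, and the word "appletini" is wrongly replaced.

How can this be fixed?

Also, is there a simpler way to do this kind of thing?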

Best answer

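The specific problems are:

You don't take punctuation into account at all; and
When inserting the '*'s, you use the length of the "bad word" rather than the length of the word it was found in.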
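
Both problems can be seen on a single token. The following is only a small illustration using the token "apple." from the question; the variable names are made up for this sketch:

```python
# One token as produced by sentence.split(); the period stays attached.
token = "apple."
badword = "apple"

# The question's code replaces the whole token (period included) with
# asterisks counted from the bad word, so one character goes missing.
replacement = "*" * len(badword)
print(replacement)                     # *****
print(len(token) - len(replacement))   # 1

# Masking only the letters keeps both the length and the punctuation.
masked = "".join("*" if c.isalpha() else c for c in token)
print(masked)                          # *****.
```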

I would switch the loop order around so that you only process the sentence once, and use enumerate rather than remove and insert:

def censor(sentence):
    badwords = ("test", "word") # consider making this an argument too
    sentence = sentence.split()

    for index, word in enumerate(sentence):
        if any(badword in word for badword in badwords):
            sentence[index] = "".join(['*' if c.isalpha() else c for c in word])

    return " ".join(sentence) # return rather than print

Testing with str.isalpha means that only upper- and lower-case letters are replaced with asterisks. Demo:

>>> censor("Censor these testing words, will you? Here's a test-case!")
"Censor these ******* *****, will you? Here's a ****-****!"
            # ^ note length                         ^ note punctuation
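
Note that the substring test (badword in word) still censors words that merely contain a bad word, such as "appletini" in the question or "testing" in the demo. If only whole words should be censored, one option is a regular expression with word boundaries. The following is a minimal sketch of that alternative and is not part of the answer above; the name censor_whole_words and its badwords parameter are assumptions made for illustration, and matching is case-sensitive:

```python
import re

def censor_whole_words(sentence, badwords):
    # Build one pattern matching any bad word as a whole word;
    # re.escape guards against regex metacharacters in the list.
    pattern = re.compile(
        r"\b(?:" + "|".join(re.escape(w) for w in badwords) + r")\b"
    )
    # Replace each match with asterisks of the same length.
    return pattern.sub(lambda m: "*" * len(m.group()), sentence)

print(censor_whole_words(
    "you are an appletini and apple. an orange is a banana.",
    ["apple", "orange", "banana"],
))
# you are an appletini and *****. an ****** is a ******.
```

Because the substitution works on the original string rather than on split tokens, whitespace and punctuation are left untouched.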