Problem description
I'm trying to remove specific characters from a string using Python. This is the code I'm using right now. Unfortunately it appears to do nothing to the string.
for char in line:
    if char in " ?.!/;:":
        line.replace(char,'')
How do I do this properly?
Recommended answer
Strings in Python are immutable (can't be changed). Because of this, line.replace(...) only creates a new string rather than changing the old one. You need to rebind (assign) the result to line so that the variable takes the new value, with those characters removed.
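As a minimal sketch of that rebinding (the sample text is made up for illustration, and the loop runs over the characters to delete rather than over the string itself), the fix could look like this:

```python
line = "Hello, world! How are you?"  # hypothetical sample input

# Loop over the characters we want to delete.
for char in " ?.!/;:":
    # str.replace returns a new string; rebind the name to keep the result.
    line = line.replace(char, "")

print(line)  # Hello,worldHowareyou
```

The comma survives because it is not in the set of characters being removed.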
Also, the way you are doing it is going to be relatively slow. It's also likely to be a bit confusing to experienced pythonators, who will see a doubly nested structure and think for a moment that something more complicated is going on.
In Python 2.6 and later Python 2.x versions*, you can instead use str.translate (see the Python 3 answer below):
line = line.translate(None, '!@#$')
Or use regular expression replacement with re.sub:
import re
line = re.sub('[!@#$]', '', line)
The characters enclosed in brackets constitute a character class. Any characters in line that are in that class are replaced with the second parameter to sub: an empty string.
In Python 3, strings are Unicode. You'll have to translate a little differently. kevpie mentions this in a comment on one of the answers, and it's noted in the documentation for str.translate.
When calling the translate method of a Unicode string, you cannot pass the second parameter that we used above. You also can't pass None as the first parameter. Instead, you pass a translation table (usually a dictionary) as the only parameter. This table maps the ordinal values of characters (i.e. the result of calling ord on them) to the ordinal values of the characters which should replace them, or, usefully to us, to None to indicate that they should be deleted.
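To make the shape of such a table concrete, here is a small sketch for Python 3. The mapping and the sample text are made up for illustration; one entry replaces a character and the other deletes one.

```python
# Keys are ordinals; each value is a replacement ordinal or None (delete).
table = {
    ord("a"): ord("A"),  # replace 'a' with 'A'
    ord("!"): None,      # delete '!'
}

text = "banana!"  # hypothetical sample input
result = text.translate(table)

print(result)  # bAnAnA
```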
So to do the above dance with a Unicode string, you would call something like:
translation_table = dict.fromkeys(map(ord, '!@#$'), None)
unicode_line = unicode_line.translate(translation_table)
Here dict.fromkeys and map are used to succinctly generate a dictionary containing:
{ord('!'): None, ord('@'): None, ...}
Even simpler, as another answer puts it, create the translation table in place:
unicode_line = unicode_line.translate({ord(c): None for c in '!@#$'})
Or, as brought up by Joseph Lee, create the same translation table with str.maketrans:
unicode_line = unicode_line.translate(str.maketrans('', '', '!@#$'))
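As a quick check, assuming Python 3, the three ways of building the table shown above produce the same dictionary. The variable names and the sample text below are made up for illustration.

```python
chars = "!@#$"

# Three equivalent ways to build a "delete these characters" table.
table_fromkeys = dict.fromkeys(map(ord, chars), None)
table_comprehension = {ord(c): None for c in chars}
table_maketrans = str.maketrans("", "", chars)

# All three map each ordinal to None.
assert table_fromkeys == table_comprehension == table_maketrans

unicode_line = "c@t! #dog$"  # hypothetical sample input
cleaned = unicode_line.translate(table_maketrans)

print(cleaned)  # ct dog
```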
* For compatibility with earlier Pythons, you can create a "null" translation table to pass in place of None:
import string
line = line.translate(string.maketrans('', ''), '!@#$')
Here string.maketrans is used to create a translation table, which is just a string containing the characters with ordinal values 0 to 255.