When I process HTML in Python, I have to use the following code because of the special characters.

line = string.replace(line, "&quot;", "\"")
line = string.replace(line, "&apos;", "'")
line = string.replace(line, "&amp;", "&")
line = string.replace(line, "&lt;", "<")
line = string.replace(line, "&gt;", ">")
line = string.replace(line, "&laquo;", "<<")
line = string.replace(line, "&raquo;", ">>")
line = string.replace(line, "&#039;", "'")
line = string.replace(line, "&#8220;", "\"")
line = string.replace(line, "&#8221;", "\"")
line = string.replace(line, "&#8216;", "\'")
line = string.replace(line, "&#8217;", "\'")
line = string.replace(line, "&#9632;", "")
line = string.replace(line, "&#8226;", "-")

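It looks like there are many more special characters like these that I will have to replace. Do you know how to make this code more elegant?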

Thanks

Best answer

REPLACEMENTS = [
    ("&quot;", "\""),
    ("&apos;", "'"),
    ...
    ]
for entity, replacement in REPLACEMENTS:
    line = line.replace(entity, replacement)

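Note that string.replace is also available as a method on str/unicode objects, which is why the loop calls line.replace(...) directly instead of going through the string module.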
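For reference, here is a minimal runnable sketch of that loop with the table filled in from the pairs listed in the question. The function name replace_entities is only illustrative. One assumption worth flagging: I moved "&amp;" to the end of the list, because replacing it first turns doubly escaped text such as "&amp;lt;" into "&lt;" and then into "<".

```python
# Entity/replacement pairs taken from the question.
# "&amp;" is deliberately last so that "&amp;lt;" decodes to "&lt;", not "<".
REPLACEMENTS = [
    ("&quot;", '"'),
    ("&apos;", "'"),
    ("&lt;", "<"),
    ("&gt;", ">"),
    ("&laquo;", "<<"),
    ("&raquo;", ">>"),
    ("&#039;", "'"),
    ("&#8220;", '"'),
    ("&#8221;", '"'),
    ("&#8216;", "'"),
    ("&#8217;", "'"),
    ("&#9632;", ""),
    ("&#8226;", "-"),
    ("&amp;", "&"),
]


def replace_entities(line):
    """Apply every (entity, replacement) pair in order using str.replace."""
    for entity, replacement in REPLACEMENTS:
        line = line.replace(entity, replacement)
    return line


print(replace_entities("&quot;Tom &amp; Jerry&quot; &laquo;1 &lt; 2&raquo;"))
# prints: "Tom & Jerry" <<1 < 2>>
```

Adding another entity is then a one-line change to the table rather than another statement.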

Better still, take a look at this question!
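The link itself is not reproduced here, so the following is an assumption about what it points to: decoding HTML entities with the standard library instead of a hand-written table. On Python 3.4 or later, html.unescape does this. It returns the real Unicode characters (for example U+2019 for "&#8217;"), so the sketch adds an optional, hypothetical ASCII_EQUIVALENTS table for anyone who wants the plain-ASCII output the question was producing.

```python
import html

# Hypothetical mapping from decoded Unicode characters to the ASCII
# stand-ins used in the question; adjust or drop it as needed.
ASCII_EQUIVALENTS = str.maketrans({
    "\u00ab": "<<",  # left guillemet (&laquo;)
    "\u00bb": ">>",  # right guillemet (&raquo;)
    "\u201c": '"',   # left double quote (&#8220;)
    "\u201d": '"',   # right double quote (&#8221;)
    "\u2018": "'",   # left single quote (&#8216;)
    "\u2019": "'",   # right single quote (&#8217;)
    "\u25a0": "",    # black square (&#9632;)
    "\u2022": "-",   # bullet (&#8226;)
})


def decode_entities(line, ascii_only=False):
    """Decode HTML entities; optionally map typographic characters to ASCII."""
    text = html.unescape(line)
    return text.translate(ASCII_EQUIVALENTS) if ascii_only else text


print(decode_entities("&lt;a&gt; &amp; &quot;b&quot;"))
# prints: <a> & "b"
print(decode_entities("&laquo;x&raquo; &#8226;", ascii_only=True))
# prints: <<x>> -
```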

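The title of your question, though, asks for something different: optimization, i.e. making the code run faster. That is a completely different problem and takes more work.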
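If speed does matter, one common approach (not something the question or the answer above prescribes) is to scan the string once with a compiled regular expression instead of once per entity. A rough sketch follows, with ENTITY_MAP and replace_entities_single_pass as illustrative names and only a handful of entities included. Whether it is actually faster depends on the input and should be measured, for example with timeit.

```python
import re

# Small illustrative subset; extend with whatever entities are needed.
ENTITY_MAP = {
    "&quot;": '"',
    "&apos;": "'",
    "&amp;": "&",
    "&lt;": "<",
    "&gt;": ">",
}

# A single alternation that matches any key of ENTITY_MAP.
# re.escape keeps the keys literal even if they contain regex metacharacters.
ENTITY_PATTERN = re.compile("|".join(re.escape(entity) for entity in ENTITY_MAP))


def replace_entities_single_pass(line):
    """Replace every known entity in one left-to-right scan of the string."""
    return ENTITY_PATTERN.sub(lambda match: ENTITY_MAP[match.group(0)], line)


print(replace_entities_single_pass("a &lt; b &amp;&amp; c &gt; d"))
# prints: a < b && c > d
```

Because replaced text is never rescanned, the order of the entries no longer matters: "&amp;lt;" comes out as "&lt;" regardless of where "&amp;" sits in the mapping.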

On the topic of "python - making a sequence of string.replace statements more readable", a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/6889172/
