Problem description
I have a string of HTML stored in a database. Unfortunately, it contains characters such as ®. I want to replace these characters with their HTML equivalents, either in the DB itself or with a find-and-replace in my Python / Django code.
Any suggestions on how I can do this?
You can use the fact that the ASCII characters are the first 128 code points: get the numeric value of each character with ord and strip the character if it is out of range.
# -*- coding: utf-8 -*-
def strip_non_ascii(string):
    '''Returns the string without non-ASCII characters'''
    # ASCII covers code points 0 through 127
    stripped = (c for c in string if ord(c) < 128)
    return ''.join(stripped)

test = u'éáé123456tgreáé@€'
print(test)
print(strip_non_ascii(test))
Result
éáé123456tgreáé@€
123456tgre@
Please note that @ is included because, after all, it is an ASCII character. If you want to keep only a particular subset (such as digits and uppercase and lowercase letters), you can narrow the range by consulting an ASCII table.
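As a minimal sketch of keeping only a subset, the check can test membership in an explicit set of allowed characters instead of comparing code points. The names ALLOWED and keep_only_alphanumeric are hypothetical and not part of the original answer; string.ascii_letters and string.digits are standard-library constants.

```python
import string

# Hypothetical helper (not from the original answer):
# keep only ASCII letters and digits.
ALLOWED = frozenset(string.ascii_letters + string.digits)

def keep_only_alphanumeric(text):
    """Return text with every character outside ALLOWED removed."""
    return ''.join(c for c in text if c in ALLOWED)

print(keep_only_alphanumeric(u'éáé123456tgreáé@€'))  # prints 123456tgre
```

Here @ is dropped as well, since it is neither a letter nor a digit.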
EDITED: After reading your question again, maybe you need to escape your HTML code so that all those characters appear correctly once rendered. You can use the escape filter in your templates.
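Note that Django's escape filter only escapes the HTML special characters (<, >, &, and the two quote characters), so a character like ® passes through unchanged. If the goal is to replace each non-ASCII character with an HTML equivalent, one option is Python's built-in xmlcharrefreplace error handler, which substitutes numeric character references. The following is a minimal sketch, assuming Python 3 and assuming that numeric references such as &#174; are acceptable in place of named entities such as &reg;. The function name to_html_entities is hypothetical.

```python
def to_html_entities(text):
    """Replace every non-ASCII character with a numeric character reference."""
    # Encoding to ASCII with 'xmlcharrefreplace' turns e.g. '®' into '&#174;';
    # decoding converts the resulting bytes back to a str.
    return text.encode('ascii', 'xmlcharrefreplace').decode('ascii')

print(to_html_entities(u'<p>Acme® costs 5€</p>'))  # prints <p>Acme&#174; costs 5&#8364;</p>
```

The existing markup is left untouched because <, >, and & are already ASCII. The same function could be applied once to the stored rows or at render time in the view.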