Unicode literals that work in both Python 3 and Python 2

Question

So I have a Python script that I'd prefer worked on both Python 3.2 and 2.7, just for convenience.

Is there a way to have unicode literals that work in both? E.g.

#coding: utf-8
whatever = 'שלום'

The code above needs a unicode string, which in Python 2.x means the u'' prefix, but in Python 3.x (3.2 in my case) that little u causes a syntax error.

Recommended answer

Edit: Since Python 3.3, the u'' literal works again, so the u() function isn't needed.
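To illustrate the edit note, here is a minimal sketch of the question's snippet written with the u'' prefix. It assumes the interpreter is Python 2.7 or Python 3.3 and later; on Python 3.0 through 3.2 the prefix is still a syntax error. The length check is only there for illustration.

```python
# coding: utf-8
# Assumes Python 2.7 or Python 3.3+, where the u'' prefix is accepted.
whatever = u'שלום'

# The literal is a text (unicode) string on both versions,
# so its length is counted in characters, not bytes.
print(len(whatever))
```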

The best option is to make a function that creates unicode objects from string objects in Python 2, but leaves the string objects alone in Python 3 (as they are already unicode).

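import sys

# Compare the numeric major version; comparing the sys.version string would misorder a two-digit major version.
if sys.version_info[0] < 3:
    import codecs

    def u(x):
        # Python 2: decode the escape sequences in a byte string into a unicode object.
        return codecs.unicode_escape_decode(x)[0]
else:
    def u(x):
        # Python 3: str is already unicode, so return it unchanged.
        return x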

You would then use it like so:

>>> print(u('\u00dcnic\u00f6de'))
Ünicöde
>>> print(u('\xdcnic\N{Latin Small Letter O with diaeresis}de'))
Ünicöde
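Applied to the Hebrew string from the question, the same helper might be used as in the sketch below. One caveat, as far as I can tell: on Python 2 the unicode_escape codec treats raw non-ASCII bytes as Latin-1, so the text should be passed as \u escapes rather than typed directly into the literal. The names `text_type` and `greeting` are illustrative and not part of the original answer.

```python
import sys

# Same helper as in the answer, repeated so this sketch is self-contained.
if sys.version_info[0] < 3:
    import codecs

    def u(x):
        return codecs.unicode_escape_decode(x)[0]
else:
    def u(x):
        return x

# Hypothetical helper name: the text type on the running interpreter
# (unicode on Python 2, str on Python 3).
text_type = type(u(''))

# The question's string written as escapes, so the source stays pure ASCII.
greeting = u('\u05e9\u05dc\u05d5\u05dd')

print(isinstance(greeting, text_type))
print(len(greeting))
```

Keeping the source ASCII-only also means the snippet does not depend on the file's declared encoding.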

