python - Printing Arabic/Persian letters in Python 2.7

This question already has an answer here:
Why do I get the u"xyz" format when I print a list of unicode strings in Python?
In the code below, Python does not seem to handle Arabic letters. Any ideas?

#!/usr/bin/python
# -*- coding: utf-8 -*-

import nltk
sentence = "ورود ممنوع"

tokens = nltk.word_tokenize(sentence)

print tokens

The result is:

>>>
['\xd9\x88\xd8\xb1\xd9\x88\xd8\xaf', '\xd9\x85\xd9\x85\xd9\x86\xd9\x88\xd8\xb9']
>>>

I also tried adding a u prefix to the string, but it did not help:

>>> u"ورود ممنوع"
>>>
['\xd9\x88\xd8\xb1\xd9\x88\xd8\xaf', '\xd9\x85\xd9\x85\xd9\x86\xd9\x88\xd8\xb9']

Best answer

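The result is correct: the list contains UTF-8 encoded byte strings. Printing a list shows the repr() of each element, which escapes non-ASCII bytes, while printing each element on its own shows the text: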

>>> lst = ['\xd9\x88\xd8\xb1\xd9\x88\xd8\xaf',
           '\xd9\x85\xd9\x85\xd9\x86\xd9\x88\xd8\xb9']
>>> for l in lst:
...  print l
...
ورود
ممنوع
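
To make the difference concrete, here is a minimal sketch comparing the escaped form that a list display uses with the decoded text. The variable names are illustrative only, and the b'' prefix is used so the same lines run on both Python 2.7 and 3.x:

```python
# UTF-8 bytes of the first token from the question.
token = b'\xd9\x88\xd8\xb1\xd9\x88\xd8\xaf'

# A list display is built from repr() of each element,
# so every non-ASCII byte shows up as a \xNN escape.
escaped = repr(token)

# Decoding the bytes gives the actual text: 4 characters stored in 8 bytes.
text = token.decode('utf-8')

if __name__ == "__main__":
    print(escaped)
    print(len(token), len(text))
```

The data itself is intact; only the way a list displays its elements makes it look broken.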

To convert them to Unicode, you can use a list comprehension:

>>> lst = [e.decode('utf-8') for e in lst]
>>> lst
[u'\u0648\u0631\u0648\u062f', u'\u0645\u0645\u0646\u0648\u0639']
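
The list display above still shows escapes, because it again uses repr() for each element. If the goal is to see all tokens as readable text on one line, one option is to join the decoded strings before printing. This is only a sketch: `readable` is a hypothetical helper name, not part of nltk, and it assumes the tokens are UTF-8 encoded and that the terminal can display Arabic script.

```python
# UTF-8 byte strings, as returned by the tokenizer in the question.
raw_tokens = [b'\xd9\x88\xd8\xb1\xd9\x88\xd8\xaf',
              b'\xd9\x85\xd9\x85\xd9\x86\xd9\x88\xd8\xb9']

def readable(tokens, encoding='utf-8'):
    """Hypothetical helper: decode byte strings and join them into one text string."""
    decoded = [t.decode(encoding) if isinstance(t, bytes) else t
               for t in tokens]
    return u'[' + u', '.join(decoded) + u']'

if __name__ == "__main__":
    # Printing a single text string shows the characters, not their escapes.
    print(readable(raw_tokens))
```

Elements that are already Unicode strings pass through unchanged, so the helper can be applied to either form of the list.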