Question
I am pulling data from the web and want to align it in a table in a terminal window. I can align the text fine in most cases but when the text contains certain symbols or foreign characters things get messy. How can I handle these characters? Here is an example with the problem on the third line of output:
>>> items = "Apple tree", "Banana plant", "Orange 으르", "Goodbye"
>>> values = 100, 200, 300, 400
>>> for i, v in zip(items, values):
...     print "%-15s : %-4s" % (i, v)
...
Apple tree      : 100
Banana plant    : 200
Orange 으르   : 300
Goodbye         : 400
>>>
Note: I quoted all the items correctly. The closing quote on the "Orange 으르" item doesn't show correctly here on Stack Overflow, but it displays fine in the terminal window.
UPDATE: I have added a bounty to this question. I am looking for a solution that can be implemented without much additional code and without external libraries. It should also work with Python 2.7+ and 3.x (conditionals that test for the version and apply different fixes would be fine). It should not require any additional system configuration, font changes, or changes to the terminal settings of a standard Debian/Ubuntu installation.
Recommended answer
The special behaviour of those particular characters can be identified using the East Asian width property from their Unicode data. Taking the suggestion from the question "Programmatically tell if a Unicode character takes up more than one character space in a terminal" and using that value for alignment:
#!/usr/bin/python3
import unicodedata
items = "Apple tree", "Banana plant", "Orange 으르", "Goodbye"
values = 100, 200, 300, 400
for i, v in zip(items, values):
    # Count each wide ('W') character twice, since it fills two terminal columns.
    eawid = len(i) + sum(1 for c in i if unicodedata.east_asian_width(c) == 'W')
    pad = ' ' * (15 - eawid)
    print("%s%s : %-4s" % (i, pad, v))
Gives:
Apple tree      : 100
Banana plant    : 200
Orange 으르     : 300
Goodbye         : 400
This may appear misaligned if your browser is using a 1.5-width glyph for those characters; in my terminal, "plan" is exactly the same width as "으르".
Syntax here is Python 3, but the same technique works in 2.7.
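As a rough sketch of how the same idea could be packaged for reuse on both 2.7 and 3.x, the padding logic can be moved into small helpers. The names display_width and ljust_display are hypothetical, not part of any library. This version also counts fullwidth ('F') characters as two columns, which is an addition to the code above. It assumes the strings are Unicode text (hence the unicode_literals import on 2.7) and a UTF-8 terminal, and it makes no attempt to handle zero-width combining marks.

```python
# -*- coding: utf-8 -*-
from __future__ import print_function, unicode_literals

import unicodedata


def display_width(text):
    """Approximate number of terminal columns occupied by text.

    Wide ('W') and fullwidth ('F') characters count as two columns,
    everything else as one. Combining marks are not special-cased.
    """
    return sum(2 if unicodedata.east_asian_width(ch) in ("W", "F") else 1
               for ch in text)


def ljust_display(text, width, fill=" "):
    """Left-justify text to `width` terminal columns (hypothetical helper)."""
    return text + fill * max(0, width - display_width(text))


items = "Apple tree", "Banana plant", "Orange 으르", "Goodbye"
values = 100, 200, 300, 400

# Build each row with the label padded by display width, not by len().
rows = ["%s : %-4s" % (ljust_display(item, 15), value)
        for item, value in zip(items, values)]
for row in rows:
    print(row)
```

Padding by hand instead of relying on "%-15s" is the key design choice: the format specifier pads by character count (or by byte count for Python 2 byte strings), which is exactly what goes wrong for double-width characters.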