Problem description
I've seen weirdly formatted text called Zalgo like below written on various forums. It's kind of annoying to look at, but it really bothers me because it undermines my notion of what a character is supposed to be. My understanding is that a character is supposed to move horizontally across a line and stay within a certain "container". Obviously the Zalgo text is moving vertically and doesn't seem to be restricted to any space.
Is this a bug/flaw/exploit/hack in Unicode? Are these individual characters with weird properties? "What" is happening here?
The text uses combining characters, also known as combining marks. See section 2.11, Combining Characters, in the Unicode Standard (PDF).
In Unicode, character rendering does not use a simple character cell model where each glyph fits into a box of a given height. Combining marks may be rendered above, below, or inside a base character.
So you can easily construct a character sequence of any length, consisting of a base character followed by "combining above" marks, to reach any desired visual height, assuming that the rendering software conforms to the Unicode rendering model. Such a sequence has no meaning, of course, and even a monkey could produce it (e.g., given a keyboard with a suitable driver).
And you can mix "combining above" and "combining below" marks.
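To make the stacking concrete, here is a minimal Python sketch. The helper name `stack_marks` and the choice of U+0307 and U+0323 as the marks are my own assumptions for illustration, not something taken from the question's sample. It appends a number of "above" and "below" marks to one base character and prints each code point's canonical combining class using the standard `unicodedata` module.

```python
import unicodedata

# Marks chosen for illustration only (an assumption, not from the question).
ABOVE = "\u0307"  # COMBINING DOT ABOVE
BELOW = "\u0323"  # COMBINING DOT BELOW


def stack_marks(base: str, n_above: int, n_below: int) -> str:
    """Return base followed by n_above 'above' marks and n_below 'below' marks."""
    return base + ABOVE * n_above + BELOW * n_below


stacked = stack_marks("H", 5, 3)

# One base character plus eight combining marks: nine code points in total.
print(len(stacked))  # 9

# The canonical combining class is 0 for the base character and non-zero
# for the combining marks.
for ch in stacked:
    print(f"U+{ord(ch):04X}", unicodedata.combining(ch))
```

How tall the result looks depends entirely on the renderer. Some fonts and text engines stack every mark, while others clip or overlap them.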
The sample text in the question starts with:
- LATIN CAPITAL LETTER H - H
- COMBINING LATIN SMALL LETTER T - ͭ
- COMBINING GREEK KORONIS - ̓
- COMBINING COMMA ABOVE - ̓
- COMBINING DOT ABOVE - ̇
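If you want to take a piece of Zalgo text apart yourself, a short Python sketch along these lines should work. The `describe` helper and the `sample` string are hypothetical names of mine. The string is rebuilt from the character names listed above, so it is an assumption about the exact code points in the question's sample.

```python
import unicodedata

# Rebuilt from the names listed above (assumed code points):
# H, COMBINING LATIN SMALL LETTER T, COMBINING GREEK KORONIS,
# COMBINING COMMA ABOVE, COMBINING DOT ABOVE
sample = "H\u036d\u0343\u0313\u0307"


def describe(text: str) -> list[tuple[str, str, int]]:
    """Return (code point, Unicode name, combining class) for each character."""
    return [
        (f"U+{ord(ch):04X}", unicodedata.name(ch, "<unnamed>"), unicodedata.combining(ch))
        for ch in text
    ]


for code_point, name, combining_class in describe(sample):
    print(code_point, name, combining_class)
```

Note that the koronis and the comma above usually look identical on screen. As far as I know, U+0343 canonically decomposes to U+0313, so normalizing the text (for example with `unicodedata.normalize("NFC", ...)`) replaces the former with the latter.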