Question
I learnt that in some immutable classes, __new__ may return an existing instance - this is what the int, str and tuple types sometimes do for small values.
But why do the following two snippets behave differently?
With a trailing space:
>>> a = 'string '
>>> b = 'string '
>>> a is b
False
Without a space:
>>> c = 'string'
>>> d = 'string'
>>> c is d
True
Why does the space make a difference?
Recommended answer
This is a quirk of how the CPython implementation chooses to cache string literals. String literals with the same contents may refer to the same string object, but they don't have to. 'string' happens to be automatically interned while 'string ' isn't, because 'string' contains only characters allowed in a Python identifier. I have no idea why that's the criterion they chose, but it is. The behavior may be different in different Python versions or implementations.
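To see interning separately from whatever the compiler does with literals, here is a minimal sketch. It assumes Python 3, where the function is sys.intern (in Python 2.7 it is the builtin intern). The strings are built at runtime so that they are not compile-time constants. The variable names are illustrative only.

```python
import sys

# Build the strings at runtime so they are not literals in the code object.
a = "".join(["string", " "])
b = "".join(["string", " "])

print(a == b)  # the contents are equal
print(a is b)  # identity is an implementation detail; do not rely on it

# sys.intern returns the canonical interned copy for a given value,
# so two equal strings map to the same object.
ia = sys.intern(a)
ib = sys.intern(b)
print(ia is ib)  # True: both refer to the single interned object
```

Explicit interning is the only way to get a guarantee here. Whether two equal literals share an object is left to the implementation.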
From the CPython 2.7 source code, stringobject.h, line 28:
You can see this in:
/* Intern selected string constants */
for (i = PyTuple_Size(consts); --i >= 0; ) {
    PyObject *v = PyTuple_GetItem(consts, i);
    if (!PyString_Check(v))
        continue;
    if (!all_name_chars((unsigned char *)PyString_AS_STRING(v)))
        continue;
    PyString_InternInPlace(&PyTuple_GET_ITEM(consts, i));
}
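The filter in that loop can be approximated in Python. This is only a sketch of the idea, not the C helper itself. It assumes that all_name_chars accepts exactly ASCII letters, digits and the underscore, and the NAME_CHARS set below is written from that assumption.

```python
import string

# Assumed character set: ASCII letters, digits and underscore.
NAME_CHARS = set(string.ascii_letters + string.digits + "_")

def all_name_chars(s):
    """Return True if every character of s could appear in an identifier."""
    return all(ch in NAME_CHARS for ch in s)

print(all_name_chars("string"))   # True: candidate for automatic interning
print(all_name_chars("string "))  # False: the space disqualifies it
```

Under this rule the trailing space alone is enough to exclude 'string ' from automatic interning.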
Also, note that interning is a separate process from the merging of string literals by the Python bytecode compiler. If you let the compiler compile the a and b assignments together, e.g. by placing them in a module or an if True:, you would find that a and b would be the same string.
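The merging of literals can be observed by compiling both assignments as a single unit. The sketch below uses the builtin compile and exec; the "<example>" filename and the namespace dictionary are placeholders. The result describes CPython and may differ in other implementations.

```python
# Compile both assignments together, as they would be inside one module.
source = "a = 'string '\nb = 'string '\n"
code = compile(source, "<example>", "exec")

namespace = {}
exec(code, namespace)

# Within one code object, CPython stores equal constants once,
# so both names end up bound to the same string object.
same_object = namespace["a"] is namespace["b"]
print(code.co_consts)
print(same_object)
```

At the interactive prompt each line is compiled on its own, so the two 'string ' literals land in different code objects and are not merged.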