本文介绍了Python 如何处理大于 64 位无符号整数限制的数字?的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

从这个问题 (一个 64 位有符号整数可以有多大?),我了解到在 64 位机器上可以使用的最大数字是 2^64-1,即 92,233,720,368,547,758,070.这意味着,即使我向其中添加 1,它也应该返回 inf.但它没有显示 inf.这是我观察到的:

From this question (How big can a 64bit signed integer be?), I learned that the biggest possible number to work with on a 64-bit machine is 2^64-1, which is 92,233,720,368,547,758,070. That means, even if I add 1 to it, it should return inf. But it's not showing inf. Here's what I'm observing:

>>> max = sys.maxsize
>>> format(max, ',')
'9,223,372,036,854,775,807'
>>> a = max * 10
>>> format(a, ',')
'92,233,720,368,547,758,070'
>>> a / max
10.0

即使由于某些原因92,233,720,368,547,758,070 不是 Python 的最大数字,那么 sys.maxsize 有什么用?

Even if for some reason 92,233,720,368,547,758,070 is not the biggest number for Python, then what is the use of sys.maxsize?

其次,64 位数字不应该占用 64 位内存空间吗?为什么 maxa 都占用 36 字节?

Secondly, shouldn't a 64-bit number take 64-bit memory space? Why both max and a are taking 36 bytes?

>>> sys.getsizeof(max)
36
>>> sys.getsizeof(a)
36

谁能描述一下这两种混淆?

Can anyone please describe both of the confusion?

推荐答案

Integers as Digit-Arrays

Python 3 (CPython) 整数不是本地机器整数.从逻辑上讲,每个整数都由它的 符号 和一个以 1073741824(30 位)或 32768(15 位)[*] 为基数的绝对数组成——后者是一个可变大小的无符号整数数组.为了存储更大的数字,需要一个额外的数字".被添加到数组中.

Integers as Digit-Arrays

Python 3 (CPython) integers are not native machine integers. Logically, each integer consists of its sign and an absolute number in base 1073741824 (30-bit) or 32768 (15-bit) [*] - the latter being a variable-size array of unsigned integers. To store larger numbers, an additional "digit" is added to the array.

>>> sys.getsizeof(0)          # largest  0-digit number
24
>>> sys.getsizeof(1)          # smallest 1-digit number
28
>>> sys.getsizeof(2**30 - 1)  # largest  1-digit number
28
>>> sys.getsizeof(2**30)      # smallest 2-digit number
32
>>> sys.getsizeof(2**60 - 1)  # largest  2-digit number
32

粗略地说,这与写十进制数时添加数字的机制相同——使用 1 位到 9,2 位到 99,依此类推.同样,只要计算机具有添加数字"的内存,可以定义更大的 Python 整数.

Loosely speaking, this is the same mechanism as one adds digits when writing out decimal numbers – using 1-digit is enough up to 9, 2-digit up to 99, and so on. Likewise, as long as the computer has memory to "add a digit" a Python integer of larger size can be defined.

[*] 数字是30/15 位而不是 32 位/16 位,因为这更适合某些算法. 例如,long_pow() 需要一个可被 5 整除的大小.

[*] The digits are 30-bit/15-bit instead of 32-bit/16-bit because this better fits some algorithms. For example, long_pow() requires a size divisible by 5.

实际上,整数也是对象——这意味着它们保存了元数据,例如类型和引用计数——这些元数据也会占用空间.在 CPython 中,int 包含 的:

Practically, integers are also objects – meaning they hold metadata such as type and reference count - which also takes up space. In CPython, an int consists of:

  • Py_ssize_t 的引用计数器
  • 指向 PyTypeObject*
  • 类型的指针
  • Py_ssize_t
  • digit[]
  • 的可变数字数组
  • reference counter of Py_ssize_t
  • pointer to the type of PyTypeObject*
  • digit count of Py_ssize_t
  • variable array of digits of digit[]

其中前三个是每个的结构可变大小的对象.符号在数字计数内编码.

where the first three are the structure of every variable size object. The sign is encoded inside the digit count.

在 64 位机器上,Py_ssize_tPyTypeObject* 都是 8 字节大小——给出0 位整数"0 大小为 3*8 字节或 24 字节.

On a 64-bit machine, both Py_ssize_t and PyTypeObject* are 8 byte in size – giving the "0-digit integer" 0 a size of 3*8 bytes or 24 bytes.

>>> sys.getsizeof(0)          # largest  0-digit number
24

那么什么是sys.maxsize?

sys.maxsize的意思不是最大整数大小,而是最大容器大小:

So what is sys.maxsize?

The meaning of sys.maxsize is not the maximum integer size, but the maximum container size:

>>> len(range(sys.maxsize))    # this is fine
9223372036854775807
>>> len(range(sys.maxsize+1))  # this is one too much
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
OverflowError: range() result has too many items

这是sys.maxsize 表示 Py_ssize_t 的最大值,CPython 运行时用来表示和寻址内存的类型.虽然这看起来像是一个任意限制,但它实际上远远超过计算机可以解决的问题.

这篇关于Python 如何处理大于 64 位无符号整数限制的数字?的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持!

08-24 08:27