I have a table with 7 columns, and 5 of them will be null. I will have nullable columns of the int, text, date, boolean, and money data types. This table will contain millions of rows with many, many nulls. I am afraid that the null values will occupy space.
Also, do you know if Postgres indexes null values? I would like to prevent it from indexing nulls.
Basically, NULL values occupy 1 bit in the NULL bitmap. But it is not as simple as that.
The null bitmap (per row) is only there if at least one column in that row holds a NULL value. This can lead to a paradoxical effect in tables with 9 or more columns: assigning the first NULL value to a column can take up more space on disk than writing a value to it. Conversely, when the last NULL column becomes non-null, the null bitmap is dropped for the row.
Physically, the initial null bitmap occupies 1 byte between the HeapTupleHeader (23 bytes) and actual column data or the row OID (if you should be using that), which always start at a multiple of MAXALIGN (typically 8 bytes on modern-day servers). That leaves 1 byte of padding, which is utilized by the initial null bitmap.
In effect, NULL storage is absolutely free for tables of 8 columns or fewer.
After that, another 4 / 8 bytes (typically 8) are allocated for the next 32 / 64 columns. Etc.
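The arithmetic above can be sketched in a few lines of Python. This is only a rough model, not PostgreSQL code: the function name `tuple_header_size` and its parameters are made up for illustration, the row OID is ignored, and the 23-byte header and MAXALIGN of 8 are taken from the figures quoted above.

```python
HEAP_TUPLE_HEADER_BYTES = 23  # fixed header size quoted in the answer


def tuple_header_size(n_columns: int, has_null: bool, maxalign: int = 8) -> int:
    """Estimate the bytes used before column data starts (hypothetical helper).

    Assumes no row OID; the null bitmap exists only if the row holds a NULL.
    """
    size = HEAP_TUPLE_HEADER_BYTES
    if has_null:
        # One bit per column, rounded up to whole bytes.
        size += (n_columns + 7) // 8
    # Column data starts at the next multiple of MAXALIGN.
    return (size + maxalign - 1) // maxalign * maxalign


print(tuple_header_size(7, has_null=False))  # 24
print(tuple_header_size(7, has_null=True))   # 24: the bitmap fits in the padding byte
print(tuple_header_size(9, has_null=True))   # 32: the first NULL costs 8 extra bytes
```

Under these assumptions, a 7-column table like the one in the question pays nothing extra for its NULLs, while a 9-column row grows by one MAXALIGN step as soon as it holds a single NULL.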
More details in the manual and under these related questions:
- How much disk-space is needed to store a NULL value using postgresql DB?
- Does not using NULL in PostgreSQL still use a NULL bitmap in the header?
- How many records can I store in 5 MB of PostgreSQL on Heroku?
Once you understand padding of data types, you can further optimize storage. More in this related answer. But the cases are rare where you can save substantial amounts of space. Normally it's not worth the effort.
@Daniel already covers effects on index size.