QHash storing a large amount of data

Question

I have 10,000,000 entries of type struct {int, int, int, int}. When I store them using QHash or QMap, they occupy a large amount of memory. The raw data should take only about

10,000,000 * 4 * 4 (sizeof(int)) = 160,000,000 bytes, i.e. about 153 MB

but when I load my data, both QHash and QMap take about 1.2 GB. Why does this happen, and how can I optimize for both speed and memory (through another data structure, or some tricks applied to QMap and QHash)?

Recommended answer

You said in the comments that you are using another four ints as the key. These values also have to be stored, so you are actually storing 8 ints per entry, not 4. Apart from that, QHash has to store the hash value to look up values efficiently by key. The hash is an unsigned integer, so you have 9 values per entry, each 4 bytes long. That is 10,000,000 * 9 * 4 bytes, which comes to roughly 350 MB.
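The arithmetic above can be checked with a small sketch. The struct names Key and Value and the helper functions are made up for illustration, and the numbers assume a platform where int is 4 bytes.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical record types matching the description in the question.
struct Value { int a, b, c, d; };  // the four ints stored as the value
struct Key   { int a, b, c, d; };  // the four ints used as the key

constexpr std::uint64_t kEntries = 10000000;  // 10,000,000 entries

// Bytes needed for the values alone (the estimate made in the question).
constexpr std::uint64_t valueBytes() {
    return kEntries * sizeof(Value);
}

// Bytes needed for key + value + one cached unsigned hash per entry.
constexpr std::uint64_t payloadBytes() {
    return kEntries * (sizeof(Key) + sizeof(Value) + sizeof(unsigned));
}

// Convert a byte count to mebibytes (1 MiB = 1024 * 1024 bytes).
constexpr double toMiB(std::uint64_t bytes) {
    return static_cast<double>(bytes) / (1024.0 * 1024.0);
}
```

With a 4-byte int, toMiB(valueBytes()) is about 152.6 and toMiB(payloadBytes()) is about 343.3, which matches the "153 MB" and "~350 MB" figures. Neither number includes any per-entry bookkeeping of the container itself.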

Also, QHash and QMap may internally add padding between their elements, for example to satisfy data structure alignment requirements. Padding comes in whole bytes, so even one extra byte per element adds up: with 10 million elements we may get at least several dozen additional megabytes.

Besides, QHash and QMap are not just raw data. They both keep additional pointers into their internal data structures and so on, which is yet another reason why a single entry takes more space than you expected.
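To make the effect of pointers and padding concrete, here is a minimal sketch of a chained hash-table node. The Node layout is an assumption made for illustration, not Qt's actual definition, and the resulting sizes depend on the compiler and ABI.

```cpp
#include <cstddef>

struct Key   { int a, b, c, d; };
struct Value { int a, b, c, d; };

// Hypothetical node of a chained hash table (illustration only).
struct Node {
    Node*    next;   // link to the next node in the same bucket
    unsigned hash;   // cached hash of the key
    Key      key;    // four ints
    Value    value;  // four ints
};

// Bytes taken by the fields themselves, without any padding.
constexpr std::size_t fieldBytes() {
    return sizeof(Node*) + sizeof(unsigned) + sizeof(Key) + sizeof(Value);
}

// Bytes the compiler added to satisfy alignment requirements.
constexpr std::size_t paddingBytes() {
    return sizeof(Node) - fieldBytes();
}
```

On a typical 64-bit platform with a 4-byte int, I would expect sizeof(Node) to be 48, of which 4 bytes are padding, compared with the 16 bytes of value data the question counted. If each node is allocated separately, the allocator usually adds its own bookkeeping per allocation on top of that.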

Another possible source of the swollen data size is that, for efficiency reasons, these classes may store some additional precomputed values so that they do not have to be recalculated every time you call certain methods.

Last but not least, for efficiency reasons (avoiding unnecessary copying), QHash reserves more memory at any given moment than its current elements need. I would expect that the larger the container, the more memory it reserves just in case, because copying gets more expensive. You can check how many buckets the internal hash table currently has by calling the capacity() method. If you want to limit the amount of reserved memory, call the squeeze() method, which shrinks the internal hash table to fit the elements currently stored.
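The idea of reserved capacity can be sketched with the standard library, so that the example compiles without Qt. std::unordered_map is used here as a stand-in for QHash, and buildTable is a made-up helper. The roughly corresponding QHash calls are reserve(), capacity() and squeeze().

```cpp
#include <cstddef>
#include <unordered_map>

// Fill a table with n entries, optionally reserving room up front.
// Reserving avoids repeated rehashing while the table grows.
inline std::unordered_map<int, int> buildTable(std::size_t n, bool reserveFirst) {
    std::unordered_map<int, int> table;
    if (reserveFirst) {
        table.reserve(n);  // allocate enough buckets for n elements
    }
    for (std::size_t i = 0; i < n; ++i) {
        table.emplace(static_cast<int>(i), 0);
    }
    return table;
}
```

After buildTable(n, true), bucket_count() is at least n with the default maximum load factor, and in practice it is usually larger than n. That difference is the kind of reserved memory the paragraph above describes.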

