Problem description
I have a question about the following picture. I use data.persist(StorageLevel.MEMORY_AND_DISK_SER) to cache our original data, but what surprises me is that caching in memory takes the same time as caching on disk. Why? I would expect caching in memory to be faster than caching on disk. Can anyone help me with this problem?
Recommended answer
If I am not mistaken, this is because Spark is not writing directly to disk.
With the MEMORY_AND_DISK_SER persistence level, the part of the RDD that fits in memory is kept there (as with MEMORY_ONLY), and data is spilled to disk only when it is too big for memory.
So I presume there is no problem here. The times you see are normal: they stay the same until memory is full, and only after that will you start to see longer times as data is written to disk.
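To make the memory-first behaviour described above concrete, here is a toy model in plain Python. It is not Spark code and not how Spark implements its block store: the class `MemoryThenDiskStore`, the function `cache_blocks`, and the byte budget are all hypothetical names made up for this sketch, and it ignores details such as eviction of older blocks. It only shows that every block lands in memory until the budget is used up, and that disk writes start after that point.

```python
import os
import pickle
import tempfile


class MemoryThenDiskStore:
    """Toy memory-first store (hypothetical, not Spark's implementation)."""

    def __init__(self, memory_budget_bytes, spill_dir):
        self.memory_budget_bytes = memory_budget_bytes
        self.spill_dir = spill_dir
        self.memory_used = 0
        self.in_memory = {}  # block_id -> serialized bytes
        self.on_disk = {}    # block_id -> path of the spilled file

    def put(self, block_id, records):
        # Blocks are stored in serialized form, loosely mirroring "_SER".
        blob = pickle.dumps(records)
        if self.memory_used + len(blob) <= self.memory_budget_bytes:
            # The block still fits: keep it in memory, no disk I/O at all.
            self.in_memory[block_id] = blob
            self.memory_used += len(blob)
            return "memory"
        # The memory budget is exhausted: only now is anything written to disk.
        path = os.path.join(self.spill_dir, f"block_{block_id}.bin")
        with open(path, "wb") as f:
            f.write(blob)
        self.on_disk[block_id] = path
        return "disk"

    def get(self, block_id):
        # Read from memory when possible, otherwise from the spilled file.
        if block_id in self.in_memory:
            return pickle.loads(self.in_memory[block_id])
        with open(self.on_disk[block_id], "rb") as f:
            return pickle.loads(f.read())


def cache_blocks(num_blocks, records_per_block, memory_budget_bytes):
    """Cache equally sized blocks and report where each one was placed."""
    with tempfile.TemporaryDirectory() as spill_dir:
        store = MemoryThenDiskStore(memory_budget_bytes, spill_dir)
        placements = []
        for block_id in range(num_blocks):
            records = list(range(records_per_block))
            placements.append(store.put(block_id, records))
        # Sanity check: every block reads back intact wherever it was stored.
        for block_id in range(num_blocks):
            assert store.get(block_id) == list(range(records_per_block))
        return placements
```

In this model, calling `cache_blocks` with a budget larger than the total data size reports "memory" for every block, which corresponds to the situation in the question: nothing has been spilled yet, so no disk cost shows up in the timings.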