Question
My training set is really quite large. The entire thing takes up about 120GB of RAM and so I can't even generate the numpy.zeros() array to store the data.
From what I've seen, using a generator works well when the entire dataset is already loaded into an array but then is incrementally fed into the network and then deleted afterwards.
Is it alright for the generator to create the arrays, insert the data, load the data into the network, and then delete the data? Or will that whole process take too long, and should I be doing something else?
Thanks
Answer
You do not need to load the whole dataset at once; you can load just as much as each batch needs. Check out this answer.
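As a minimal sketch of the idea (assuming your samples are stored as individual .npy files on disk; the file layout, batch size, and training call are placeholders, not your actual setup), a Python generator can load one batch at a time so only that batch ever sits in RAM:

```python
import numpy as np
from tensorflow import keras

# Hypothetical storage layout: one .npy file per sample, labels in one array.
# Adjust these paths to match how your data is actually stored.
sample_paths = [f"data/sample_{i}.npy" for i in range(100000)]  # placeholder paths
labels = np.load("data/labels.npy")                             # placeholder labels

def batch_generator(paths, labels, batch_size=32):
    """Yield (x, y) batches indefinitely, loading only one batch into memory at a time."""
    n = len(paths)
    while True:
        # Reshuffle each epoch so batches differ between epochs
        idx = np.random.permutation(n)
        for start in range(0, n, batch_size):
            batch_idx = idx[start:start + batch_size]
            x = np.stack([np.load(paths[i]) for i in batch_idx])
            y = labels[batch_idx]
            yield x, y

# Example usage (model definition omitted):
# model.fit(batch_generator(sample_paths, labels, batch_size=32),
#           steps_per_epoch=len(sample_paths) // 32,
#           epochs=10)
```

The key point is that the generator creates each batch array on demand and lets it be garbage-collected after the training step, so peak memory is governed by the batch size rather than the full 120GB dataset.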