LMDB files and how they are used for Caffe deep learning networks

Question

I am quite new to deep learning and I am having some problems using the Caffe deep learning framework. Basically, I didn't find any documentation explaining how to solve the series of questions and problems I am dealing with right now.

Please let me explain my situation first.

I have thousands of images and I must run a series of pre-processing operations on them. For each pre-processing operation, I have to save the pre-processed images as 4D matrices and also store a vector with the image labels. I will store this information as LMDB files that will be used as input for GoogLeNet in Caffe.

I tried to save my images as .HD5 files, but the final file size is 80GB, which is impossible to process with the memory I have.

So, the other option is using LMDB files, right? I am quite new to this file format and would appreciate your help in understanding how to create them in Matlab. Basically, my rookie questions are:

1- These LMDB files have the extension .MDB, right? Is this the same extension used by Microsoft Access? Or is the right format .lmdb, and are they different?

2- I found this solution for creating .mdb files (https://github.com/kyamagu/matlab-leveldb). Does it create the file format needed by Caffe?

3- For Caffe, do I have to create one .mdb file for labels and another for images, or can both be fields of the same .mdb file?

4- When I create an .mdb file I have to label the database fields. Can I label one field as image and the other as label? Does Caffe understand what each field means?

5- What do the functions database.put('key1', 'value1') and database.put('key2', 'value2') (in https://github.com/kyamagu/matlab-leveldb) do? Should I save my 4-D matrices in one field and the label vector in another?

Recommended answer

There is no connection between LMDB files and MS Access files.

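As far as I can see, you have two options:

1- Use the convert_imageset tool. It is located in Caffe under the tools folder and converts a list of image files and labels to LMDB.

2- Use an image data layer as the input to the network instead of a data layer. This type of layer takes as its source a file listing image file names and labels, so you don't have to build a database (another benefit for training: you can use the shuffle option and get slightly better training results).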

In order to use an image data layer, just change the layer type from Data to ImageData. The source is the path to a file in which each line contains the path of an image file and its label, separated by a space. For example:

/path/to/filename.png 23
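
A list file in this "path label" format can be generated with a short script. The following is only a minimal Python sketch, under the assumption (not stated in the question) that the images are stored as one subfolder per class; the function names build_image_list and write_image_list and the extension list are hypothetical and would need to be adapted to the real layout.

```python
import os
import random

# Assumption: files with these extensions are the images to list.
IMAGE_EXTENSIONS = (".png", ".jpg", ".jpeg", ".bmp")


def build_image_list(root_dir, seed=0):
    """Return 'path label' lines for images stored as root_dir/<class_name>/<image>.

    Integer labels are assigned by sorting the class folder names, starting at 0.
    """
    class_names = sorted(
        name for name in os.listdir(root_dir)
        if os.path.isdir(os.path.join(root_dir, name))
    )
    lines = []
    for label, class_name in enumerate(class_names):
        class_dir = os.path.join(root_dir, class_name)
        for file_name in sorted(os.listdir(class_dir)):
            if file_name.lower().endswith(IMAGE_EXTENSIONS):
                image_path = os.path.join(class_dir, file_name)
                lines.append("%s %d" % (image_path, label))
    # Shuffle once with a fixed seed so the order is mixed but reproducible.
    random.Random(seed).shuffle(lines)
    return lines


def write_image_list(lines, list_path):
    """Write one 'path label' entry per line to list_path."""
    with open(list_path, "w") as handle:
        handle.write("\n".join(lines) + "\n")
```

Calling write_image_list(build_image_list(some_folder), "train.txt") would produce a file that either option above can consume: as the list passed to convert_imageset, or as the source of the image data layer.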

If you want to do some preprocessing of the data without saving the preprocessed files to disk, you can use the transformations provided by Caffe (mirroring and cropping; see http://caffe.berkeleyvision.org/tutorial/data.html for information) or implement your own DataTransformer.
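
In Caffe these two transformations are switched on through the mirror and crop_size fields of a layer's transform_param. To make clear what they do to an image, here is a plain-Python sketch of a random crop followed by a random horizontal flip. It is an illustration only, not Caffe's implementation: the image is assumed to be a simple list of pixel rows, and the function names are hypothetical.

```python
import random


def mirror(image):
    """Flip an image horizontally; image is a list of rows of pixel values."""
    return [list(reversed(row)) for row in image]


def crop(image, top, left, size):
    """Cut a size x size window whose top-left corner is at (top, left)."""
    return [row[left:left + size] for row in image[top:top + size]]


def random_transform(image, crop_size, rng):
    """Apply a random crop, then mirror the patch with probability 0.5."""
    height, width = len(image), len(image[0])
    if crop_size > height or crop_size > width:
        raise ValueError("crop_size must fit inside the image")
    top = rng.randint(0, height - crop_size)    # randint is inclusive on both ends
    left = rng.randint(0, width - crop_size)
    patch = crop(image, top, left, crop_size)
    return mirror(patch) if rng.random() < 0.5 else patch
```

Because the transformation is applied each time an image is loaded, nothing extra is written to disk, which is the point of this approach when the preprocessed dataset would otherwise reach tens of gigabytes.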
