How to read MNIST data in C++?
Question
I'm having trouble reading the MNIST database of handwritten digits in C++.
The files are in a binary format. I know how to read binary files in general, but I don't know the exact layout of the MNIST files.
So I'd like to ask people who have already read the MNIST data: what is the file format, and do you have any suggestions for how to read this data in C++?
Recommended answer
I did some work with the MNIST data recently. Here's some code that I wrote in Java that should be pretty easy for you to port over:
import net.vivin.digit.DigitImage;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.ByteBuffer;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/**
 * Created by IntelliJ IDEA.
 * User: vivin
 * Date: 11/11/11
 * Time: 10:07 AM
 */
public class DigitImageLoadingService {

    private String labelFileName;
    private String imageFileName;

    /** the following constants are defined as per the values described at http://yann.lecun.com/exdb/mnist/ **/

    private static final int MAGIC_OFFSET = 0;
    private static final int OFFSET_SIZE = 4; //in bytes

    private static final int LABEL_MAGIC = 2049;
    private static final int IMAGE_MAGIC = 2051;

    private static final int NUMBER_ITEMS_OFFSET = 4;
    private static final int ITEMS_SIZE = 4;

    private static final int NUMBER_OF_ROWS_OFFSET = 8;
    private static final int ROWS_SIZE = 4;
    public static final int ROWS = 28;

    private static final int NUMBER_OF_COLUMNS_OFFSET = 12;
    private static final int COLUMNS_SIZE = 4;
    public static final int COLUMNS = 28;

    private static final int IMAGE_OFFSET = 16;
    private static final int IMAGE_SIZE = ROWS * COLUMNS;

    public DigitImageLoadingService(String labelFileName, String imageFileName) {
        this.labelFileName = labelFileName;
        this.imageFileName = imageFileName;
    }

    public List<DigitImage> loadDigitImages() throws IOException {
        List<DigitImage> images = new ArrayList<DigitImage>();

        ByteArrayOutputStream labelBuffer = new ByteArrayOutputStream();
        ByteArrayOutputStream imageBuffer = new ByteArrayOutputStream();

        // Read both files fully into memory; try-with-resources closes the streams.
        try (InputStream labelInputStream = this.getClass().getResourceAsStream(labelFileName);
             InputStream imageInputStream = this.getClass().getResourceAsStream(imageFileName)) {

            // getResourceAsStream returns null when the resource cannot be found.
            if (labelInputStream == null || imageInputStream == null) {
                throw new IOException("Could not find the label or image file!");
            }

            int read;
            byte[] buffer = new byte[16384];

            while ((read = labelInputStream.read(buffer, 0, buffer.length)) != -1) {
                labelBuffer.write(buffer, 0, read);
            }
            labelBuffer.flush();

            while ((read = imageInputStream.read(buffer, 0, buffer.length)) != -1) {
                imageBuffer.write(buffer, 0, read);
            }
            imageBuffer.flush();
        }

        byte[] labelBytes = labelBuffer.toByteArray();
        byte[] imageBytes = imageBuffer.toByteArray();

        // Each file starts with a 4-byte magic number. ByteBuffer reads big-endian
        // by default, which matches the byte order of the MNIST headers.
        byte[] labelMagic = Arrays.copyOfRange(labelBytes, 0, OFFSET_SIZE);
        byte[] imageMagic = Arrays.copyOfRange(imageBytes, 0, OFFSET_SIZE);

        if (ByteBuffer.wrap(labelMagic).getInt() != LABEL_MAGIC) {
            throw new IOException("Bad magic number in label file!");
        }

        if (ByteBuffer.wrap(imageMagic).getInt() != IMAGE_MAGIC) {
            throw new IOException("Bad magic number in image file!");
        }

        // The item count follows the magic number in both files.
        int numberOfLabels = ByteBuffer.wrap(Arrays.copyOfRange(labelBytes, NUMBER_ITEMS_OFFSET, NUMBER_ITEMS_OFFSET + ITEMS_SIZE)).getInt();
        int numberOfImages = ByteBuffer.wrap(Arrays.copyOfRange(imageBytes, NUMBER_ITEMS_OFFSET, NUMBER_ITEMS_OFFSET + ITEMS_SIZE)).getInt();

        if (numberOfImages != numberOfLabels) {
            throw new IOException("The number of labels and images do not match!");
        }

        // The image file header additionally stores the row and column counts.
        int numRows = ByteBuffer.wrap(Arrays.copyOfRange(imageBytes, NUMBER_OF_ROWS_OFFSET, NUMBER_OF_ROWS_OFFSET + ROWS_SIZE)).getInt();
        int numCols = ByteBuffer.wrap(Arrays.copyOfRange(imageBytes, NUMBER_OF_COLUMNS_OFFSET, NUMBER_OF_COLUMNS_OFFSET + COLUMNS_SIZE)).getInt();

        // Reject the file if either dimension differs from the expected 28x28.
        if (numRows != ROWS || numCols != COLUMNS) {
            throw new IOException("Bad image. Rows and columns do not equal " + ROWS + "x" + COLUMNS);
        }

        // Labels are one byte each after the 8-byte label header; images are
        // IMAGE_SIZE bytes each after the 16-byte image header.
        for (int i = 0; i < numberOfLabels; i++) {
            int label = labelBytes[OFFSET_SIZE + ITEMS_SIZE + i];
            byte[] imageData = Arrays.copyOfRange(imageBytes, (i * IMAGE_SIZE) + IMAGE_OFFSET, (i * IMAGE_SIZE) + IMAGE_OFFSET + IMAGE_SIZE);

            images.add(new DigitImage(label, imageData));
        }

        return images;
    }
}