What is the class of this image?

The following common datasets are the main benchmarks for measuring an algorithm's classification accuracy:

  • mnist, cifar-10, cifar-100, stl-10
  • svhn, ILSVRC2012 task 1

1. cifar-10

CIFAR-10 and CIFAR-100 datasets

  • cifar-10-batches-py (Python interface)

    import os
    import pickle

    import numpy as np


    def load_CIFAR10_batch(filename):
        # Load one batch file and return (images, labels).
        with open(filename, 'rb') as f:
            data = pickle.load(f, encoding='latin1')
        X = data['data']
        y = data['labels']
        # (N, 3072) -> (N, 3, 32, 32) -> (N, 32, 32, 3), i.e. channels last
        X = X.reshape(-1, 3, 32, 32).transpose(0, 2, 3, 1).astype(np.float32)
        y = np.array(y)
        return X, y


    def load_CIFAR10(root):
        # Concatenate the five training batches, then load the test batch.
        xs, ys = [], []
        for n in range(1, 6):
            filename = os.path.join(root, 'data_batch_{}'.format(n))
            X, y = load_CIFAR10_batch(filename)
            xs.append(X)
            ys.append(y)
        Xtr = np.concatenate(xs)
        Ytr = np.concatenate(ys)
        Xte, Yte = load_CIFAR10_batch(os.path.join(root, 'test_batch'))
        return Xtr, Ytr, Xte, Yte

    The metadata file describing the dataset (batches.meta) can be loaded with pickle.load in the same way, and the result is again a dictionary:

    with open('batches.meta', 'rb') as f:
        data = pickle.load(f, encoding='latin1')
    print(data)

    {'label_names': ['airplane',
                     'automobile',
                     'bird',
                     'cat',
                     'deer',
                     'dog',
                     'frog',
                     'horse',
                     'ship',
                     'truck'],
     'num_cases_per_batch': 10000,
     'num_vis': 3072}
  • cifar-10-batches-mat (MATLAB interface)

    The most convenient way is to call the ready-made helper that ships with MATLAB, helperCIFAR10Data.download/load; alternatively, run edit helperCIFAR10Data to inspect its implementation:

    function [train_x, train_y, test_x, test_y] = load_cifar(filepath)
        train_x = []; train_y = [];
        for i = 1:5
            filename = fullfile(filepath, sprintf('data_batch_%d.mat', i));
            [batch_train, batch_labels] = load_batch_as_4d_tensor(filename, true);
            train_x = cat(4, train_x, batch_train);
            train_y = [train_y; batch_labels];
        end
        filename = fullfile(filepath, 'test_batch.mat');
        [test_x, test_y] = load_batch_as_4d_tensor(filename, true);
    end

    function [train_x, train_y] = load_batch_as_4d_tensor(filename, to_categorical)
        % train_x is returned as a 4-D tensor of size 32 x 32 x 3 x num
        if ~exist('to_categorical', 'var') || isempty(to_categorical)
            to_categorical = false;
        end
        load(filename);  % brings data and labels into the workspace
        train_x = reshape(data', 32, 32, 3, []);
        train_x = permute(train_x, [2, 1, 3, 4]);  % swap the first and second dimensions
        train_y = labels;
        if to_categorical
            metafile = fullfile(fileparts(filename), 'batches.meta.mat');
            load(metafile);  % brings label_names into the workspace
            train_y = categorical(train_y, 0:9, label_names);
        end
    end
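
The reshape and transpose step in the Python loader above is easy to get subtly wrong, so it may help to check it on synthetic data. The following is a minimal sketch, not part of the original loader: make_fake_batch and load_batch are hypothetical helper names, and the fake file only imitates the 'data' / 'labels' dictionary layout of a real batch. It assumes each 3072-value row stores the red, green and blue planes one after another, each plane in row-major order.

```python
import os
import pickle
import tempfile

import numpy as np

# Class names as listed in batches.meta above.
LABEL_NAMES = ['airplane', 'automobile', 'bird', 'cat', 'deer',
               'dog', 'frog', 'horse', 'ship', 'truck']


def make_fake_batch(path, num_images=4, seed=0):
    """Hypothetical helper: write a synthetic file laid out like a CIFAR-10 batch."""
    rng = np.random.default_rng(seed)
    data = rng.integers(0, 256, size=(num_images, 3072), dtype=np.uint8)
    labels = [int(v) for v in rng.integers(0, 10, size=num_images)]
    with open(path, 'wb') as f:
        pickle.dump({'data': data, 'labels': labels}, f)
    return data, labels


def load_batch(path):
    """Same reshape/transpose steps as load_CIFAR10_batch above."""
    with open(path, 'rb') as f:
        batch = pickle.load(f, encoding='latin1')
    X = batch['data'].reshape(-1, 3, 32, 32).transpose(0, 2, 3, 1)
    return X.astype(np.float32), np.array(batch['labels'])


with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, 'fake_batch')
    raw, labels = make_fake_batch(path)
    X, y = load_batch(path)

print(X.shape)  # (4, 32, 32, 3)

# Pixel (row, col) of channel k in image i should come from
# flat index k * 1024 + row * 32 + col of the original row.
i, row, col, k = 1, 5, 7, 2
print(X[i, row, col, k] == raw[i, k * 1024 + row * 32 + col])

# Integer labels index directly into the label_names list.
names = [LABEL_NAMES[label] for label in y]
print(names)
```

If the index check prints True for several choices of i, row, col and k, the loader is producing channels-last images; the same lookup into label_names turns the integer labels into readable class names.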