Problem description
I'm training and testing ML models on a dataset containing images of multiple sizes. I know Keras allows us to extract a random patch of fixed size using the target_size parameter:
from keras.preprocessing.image import ImageDataGenerator

gen = ImageDataGenerator(width_shift_range=.9, height_shift_range=.9)
data = gen.flow_from_directory('/path/to/dataset/train',
                               target_size=(224, 224),
                               classes=10,
                               batch_size=32,
                               seed=0)
for _ in range(data.N // data.batch_size):
    X, y = next(data)
For each iteration, X contains 32 patches (one for each different sample). Across all iterations, I have access to one patch of each sample in the dataset.
Question: what is the best way to extract MULTIPLE patches of the same sample?
Something like:
data = gen.flow_from_directory(..., nb_patches=10)
X, y = next(data)
# X contains 320 rows (10 patches for each of the 32 samples in the batch)
I know I can write a second for loop and iterate multiple times over the dataset, but this seems a bit messy. I would also like a stronger guarantee that I am really fetching patches of the same sample.
Recommended answer
I decided to implement it myself. This is how it ended up:
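import os

import numpy as np
from PIL import Image
from keras.preprocessing.image import img_to_array

n_patches = 10
image_shape = (224, 224)  # (height, width) of a patch; assumed here, taken from target_size in the question
labels = ('class1', 'class2', ...)

X, y = [], []
for label_id, label in enumerate(labels):
    data_dir = os.path.join('path-to-dir', label)
    for name in os.listdir(data_dir):
        full_name = os.path.join(data_dir, name)
        img = Image.open(full_name).convert('RGB')

        patches = []
        for _ in range(n_patches):
            # Random top-left corner (x, y); assumes the image is at least as large as the patch.
            start = (np.random.rand(2) * (img.width - image_shape[1],
                                          img.height - image_shape[0])).astype('int')
            end = start + (image_shape[1], image_shape[0])
            patches.append(img_to_array(img.crop((start[0], start[1],
                                                  end[0], end[1]))))

        X.append(patches)
        # Store the class index: the label strings cannot be cast to float.
        y.append(label_id)

# X has shape (n_samples, n_patches, height, width, 3); y has one entry per sample.
# np.float was removed from recent NumPy releases; the builtin float is equivalent.
X, y = np.array(X, dtype=float), np.array(y, dtype=float)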
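The loop above leaves X with one row per sample and a separate axis for the patches, while the question asked for one row per patch (320 rows for a batch of 32 samples and 10 patches each). As a rough sketch of how the two fit together, the snippet below does the same random cropping on plain NumPy arrays and then merges the sample and patch axes. The helper names extract_random_patches and flatten_patches are hypothetical (they are not part of Keras), and the toy images are random arrays standing in for real files.

```python
import numpy as np


def extract_random_patches(image, patch_shape, n_patches, rng=None):
    """Return n_patches random crops of size patch_shape from one image.

    image: array of shape (height, width, channels).
    patch_shape: (patch_height, patch_width).
    Hypothetical helper for illustration; not a Keras function.
    """
    rng = np.random.default_rng() if rng is None else rng
    height, width = image.shape[:2]
    patch_h, patch_w = patch_shape
    if patch_h > height or patch_w > width:
        raise ValueError("patch is larger than the image")

    patches = []
    for _ in range(n_patches):
        # Pick the top-left corner so the whole patch stays inside the image.
        top = rng.integers(0, height - patch_h + 1)
        left = rng.integers(0, width - patch_w + 1)
        patches.append(image[top:top + patch_h, left:left + patch_w])
    return np.stack(patches)


def flatten_patches(X, y):
    """Merge the sample and patch axes, repeating each label once per patch."""
    X = np.asarray(X)
    y = np.asarray(y)
    n_samples, n_patches = X.shape[:2]
    X_flat = X.reshape((n_samples * n_patches,) + X.shape[2:])
    y_flat = np.repeat(y, n_patches, axis=0)
    return X_flat, y_flat


# Toy data: three "images" of different sizes with their class indices.
rng = np.random.default_rng(0)
images = [rng.random((h, w, 3)) for h, w in [(30, 40), (25, 25), (22, 50)]]
labels = np.array([0, 1, 1])

# All patches of one sample are produced together, so they share its label.
X = np.stack([extract_random_patches(img, (22, 22), 10, rng) for img in images])
X_flat, y_flat = flatten_patches(X, labels)

print(X.shape)       # (3, 10, 22, 22, 3)
print(X_flat.shape)  # (30, 22, 22, 3)
```

Because every sample's patches are generated in a single call and stay contiguous after flattening, rows i * n_patches through (i + 1) * n_patches - 1 of X_flat always belong to sample i, which gives the "same sample" guarantee the question asked for.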