Problem description
My training images are downscaled versions of their associated HR images, so the input and output images don't have the same dimensions. For now, I'm using a hand-crafted sample of 13 images, but eventually I would like to be able to use my dataset of 500-ish HR (high-resolution) images. The images in this dataset, however, don't all have the same dimensions, so I'm guessing I'll have to crop them to obtain a uniform size.
I currently have this code set up: it takes a bunch of 512x512x3 images and applies a few transformations to augment the data (flips). I thus obtain a basic set of 39 images in their HR form, and then I downscale them by a factor of 4, thus obtaining my training set, which consists of 39 images of dimension 128x128x3.
import numpy as np
from keras.preprocessing.image import ImageDataGenerator
import matplotlib.image as mpimg
import skimage
from skimage import transform
from constants import data_path
from constants import img_width
from constants import img_height
from model import setUpModel

def setUpImages():
    train = []
    finalTest = []
    sample_amnt = 11
    max_amnt = 13
    # Extracting images (512x512)
    for i in range(sample_amnt):
        train.append(mpimg.imread(data_path + str(i) + '.jpg'))
    for i in range(max_amnt-sample_amnt):
        finalTest.append(mpimg.imread(data_path + str(i+sample_amnt) + '.jpg'))
    # # TODO: https://keras.io/preprocessing/image/
    # ImageDataGenerator(featurewise_center=False, samplewise_center=False, featurewise_std_normalization=False,
    #                    samplewise_std_normalization=False, zca_whitening=False, zca_epsilon=1e-06, rotation_range=0,
    #                    width_shift_range=0.0, height_shift_range=0.0, brightness_range=None, shear_range=0.0,
    #                    zoom_range=0.0, channel_shift_range=0.0, fill_mode='nearest', cval=0.0, horizontal_flip=False,
    #                    vertical_flip=False, rescale=None, preprocessing_function=None, data_format=None,
    #                    validation_split=0.0, dtype=None)
    # Augmenting data
    trainData = dataAugmentation(train)
    testData = dataAugmentation(finalTest)
    setUpData(trainData, testData)

def setUpData(trainData, testData):
    # print(type(trainData))                          # <class 'numpy.ndarray'>
    # print(len(trainData))                           # 64
    # print(type(trainData[0]))                       # <class 'numpy.ndarray'>
    # print(trainData[0].shape)                       # (1400, 1400, 3)
    # print(trainData[len(trainData)//2-1].shape)     # (1400, 1400, 3)
    # print(trainData[len(trainData)//2].shape)       # (350, 350, 3)
    # print(trainData[len(trainData)-1].shape)        # (350, 350, 3)
    # TODO: subtract the mean of all images from all images
    # Separating the training data
    Y_train = trainData[:len(trainData)//2]    # First half is the unaltered data
    X_train = trainData[len(trainData)//2:]    # Second half is the deteriorated data
    # Separating the testing data
    Y_test = testData[:len(testData)//2]  # First half is the unaltered data
    X_test = testData[len(testData)//2:]  # Second half is the deteriorated data
    # Adjusting shapes for Keras input  # TODO: make into a function ?
    X_train = np.array([x for x in X_train])
    Y_train = np.array([x for x in Y_train])
    Y_test = np.array([x for x in Y_test])
    X_test = np.array([x for x in X_test])
    # # Sanity check: display four images (2x HR/LR)
    # plt.figure(figsize=(10, 10))
    # for i in range(2):
    #     plt.subplot(2, 2, i + 1)
    #     plt.imshow(Y_train[i], cmap=plt.cm.binary)
    # for i in range(2):
    #     plt.subplot(2, 2, i + 1 + 2)
    #     plt.imshow(X_train[i], cmap=plt.cm.binary)
    # plt.show()
    setUpModel(X_train, Y_train, X_test, Y_test)

# TODO: possibly remove once Keras Preprocessing is integrated?
def dataAugmentation(dataToAugment):
    print("Starting to augment data")
    arrayToFill = []
    # faster computation with values between 0 and 1 ?
    dataToAugment = np.divide(dataToAugment, 255.)
    # TODO: switch from RGB channels to CbCrY
    # # TODO: Try GrayScale
    # trainingData = np.array(
    #     [(cv2.cvtColor(np.uint8(x * 255), cv2.COLOR_BGR2GRAY) / 255).reshape(350, 350, 1) for x in trainingData])
    # validateData = np.array(
    #     [(cv2.cvtColor(np.uint8(x * 255), cv2.COLOR_BGR2GRAY) / 255).reshape(1400, 1400, 1) for x in validateData])
    # adding the normal images (8)
    for i in range(len(dataToAugment)):
        arrayToFill.append(dataToAugment[i])
    # vertical axis flip (-> 16)
    for i in range(len(arrayToFill)):
        arrayToFill.append(np.fliplr(arrayToFill[i]))
    # horizontal axis flip (-> 32)
    for i in range(len(arrayToFill)):
        arrayToFill.append(np.flipud(arrayToFill[i]))
    # downsizing by scale of 4 (-> 64 images of 128x128x3)
    for i in range(len(arrayToFill)):
        arrayToFill.append(skimage.transform.resize(
            arrayToFill[i],
            # integer division: the output shape must be made of ints
            (img_width // 4, img_height // 4),
            mode='reflect',
            anti_aliasing=True))
    # # Sanity check: display the images
    # plt.figure(figsize=(10, 10))
    # for i in range(64):
    #     plt.subplot(8, 8, i + 1)
    #     plt.imshow(arrayToFill[i], cmap=plt.cm.binary)
    # plt.show()
    return np.array(arrayToFill)
My question is: in my case, can I use the Preprocessing tool that Keras offers? Ideally, I would like to be able to input my high-quality images of varying sizes, crop them (not downsize them) to 512x512x3, and augment them through flips and whatnot. Subtracting the mean would also be part of what I'd like to achieve. That set would represent my validation set.
Reusing the validation set, I want to downscale by a factor of 4 all the images, and that would generate my training set.
Those two sets could then be split appropriately to obtain, ultimately, the famous X_train, Y_train, X_test, Y_test.
I'm just hesitant about throwing out all the work I've done so far to preprocess my mini sample, but I'm thinking if it can all be done with a single built-in function, maybe I should give that a go.
This is my first ML project, so I don't understand Keras very well yet, and the documentation isn't always the clearest. I'm thinking that because I'm working with an X and a Y that differ in size, this function may not apply to my project.
Thank you! :)
Christof Henkel's suggestion is very clean and nice. I would just like to offer another way to do it using imgaug, a convenient way to augment images in lots of different ways. It's useful if you want more implemented augmentations or if you ever need to use some ML library other than Keras.
It unfortunately doesn't have a built-in way to make crops like that, but it allows implementing custom functions. Here is an example function for generating random crops of a set size from an image that's at least as big as the chosen crop size:
import numpy as np
from imgaug import augmenters as iaa

def random_crop(images, random_state, parents, hooks):
    crop_h, crop_w = 128, 128
    new_images = []
    for img in images:
        if (img.shape[0] >= crop_h) and (img.shape[1] >= crop_w):
            # randint's upper bound is exclusive: the +1 allows the last valid
            # offset and handles images exactly as large as the crop
            rand_h = np.random.randint(0, img.shape[0] - crop_h + 1)
            rand_w = np.random.randint(0, img.shape[1] - crop_w + 1)
            new_images.append(img[rand_h:rand_h+crop_h, rand_w:rand_w+crop_w])
        else:
            # image too small to crop: fall back to a black image
            new_images.append(np.zeros((crop_h, crop_w, 3)))
    return np.array(new_images)

def keypoints_dummy(keypoints_on_images, random_state, parents, hooks):
    # keypoints are not used here, so pass them through unchanged
    return keypoints_on_images

cropper = iaa.Lambda(func_images=random_crop, func_keypoints=keypoints_dummy)
You can then combine this function with any other built-in imgaug augmenter, for example the flips that you're already using, like this:
seq = iaa.Sequential([cropper, iaa.Fliplr(0.5), iaa.Flipud(0.5)])
This function could then generate lots of different crops from each image. An example image with some possible results (note that it would result in actual (128, 128, 3) images, they are just merged into one image here for visualization):
Your image set could then be generated by:
import glob
import skimage.io

crops_per_image = 10
images = [skimage.io.imread(path) for path in glob.glob('train_data/*.jpg')]
# each image yields crops_per_image random crops, scaled to [0, 1]
augs = np.array([seq.augment_image(img)/255 for img in images for _ in range(crops_per_image)])
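Since the question needs low-resolution inputs paired with these high-resolution crops, one possible next step is to downscale every crop by a factor of 4. The sketch below is my own assumption and not part of imgaug: it uses plain NumPy block averaging (each 4x4 block becomes one pixel) as a stand-in for skimage.transform.resize, the names downscale_by_block_mean and factor are hypothetical, and the random array only stands in for augs.

```python
import numpy as np


def downscale_by_block_mean(images, factor=4):
    """Downscale a batch shaped (N, H, W, C) by averaging factor x factor blocks.

    Hypothetical helper: assumes H and W are divisible by factor.
    """
    n, h, w, c = images.shape
    if h % factor or w % factor:
        raise ValueError("height and width must be divisible by factor")
    # Split each spatial axis into (blocks, factor), then average over the factor axes.
    blocks = images.reshape(n, h // factor, factor, w // factor, factor, c)
    return blocks.mean(axis=(2, 4))


# Stand-in for the `augs` array: 5 random crops of 128x128x3 with values in [0, 1).
rng = np.random.default_rng(0)
hr_crops = rng.random((5, 128, 128, 3))

Y = hr_crops                           # high-resolution targets
X = downscale_by_block_mean(hr_crops)  # low-resolution inputs, 32x32x3 each
```

Because X is derived from the already cropped and flipped Y, each low-resolution input stays aligned with its high-resolution target.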
It would also be simple to add new functions to be applied to the images, for example the mean subtraction you mentioned.
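As a minimal sketch of such a function, assuming the crops are already stacked in a float array shaped (N, H, W, C) like augs, the per-channel mean can be computed once on the training images and reused for the test images. The helper name subtract_mean and the random arrays are hypothetical and only illustrate the idea.

```python
import numpy as np


def subtract_mean(images, mean=None):
    """Subtract a per-channel mean from a batch shaped (N, H, W, C).

    If mean is None it is computed from `images`; pass the training mean
    back in when processing validation or test data.
    """
    if mean is None:
        mean = images.mean(axis=(0, 1, 2))
    return images - mean, mean


# Stand-in data: random arrays in place of real training and test crops.
rng = np.random.default_rng(0)
train_imgs = rng.random((10, 128, 128, 3))
test_imgs = rng.random((4, 128, 128, 3))

train_centered, channel_mean = subtract_mean(train_imgs)
test_centered, _ = subtract_mean(test_imgs, mean=channel_mean)
```

Reusing the training mean for the test set keeps both sets on the same scale; the mean has to be added back to the network's output to recover a displayable image.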