This article covers how to train on high-resolution images using TensorFlow and the pre-trained Inception V3 model.

Question

I'm looking to do image classification on PDF documents that I convert to images. I'm using the pre-trained TensorFlow Inception V3 model and retraining the last layer with my own categories, following the TensorFlow tutorial. I have about 1,000 training images per category and only 4 categories. With 200k iterations I can reach up to 90% correct classifications, which is not bad but still needs some work.

The issue is that this pre-trained model only takes images of roughly 300×300 pixels as input (299×299 is the Inception V3 default). Downscaling to that size obviously garbles the characters that make up the features I'm trying to recognize in the documents.
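To put rough numbers on the resolution problem, here is a back-of-the-envelope sketch. It assumes an A4 page whose long side is scaled to the target size and body text set at 12 pt; both are illustrative assumptions, and `glyph_height_px` is a hypothetical helper name, not part of any library.

```python
POINTS_PER_INCH = 72.0
A4_LONG_SIDE_IN = 11.69  # assumption: A4 page, long side in inches


def glyph_height_px(target_px, font_pt=12.0, page_long_side_in=A4_LONG_SIDE_IN):
    """Approximate height in pixels of text of a given point size
    once the page's long side has been scaled to target_px pixels."""
    pixels_per_inch = target_px / page_long_side_in
    return font_pt / POINTS_PER_INCH * pixels_per_inch


for side in (299, 600, 1200):
    height = glyph_height_px(side)
    print(f"{side:>5} px page: 12 pt text is about {height:.1f} px tall")
```

Under these assumptions a line of 12 pt text is only a few pixels tall at 299 pixels, which is consistent with the characters becoming unreadable; doubling the input size doubles the glyph height.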

Would it be possible to alter the model's input layer so that I can feed it higher-resolution images?

Would I get better results with a much simpler, home-made model?

If so, where should I start in building a model for this kind of image classification?

Recommended answer

If you want to use a different image resolution from the one the pre-trained model was built for, keep only the convolutional blocks and add a new set of fully connected layers sized for the new input. A higher-level library such as Keras makes this a lot easier. Below is an example of how to do that in Keras.

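from keras.applications.inception_v3 import InceptionV3
from keras.layers import Dense, GlobalAveragePooling2D
from keras.models import Model

# Load only the convolutional blocks (no classifier head) at the new input size.
base_model = InceptionV3(include_top=False, input_shape=(600, 600, 3), weights='imagenet')

# Build a new head: pool each feature map to a single value, then classify.
x = base_model.output
x = GlobalAveragePooling2D()(x)
x = Dense(1024, activation='relu')(x)
# Add as many dense (fully connected) layers as required.
pred = Dense(10, activation='softmax')(x)  # 10 = number of classes; use your own count
model = Model(base_model.input, pred)

# Freeze the pre-trained convolutional layers so that only the new head is trained.
for layer in base_model.layers:
    layer.trainable = False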

Setting include_top=False gives you only the convolutional blocks. Use input_shape=(600,600,3) to set the input shape you need. You can then add a couple of dense (fully connected) layers to the model. The last layer must have as many units as you have categories; the 10 in the example is the number of classes. With this approach you reuse all the pre-trained weights of the convolutional layers and train only the final dense layers.
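To see why the new head has to be sized for the new input, the following sketch works out the spatial size of the final feature map for a given input size. It is plain arithmetic, not a call into Keras: the list of size-changing layers reflects my reading of the Keras InceptionV3 layout (3x3 "valid" stem layers followed by two stride-2 reduction blocks), and the helper names are hypothetical.

```python
FEATURE_CHANNELS = 2048  # channels in InceptionV3's final feature map


def valid_out(size, kernel, stride=1):
    """Output length of a 'valid' (unpadded) conv or pooling layer."""
    return (size - kernel) // stride + 1


def inception_v3_feature_size(input_size):
    """Side length of the final feature map for a square input.

    Only layers that change the spatial size are listed; 'same'-padded
    and 1x1 layers leave it unchanged and are skipped.
    """
    s = valid_out(input_size, 3, stride=2)  # stem conv, stride 2
    s = valid_out(s, 3)                     # stem conv
    s = valid_out(s, 3, stride=2)           # max pooling
    s = valid_out(s, 3)                     # stem conv
    s = valid_out(s, 3, stride=2)           # max pooling
    s = valid_out(s, 3, stride=2)           # first reduction block
    s = valid_out(s, 3, stride=2)           # second reduction block
    return s


def dense_params(inputs, units=1024):
    """Weights plus biases of a dense layer."""
    return inputs * units + units


for size in (299, 600):
    side = inception_v3_feature_size(size)
    flat = side * side * FEATURE_CHANNELS
    print(f"input {size}: feature map {side}x{side}x{FEATURE_CHANNELS}")
    print(f"  Flatten -> Dense(1024): {dense_params(flat):,} parameters")
    print(f"  GlobalAveragePooling2D -> Dense(1024): {dense_params(FEATURE_CHANNELS):,} parameters")
```

The feature map grows with the input, so a head built on Flatten would need a much larger first dense layer at 600×600. Global average pooling collapses each channel to one value, which keeps the head the same size regardless of input resolution.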


08-28 21:55