Simple (working) handwritten digit recognition: how to improve it?

Problem description

I just wrote this very simple handwritten digit recognition script. In my example, the test digit is recognized correctly.

In short, each digit of the database (50x50 pixels = 2500 coefficients) is summarized into a 10-coefficient vector (by keeping the 10 biggest singular values, see Low-rank approximation with SVD).

Then, to recognize a digit, we find the database digit whose 10-coefficient vector is closest to that of the digit to be recognized.
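Written out, the two steps above look roughly as follows. This is only a restatement of the description in formulas; the symbols $M_d$ (database image of digit $d$), $M_x$ (image to be recognized) and $s(\cdot)$ (the vector of the 10 biggest singular values) are notation introduced here for illustration.

```latex
% Singular value decomposition of a 50x50 image M, singular values sorted in decreasing order
M = U \Sigma V^{T} = \sum_{i=1}^{50} \sigma_i \, u_i v_i^{T},
\qquad \sigma_1 \ge \sigma_2 \ge \dots \ge \sigma_{50} \ge 0

% Rank-10 approximation obtained by keeping the 10 biggest singular values
M^{(10)} = \sum_{i=1}^{10} \sigma_i \, u_i v_i^{T}

% 10-coefficient summary of the image
s(M) = (\sigma_1, \dots, \sigma_{10})

% Nearest-neighbour decision over the database digits 0..9
\hat{d} = \arg\min_{d \in \{0,\dots,9\}} \; \lVert s(M_d) - s(M_x) \rVert_2^{2}
```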

from PIL import Image
import numpy as np
import matplotlib.pyplot as plt

digits = []
for i in range(11):
    # scipy.misc.imread was removed in SciPy 1.2; Pillow is used here instead (assumes Pillow is installed)
    M = np.asarray(Image.open(str(i) + '.png').convert('L'), dtype=float)   # load as a grayscale float array
    U, s, Vt = np.linalg.svd(M, full_matrices=False)
    s[10:] = 0        # keep the 10 biggest singular values only, discard others
    S = np.diag(s)
    M_reduced = U @ S @ Vt      # reconstruction of the image from the 10 biggest singular values
    digits.append({'original': M, 'singular': s[:10], 'reduced': M_reduced})

# each 50x50 pixels digit is summarized into a vector of 10 coefficients : the 10 biggest singular values s[:10]

# 0.png to 9.png = all the digits (for machine training)
# 10.png = the digit to be recognized
toberecognizeddigit = digits[10]
digits = digits[:10]

# we find the nearest neighbour by minimizing the squared distance between the singular values of toberecognizeddigit and those of each digit in the database
recognizeddigit = min(digits, key=lambda d: np.sum((d['singular'] - toberecognizeddigit['singular'])**2))

plt.imshow(toberecognizeddigit['reduced'], interpolation='nearest', cmap=plt.cm.Greys_r)
plt.show()
plt.imshow(recognizeddigit['reduced'], interpolation='nearest', cmap=plt.cm.Greys_r)
plt.show()

Question:

The code works (you can run the code in the ZIP archive), but how can we improve it to have better results? (mostly math techniques I imagine).

For example, in my tests, 9 and 3 are sometimes confused with each other.
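Confusions like this are at least plausible, because the 10 singular values discard where the ink is located: a flipped, mirrored, rotated or transposed copy of an image has exactly the same singular values as the original. Whether this is what causes the 9/3 confusion here is not established. The sketch below only demonstrates the invariance, using a random array as a hypothetical stand-in for a digit image; the helper name `top_singular_values` is made up for this example.

```python
import numpy as np

rng = np.random.default_rng(0)
# Hypothetical stand-in for a 50x50 grayscale digit image (random values, not real data)
M = rng.random((50, 50))

def top_singular_values(image, k=10):
    """Return the k largest singular values of a 2-D array (sorted in decreasing order)."""
    return np.linalg.svd(image, compute_uv=False)[:k]

original = top_singular_values(M)
variants = {
    'upside down': M[::-1, :],
    'mirrored': M[:, ::-1],
    'rotated 180 degrees': np.rot90(M, 2),
    'transposed': M.T,
}

# Every transformed image yields the same feature vector as the original
same_features = all(np.allclose(original, top_singular_values(v)) for v in variants.values())
print(same_features)
```

In other words, the distance used above cannot separate two images that differ only by such a transformation, which is one argument for adding features that do depend on the spatial layout.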

Recommended answer

Digit recognition can be quite a difficult area, especially when the digits are written in very different or unclear ways. Many approaches have been tried to solve this problem, and entire competitions are dedicated to it. For an example, see Kaggle's digit recognizer competition, which is based on the well-known MNIST data set. In its forums you will find a lot of ideas and approaches to this problem, but I will give some quick suggestions.

A lot of people approach this problem as a classification problem. Possible algorithms to solve such problems include, for example, kNN, neural networks, or gradient boosting.
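As an illustration of the first of these, here is a minimal kNN sketch in plain NumPy. It assumes several labelled examples per digit are available (the question's database has only one image per digit, which reduces kNN to the nearest-neighbour search already used). The function name `knn_predict` and the toy 2-D feature vectors are hypothetical.

```python
import numpy as np

def knn_predict(train_X, train_y, x, k=3):
    """Predict the label of x by majority vote among its k nearest training samples."""
    distances = np.sum((train_X - x) ** 2, axis=1)   # squared Euclidean distance to each sample
    nearest = np.argsort(distances)[:k]              # indices of the k closest samples
    labels, counts = np.unique(train_y[nearest], return_counts=True)
    return labels[np.argmax(counts)]                 # ties go to the smallest label

# Hypothetical toy data: two clusters of 2-D feature vectors labelled 3 and 9
train_X = np.array([[0.0, 0.0], [0.1, 0.2], [0.2, 0.1],
                    [1.0, 1.0], [0.9, 1.1], [1.1, 0.9]])
train_y = np.array([3, 3, 3, 9, 9, 9])

prediction = knn_predict(train_X, train_y, np.array([0.95, 1.0]), k=3)
print(prediction)
```

With real data, each row of `train_X` would be the feature vector of one training image, for example its 10 biggest singular values.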

However, the algorithm alone is generally not enough to get optimal classification rates. Another important aspect to improve your scores is feature extraction. The idea is to calculate features that make it possible to distinguish between different digits. Some example features for this dataset might include the number of colored pixels, or maybe the width and the height of the digits.
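The example features mentioned above could be computed along these lines. This is a sketch under stated assumptions: the image is a 2-D grayscale array with dark ink on a light background, and a pixel counts as "colored" when its value is below a threshold of 128. Both the threshold and the function name `simple_features` are choices made for this example.

```python
import numpy as np

def simple_features(image, threshold=128):
    """Return (ink pixel count, width, height) of the digit in a grayscale image.

    Assumes dark ink on a light background: pixels below `threshold` count as ink.
    """
    ink = np.asarray(image) < threshold
    if not ink.any():
        return np.array([0, 0, 0])           # blank image: no ink, no bounding box
    rows = np.flatnonzero(ink.any(axis=1))   # indices of rows containing ink
    cols = np.flatnonzero(ink.any(axis=0))   # indices of columns containing ink
    height = rows[-1] - rows[0] + 1
    width = cols[-1] - cols[0] + 1
    return np.array([ink.sum(), width, height])

# Hypothetical 50x50 test image: white background with a dark vertical bar (a crude "1")
image = np.full((50, 50), 255)
image[10:40, 24:27] = 0

features = simple_features(image)
print(features)
```

For this bar, the ink count is 90, the width 3 and the height 30. Unlike the singular values, the width and height change when the image is transposed.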

Although the other algorithms might not be what you are looking for, it is possible that adding more features can improve the performance of the algorithm you are currently using as well.
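One way to add features to the existing nearest-neighbour search is to append them to the singular-value vector. A caveat, which is an assumption of this sketch rather than something stated in the answer: singular values can be orders of magnitude larger than a feature such as the digit width, so without rescaling the extra feature may have almost no influence on the distance. The sketch below standardizes each feature column before comparing; the function name `nearest_index` and all numbers are hypothetical.

```python
import numpy as np

def nearest_index(database, query, standardize=True):
    """Return the index of the database row closest to query (squared Euclidean distance).

    If standardize is True, each feature column is rescaled to zero mean and unit
    standard deviation (computed over the database) before distances are measured.
    """
    database = np.asarray(database, dtype=float)
    query = np.asarray(query, dtype=float)
    if standardize:
        mean = database.mean(axis=0)
        std = database.std(axis=0)
        std[std == 0] = 1.0                  # constant features: avoid division by zero
        database = (database - mean) / std
        query = (query - mean) / std
    return int(np.argmin(np.sum((database - query) ** 2, axis=1)))

# Hypothetical feature rows: [largest singular value, digit width in pixels]
database = np.array([[5000.0, 10.0],
                     [5100.0, 30.0],
                     [5600.0, 12.0]])
query = np.array([5080.0, 11.0])

raw_choice = nearest_index(database, query, standardize=False)
scaled_choice = nearest_index(database, query, standardize=True)
print(raw_choice, scaled_choice)
```

On these made-up numbers, the raw distance picks row 1 because the singular value dominates, while the standardized distance picks row 0, whose width is close to the query's. Which behaviour is better depends on the data, so the scaling should be validated on real images.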

