This article explains how to prepare image (pixel) data in libsvm format for character recognition in Java. It should be a useful reference for anyone facing the same problem.

Problem description

I want to make a Java application that recognizes characters using libsvm, but having gotten into it, I do not understand how to train the image data for use with libsvm.

Recently, to learn it, I made a test with existing data:

I have also created 32x32-based training image data by converting each pixel to 0/1, but I don't know whether it can be used to create the libsvm training data format, or how the libsvm testing data is created.

Example of converted image pixels (0/1):

00000000000001111000000000000000
00000000000011111110000000000000
00000000001111111111000000000000
00000001111111111111100000000000
00000001111111011111100000000000
00000011111110000011110000000000
00000011111110000000111000000000
00000011111110000000111100000000
00000011111110000000011100000000
00000011111110000000011100000000
00000011111100000000011110000000
00000011111100000000001110000000
00000011111100000000001110000000
00000001111110000000000111000000
00000001111110000000000111000000
00000001111110000000000111000000
00000001111110000000000111000000
00000011111110000000001111000000
00000011110110000000001111000000
00000011110000000000011110000000
00000001111000000000001111000000
00000001111000000000011111000000
00000001111000000000111110000000
00000001111000000001111100000000
00000000111000000111111000000000
00000000111100011111110000000000
00000000111111111111110000000000
00000000011111111111110000000000
00000000011111111111100000000000
00000000001111111110000000000000
00000000000111110000000000000000
00000000000011000000000000000000
 0
00000000000001111111110000000000
00000000001111111111111000000000
00000000011111111111111100000000
00000000011111111111111100000000
00000000011111111111111110000000
00000001111111111111111100000000
00000000111110000011111100000000
00000000000000000001111100000000
00000000000000000001111100000000
00000000000000000001111100000000
00000000000000000011111000000000
00000000000000000111111000000000
00000000000000000111111000000000
00000000000000000111111000000000
00000000000000001111110000000000
00000000011111111111111111000000
00000000111111111111111111100000
00000000111111111111111111100000
00000000111111111111111111100000
00000001111111111111111110000000
00000001111111111110000000000000
00000001111111111110000000000000
00000000111111111110000000000000
00000000000011111000000000000000
00000000000011111000000000000000
00000000000011111000000000000000
00000000000111111000000000000000
00000000000111111000000000000000
00000000001111110000000000000000
00000000011111110000000000000000
00000000001111100000000000000000
00000000001111100000000000000000
 7

How do I get from this to libsvm training and testing data?

Recommended answer

libsvm has a specific data format; each line is one training/testing vector in the form of:
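
<label> <index1>:<value1> <index2>:<value2> ... <indexN>:<valueN>

where <label> is the class of the vector and the feature indices are positive integers given in ascending order.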

So in the most "naive" method, you simply convert the matrix representation into a row representation by concatenating consecutive rows, so an image like

010
011
000

becomes

010011000

and in the libsvm format (assuming we label it with "5"):

5 1:0 2:1 3:0 4:0 5:1 6:1 7:0 8:0 9:0

As libsvm supports a "sparse" representation, you can omit features with value 0:

5 2:1 5:1 6:1

This is the manual way; sample data in this format is located here: http://www.csie.ntu.edu.tw/~cjlin/libsvmtools/datasets/binary/a1a
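
As a rough sketch of this manual conversion in Java (the question's language), here is how a binary character grid of '0'/'1' rows, like the 32x32 examples above, could be turned into a sparse libsvm line; the class and method names are only illustrative:

import java.util.List;

/* Sketch: turn one binary character grid into a sparse libsvm line. */
public class LibsvmLineBuilder {

    /* rows: strings of '0'/'1' characters; label: the class of the sample. */
    public static String toLibsvmLine(int label, List<String> rows) {
        StringBuilder sb = new StringBuilder();
        sb.append(label);
        int index = 1;                        // libsvm feature indices are 1-based
        for (String row : rows) {
            for (char c : row.toCharArray()) {
                if (c == '1') {               // sparse format: keep only non-zero pixels
                    sb.append(' ').append(index).append(":1");
                }
                index++;
            }
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // the 3x3 toy image from above ("010" / "011" / "000"), labeled 5
        System.out.println(toLibsvmLine(5, List.of("010", "011", "000")));
    }
}

Run on the 3x3 toy image, it prints 5 2:1 5:1 6:1, matching the sparse example above.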

The easiest "automatic" way is to represent your data in .csv format (again, convert the data to the row-like representation, then to .csv), which is a quite standard approach:

...
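
For instance, the 3x3 toy image above, labeled 5, would become the following .csv row (label in the first column, then one column per pixel):

5,0,1,0,0,1,1,0,0,0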

and then use this small C program for the conversion:

/* convert csv data to libsvm/svm-light format */

#include <stdio.h>
#include <stdlib.h>
#include <string.h>

char buf[10000000];
float feature[100000];

int main(int argc, char **argv)
{
    FILE *fp;

    if(argc!=2) { fprintf(stderr,"Usage: %s filename\n",argv[0]); return 1; }
    if((fp=fopen(argv[1],"r"))==NULL)
    {
        fprintf(stderr,"Can't open input file %s\n",argv[1]);
        return 1;
    }

    while(fscanf(fp,"%[^\n]\n",buf)==1)   /* read one csv line at a time */
    {
        int i=0,j;
        char *p=strtok(buf,",");

        /* the first field is the label, the remaining fields are feature values */
        feature[i++]=atof(p);

        while((p=strtok(NULL,",")))
            feature[i++]=atof(p);

        /* print the label followed by 1-based index:value pairs */
        printf("%d ", (int) feature[0]);
        for(j=1;j<i;j++)
            printf(" %d:%f",j,feature[j]);


        printf("\n");
    }
    fclose(fp);
    return 0;
}
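
Assuming the source is saved as csv2libsvm.c, it can be compiled and run along the lines of gcc csv2libsvm.c -o csv2libsvm and then ./csv2libsvm data.csv > data.libsvm (the file names here are only placeholders).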

Both the training and the testing file have exactly the same structure; simply split your data randomly, in some proportion (3:1 or 9:1), into a training file and a testing file, but remember to include a balanced number of vectors of each class in each file. A splitting sketch follows below.
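
If you want to script that split, the following is a rough Java sketch of a per-class (stratified) 3:1 split; the file names all.libsvm, training and testing are just assumptions for the example:

import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

/* Sketch: split a libsvm file into training/testing (3:1) with per-class balance. */
public class SplitLibsvm {
    public static void main(String[] args) throws IOException {
        List<String> lines = Files.readAllLines(Path.of("all.libsvm"));

        // group vectors by their label (the first token of each line)
        Map<String, List<String>> byLabel = new TreeMap<>();
        for (String line : lines) {
            String label = line.split("\\s+", 2)[0];
            byLabel.computeIfAbsent(label, k -> new ArrayList<>()).add(line);
        }

        List<String> training = new ArrayList<>();
        List<String> testing = new ArrayList<>();
        for (List<String> group : byLabel.values()) {
            Collections.shuffle(group);          // random split within each class
            int cut = group.size() * 3 / 4;      // 3:1 proportion
            training.addAll(group.subList(0, cut));
            testing.addAll(group.subList(cut, group.size()));
        }
        Files.write(Path.of("training"), training);
        Files.write(Path.of("testing"), testing);
    }
}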

In particular, your data looks a bit like the MNIST dataset; if that is the case, it is already prepared for libsvm:

http://www.csie.ntu.edu.tw/~cjlin/libsvmtools/datasets/multiclass.html

MNIST training: http://www.csie.ntu.edu.tw/~cjlin/libsvmtools/datasets/multiclass/mnist.scale.bz2

MNIST testing: http://www.csie.ntu.edu.tw/~cjlin/libsvmtools/datasets/multiclass/mnist.scale.t.bz2

If it is possible with your data, converting your images to real-valued ones in the [0,1] interval would be more valuable than binary data (which loses a lot of information).

EDIT

As an example, if your image is an 8-bit greyscale image, then each pixel is in fact a number v between 0 and 255. What you are doing now is thresholding: setting 1 for v > T and 0 for v <= T. Mapping these values to real values instead would give the model more information. It can be done by simply squashing: v / 255. As a result, all values are in the [0,1] interval, but there are also values "in between", like 0.25.
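
For instance, a pixel with v = 64 becomes 64/255 ≈ 0.25 instead of being clipped to 0 or 1. Below is a minimal Java sketch of this real-valued encoding, assuming an 8-bit single-band greyscale image and a hypothetical input file char.png:

import java.awt.image.BufferedImage;
import java.awt.image.Raster;
import java.io.File;
import java.io.IOException;
import java.util.Locale;
import javax.imageio.ImageIO;

/* Sketch: emit one libsvm line with real-valued features v/255 instead of 0/1. */
public class GreyscaleToLibsvm {
    public static String toLibsvmLine(int label, BufferedImage img) {
        Raster raster = img.getRaster();
        StringBuilder sb = new StringBuilder().append(label);
        int index = 1;                                   // 1-based libsvm indices
        for (int y = 0; y < img.getHeight(); y++) {
            for (int x = 0; x < img.getWidth(); x++, index++) {
                int v = raster.getSample(x, y, 0);       // 0..255 for 8-bit grey
                if (v > 0) {                             // sparse: skip zero pixels
                    sb.append(' ').append(index).append(':')
                      .append(String.format(Locale.ROOT, "%.4f", v / 255.0));
                }
            }
        }
        return sb.toString();
    }

    public static void main(String[] args) throws IOException {
        BufferedImage img = ImageIO.read(new File("char.png")); // hypothetical file
        System.out.println(toLibsvmLine(7, img));
    }
}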

This concludes the article on how to train libsvm-format image (pixel) data for recognition in Java; hopefully the recommended answer above is helpful.
