How to perform k-fold cross-validation with TensorFlow?

Problem description

I have all my data in a single CSV file, not split into training and test sets, and I want to apply k-fold cross-validation to it.

I have:

data_set = tf.contrib.learn.datasets.base.load_csv(filename="mydata.csv",
                                                   target_dtype=np.int)

How can I perform k-fold cross-validation on this dataset with a multi-layer neural network, as in the IRIS example?

Recommended answer

I know this question is old, but in case someone is looking to do something similar, here is an expansion on ahmedhosny's answer:

The new TensorFlow datasets API (tf.data) can create dataset objects from Python generators, so one option, combined with scikit-learn's KFold, is to create a dataset from the KFold.split() generator:

import numpy as np
import tensorflow as tf
import tensorflow.contrib.eager as tfe  # tf.contrib exists in TensorFlow 1.x only
from sklearn.datasets import load_iris
from sklearn.model_selection import KFold

tf.enable_eager_execution()

# Load the iris data as NumPy arrays (features in X, class labels in y)
data = load_iris()
X = data['data']
y = data['target']

def make_dataset(X_data, y_data, n_splits):
    """Build a dataset that yields one (train, test) split per fold."""

    def gen():
        # KFold.split() yields a pair of index arrays for each fold
        for train_index, test_index in KFold(n_splits).split(X_data):
            X_train, X_test = X_data[train_index], X_data[test_index]
            y_train, y_test = y_data[train_index], y_data[test_index]
            yield X_train, y_train, X_test, y_test

    # All four outputs are declared as float64 (this includes the labels)
    return tf.data.Dataset.from_generator(
        gen, (tf.float64, tf.float64, tf.float64, tf.float64))

dataset = make_dataset(X, y, 10)
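
To make explicit what KFold.split() feeds into the generator above, here is a minimal NumPy-only sketch of unshuffled k-fold index splitting. The function name `kfold_indices` is hypothetical and not part of scikit-learn or TensorFlow. It assumes the usual convention that, when the sample count is not divisible by the number of folds, the first few folds receive one extra sample.

```python
import numpy as np

def kfold_indices(n_samples, n_splits):
    """Yield (train_index, test_index) pairs for unshuffled k-fold splitting.

    Hypothetical helper for illustration only.
    """
    indices = np.arange(n_samples)
    # Base fold size, with the first (n_samples % n_splits) folds one larger
    fold_sizes = np.full(n_splits, n_samples // n_splits, dtype=int)
    fold_sizes[: n_samples % n_splits] += 1
    start = 0
    for size in fold_sizes:
        stop = start + size
        test_index = indices[start:stop]  # one contiguous block is held out
        train_index = np.concatenate([indices[:start], indices[stop:]])
        yield train_index, test_index
        start = stop

# Small demonstration: 10 samples split into 3 folds
folds = list(kfold_indices(10, 3))
for train_index, test_index in folds:
    print("train:", train_index, "test:", test_index)
```

Each sample lands in exactly one test fold, and every fold's training set is everything else. Because the folds are contiguous blocks, data sorted by class (as the iris arrays are) gives test folds dominated by a single class unless the data is shuffled first.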

One can then iterate through the dataset either in graph-based TensorFlow or using eager execution. Using eager execution:

for X_train, y_train, X_test, y_test in tfe.Iterator(dataset):
    ...  # train and evaluate the model on this fold
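
The answer leaves the loop body open. As a rough sketch of what typically goes there, the example below trains a fresh model on each fold's training split, scores it on the held-out split, and averages the per-fold scores. Everything in it is an assumption for illustration: `cross_validate`, `fit_centroids`, and `predict_centroids` are hypothetical names, the data is synthetic, and a nearest-centroid classifier stands in for the multi-layer neural network so that the example needs only NumPy.

```python
import numpy as np

def cross_validate(X, y, n_splits, fit, predict, seed=0):
    """Return one accuracy score per fold (hypothetical helper)."""
    rng = np.random.default_rng(seed)
    order = rng.permutation(len(X))          # shuffle once so folds mix classes
    folds = np.array_split(order, n_splits)  # k nearly equal blocks of indices
    scores = []
    for i, test_index in enumerate(folds):
        train_index = np.concatenate([f for j, f in enumerate(folds) if j != i])
        model = fit(X[train_index], y[train_index])  # fresh model for each fold
        y_pred = predict(model, X[test_index])
        scores.append(float(np.mean(y_pred == y[test_index])))
    return scores

# Stand-in model (NOT a neural network): assign each point to the nearest class centroid
def fit_centroids(X_train, y_train):
    classes = np.unique(y_train)
    centroids = np.stack([X_train[y_train == c].mean(axis=0) for c in classes])
    return classes, centroids

def predict_centroids(model, X_test):
    classes, centroids = model
    distances = np.linalg.norm(X_test[:, None, :] - centroids[None, :, :], axis=2)
    return classes[distances.argmin(axis=1)]

# Synthetic two-class data, stored sorted by class like the iris arrays
rng = np.random.default_rng(42)
X_demo = np.vstack([rng.normal(0.0, 1.0, (50, 2)), rng.normal(5.0, 1.0, (50, 2))])
y_demo = np.repeat([0, 1], 50)

scores = cross_validate(X_demo, y_demo, n_splits=5,
                        fit=fit_centroids, predict=predict_centroids)
print("per-fold accuracy:", scores)
print("mean accuracy:", np.mean(scores))
```

With a neural network, `fit` would build and train a new network on each call, so that no weights carry over from one fold to the next.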
