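I went through this tutorial, but I still can't figure out how to actually use a DataLoader to train an ANN.
When I iterate over my DataLoader, a cmd prompt pops up and immediately closes itself, and nothing happens after that. My raw data are all np.arrays.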
import torch
from torch.utils import data
import numpy as np

class Dataset(data.Dataset):
    'Characterizes a dataset for PyTorch'
    def __init__(self, datax, labels):
        'Initialization'
        self.labels = torch.tensor(labels)
        self.datax = torch.tensor(datax)
        self.len = len(datax)

    def __len__(self):
        'Denotes the total number of samples'
        return self.len

    def __getitem__(self, index):
        'Generates one sample of data'
        # Load data and get label
        X = self.datax[index]
        y = self.labels[index]
        return X, y

params = {'batch_size': 64,
          'shuffle': True,
          'num_workers': 1}
training_set = Dataset(datax=X, labels=labels)
training_generator = data.DataLoader(training_set, **params)
for x in training_generator:
    print(1)
After many attempts I managed to catch a glimpse of the command prompt output:
OMP: Info #212: KMP_AFFINITY: decoding x2APIC ids.
OMP: Info #210: KMP_AFFINITY: Affinity capable, using global cpuid leaf 11 info
OMP: Info #154: KMP_AFFINITY: Initial OS proc set respected: 0
OMP: Info #156: KMP_AFFINITY: 4 available OS procs
OMP: Info #157: KMP_AFFINITY: Uniform topology
OMP: Info #179: KMP_AFFINITY: 1 packages x 2 cores/pkg x 2 threads/core (2 total cores)
OMP: Info #214: KMP_AFFINITY: OS proc to physical thread map:
OMP: Info #171: KMP_AFFINITY: OS proc 0 maps to package 0 core 0 thread 0
OMP: Info #171: KMP_AFFINITY: OS proc 1 maps to package 0 core 0 thread 1
OMP: Info #171: KMP_AFFINITY: OS proc 2 maps to package 0 core 1 thread 0
OMP: Info #171: KMP_AFFINITY: OS proc 3 maps to package 0 core 1 thread 1
OMP: Info #250: KMP_AFFINITY: pid 10264 tid 2388 thread 0 bound to OS proc set 0
OMP: Info #250: KMP_AFFINITY: pid 10264 tid 3288 thread 1 bound to OS proc set 2
Best answer
Here is my approach:
from torch.utils.data import Dataset, DataLoader

class myDataset(Dataset):
    '''
    a dataset for PyTorch
    '''
    def __init__(self, X, y):
        self.X = X
        self.y = y

    def __getitem__(self, index):
        # Return one (sample, label) pair
        return self.X[index], self.y[index]

    def __len__(self):
        return len(self.X)
Then you can simply pass it to a loader:
full_dataset = myDataset(X,y)
train_loader = DataLoader(full_dataset, batch_size=batch_size)
Also, X and y are just NumPy arrays.
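To make the point about NumPy arrays concrete, here is a minimal sketch of what comes out of the loader. The class name `ArrayDataset`, the array shapes, and the random data are made up for illustration. It relies on the DataLoader's default collation, which stacks the NumPy samples of a batch into `torch.Tensor` objects, so no manual conversion is needed in the dataset.

```python
import numpy as np
import torch
from torch.utils.data import Dataset, DataLoader


class ArrayDataset(Dataset):
    """Hypothetical stand-in for the dataset class above, wrapping NumPy arrays."""

    def __init__(self, X, y):
        self.X = X
        self.y = y

    def __getitem__(self, index):
        return self.X[index], self.y[index]

    def __len__(self):
        return len(self.X)


# Made-up data: 100 samples with 8 features each, plus one label per sample.
X = np.random.rand(100, 8).astype(np.float32)
y = np.random.randint(0, 2, size=100)

# num_workers is left at its default of 0, so loading happens in the main process.
loader = DataLoader(ArrayDataset(X, y), batch_size=64)
batches = list(loader)

first_data, first_target = batches[0]
print(type(first_data), first_data.shape, first_data.dtype)
print(type(first_target), first_target.shape)
# The last batch is smaller because 100 is not a multiple of 64.
print(len(batches), batches[-1][0].shape)
```

If the smaller final batch is a problem for your model, `DataLoader` accepts `drop_last=True` to skip it.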
For training, you can access the data with a for loop:
for data, target in train_loader:
    if train_on_gpu:
        data, target = data.double().cuda(), target.double().cuda()
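The loop above stops once the batch is on the GPU, so here is a rough sketch of how the rest of a training step might look. Everything beyond the dataset pattern is an assumption made for illustration: the `ArrayDataset` and `train` names, the toy `nn.Linear` model, the MSE loss, the SGD settings, and the random data. It casts to float32 instead of double so the batches match the model's default parameter dtype. The entry point sits under an `if __name__ == "__main__":` guard, which the PyTorch documentation asks for on Windows when `num_workers` is greater than 0. Whether a missing guard explains the closing cmd window in the question is only a guess.

```python
import numpy as np
import torch
from torch import nn
from torch.utils.data import Dataset, DataLoader


class ArrayDataset(Dataset):
    """Hypothetical dataset wrapping two NumPy arrays, as in the answer above."""

    def __init__(self, X, y):
        self.X = X
        self.y = y

    def __getitem__(self, index):
        return self.X[index], self.y[index]

    def __len__(self):
        return len(self.X)


def train(X, y, num_workers=0, epochs=2, batch_size=64):
    """Train a toy linear model and return the mean loss of each epoch."""
    device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
    loader = DataLoader(ArrayDataset(X, y), batch_size=batch_size,
                        shuffle=True, num_workers=num_workers)
    model = nn.Linear(X.shape[1], 1).to(device)  # placeholder model
    criterion = nn.MSELoss()
    optimizer = torch.optim.SGD(model.parameters(), lr=0.01)

    epoch_losses = []
    for _ in range(epochs):
        running = 0.0
        for data, target in loader:
            # Cast to float32 to match the model's default parameter dtype.
            data = data.float().to(device)
            target = target.float().to(device).unsqueeze(1)
            optimizer.zero_grad()
            loss = criterion(model(data), target)
            loss.backward()
            optimizer.step()
            running += loss.item() * data.size(0)
        epoch_losses.append(running / len(loader.dataset))
    return epoch_losses


if __name__ == "__main__":
    # Made-up regression data, only to make the sketch runnable.
    rng = np.random.default_rng(0)
    X = rng.random((256, 4))
    y = X.sum(axis=1)
    # Starts with in-process loading; raise num_workers once this runs cleanly.
    print(train(X, y, num_workers=0))
```

Starting with `num_workers=0` keeps all loading in the main process, which makes errors show up as ordinary tracebacks and is a convenient first debugging step before enabling worker processes.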