Problem description
I am wondering whether there is a reasonable way to generate client datasets for a federated learning simulation using the TFF core code. The Federated Core tutorial uses the MNIST database, with each client holding only one distinct label in its dataset. In that setup only 10 different labels are available. If I want to have more clients, how can I do that? Thanks in advance.
Recommended answer
If you want to create a dataset from scratch, you can use tff.simulation.FromTensorSlicesClientData to convert tensors into a TFF ClientData object. You only need to pass a dictionary with the client ID as the key and that client's data (here, an ordered dictionary of label and pixel arrays) as the value.
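import collections
import tensorflow_federated as tff

# Assumes x_train, y_train (e.g. the MNIST arrays), split (number of clients)
# and image_per_set (examples per client) are already defined.
client_train_dataset = collections.OrderedDict()
for i in range(1, split + 1):
    client_name = "client_" + str(i)
    start = image_per_set * (i - 1)
    end = image_per_set * i
    print(f"Adding data from {start} to {end} for client : {client_name}")
    # Each client gets a contiguous slice of the training arrays.
    data = collections.OrderedDict((('label', y_train[start:end]), ('pixels', x_train[start:end])))
    client_train_dataset[client_name] = data

train_dataset = tff.simulation.FromTensorSlicesClientData(client_train_dataset)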
You can check my complete implementation here, where I have split MNIST into 4 clients.
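To go beyond 10 clients, the same slicing idea can be generalized to any number of clients. The following is a minimal sketch in plain Python, not part of the original answer: the helper name partition_for_clients, its shuffle_seed parameter, and the toy data are all hypothetical and chosen only for illustration. It builds the client-ID-to-data dictionary without touching TFF, so the partitioning logic can be checked on its own.

```python
import collections
import random


def partition_for_clients(features, labels, num_clients, shuffle_seed=None):
    """Split parallel sequences into `num_clients` near-equal shards.

    Hypothetical helper: returns an OrderedDict mapping a client ID to an
    OrderedDict with 'label' and 'pixels' entries, mirroring the answer.
    """
    if len(features) != len(labels):
        raise ValueError("features and labels must have the same length")
    if not 1 <= num_clients <= len(features):
        raise ValueError("num_clients must be between 1 and the number of examples")

    indices = list(range(len(features)))
    if shuffle_seed is not None:
        # Shuffling mixes labels across clients instead of keeping input order.
        random.Random(shuffle_seed).shuffle(indices)

    # The first `extra` clients receive one additional example each.
    base, extra = divmod(len(indices), num_clients)
    clients = collections.OrderedDict()
    start = 0
    for i in range(num_clients):
        size = base + (1 if i < extra else 0)
        shard = indices[start:start + size]
        clients[f"client_{i + 1}"] = collections.OrderedDict(
            label=[labels[j] for j in shard],
            pixels=[features[j] for j in shard],
        )
        start += size
    return clients


# Toy stand-in for (x_train, y_train): 10 examples, 3 label values.
toy_features = [[k, k + 1] for k in range(10)]
toy_labels = [k % 3 for k in range(10)]
shards = partition_for_clients(toy_features, toy_labels, num_clients=4)
print({name: len(data["label"]) for name, data in shards.items()})
# prints {'client_1': 3, 'client_2': 3, 'client_3': 2, 'client_4': 2}
```

With real MNIST arrays, the per-client lists would presumably need to be converted to NumPy arrays before the dictionary is passed to tff.simulation.FromTensorSlicesClientData. Newer TFF releases appear to have moved this class under tff.simulation.datasets, so the exact name may depend on the installed version.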