Will adding GPU cards automatically scale TensorFlow usage?

Question

Suppose I can train with sample size N, batch size M, and network depth L on my GTX 1070 card with TensorFlow. Now suppose I want to train with a larger sample (2N) and/or a deeper network (2L), and I get an out-of-memory error.

Will plugging in additional GPU cards automatically solve this problem (assuming the total memory of all GPU cards is sufficient to hold the batch and its gradients)? Or is that impossible with pure TensorFlow?

I've read that Bitcoin or Ethereum miners can build mining farms with multiple GPU cards, and that such a farm will mine faster.

Will a mining farm also perform better for deep learning?

Recommended answer

No. You have to change your TensorFlow code to explicitly compute different operations on different devices (e.g., compute the gradients over a single batch on every GPU, then send the computed gradients to a coordinator that accumulates them, averages them, and updates the model parameters).
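The coordinator pattern described above can be illustrated without any GPU at all. The following is a minimal pure-Python sketch, not TensorFlow code: the "devices" are simulated by list shards, the model is a one-parameter linear fit, and every function name here is hypothetical. It only shows the arithmetic: each device computes a gradient on its own shard, and the coordinator averages those gradients before updating the parameter.

```python
# Pure-Python simulation of synchronous data parallelism; no real GPUs involved.
# Model: y_hat = w * x, loss = mean((w * x - y) ** 2). All names are illustrative.

def shard_gradient(w, xs, ys):
    """Gradient of the mean squared error w.r.t. w on one shard (one 'device')."""
    n = len(xs)
    return sum(2.0 * (w * x - y) * x for x, y in zip(xs, ys)) / n

def split(items, num_devices):
    """Split a list into num_devices equal contiguous shards."""
    size = len(items) // num_devices
    return [items[i * size:(i + 1) * size] for i in range(num_devices)]

def coordinator_step(w, xs, ys, num_devices, lr):
    """Each simulated device computes a shard gradient; the coordinator
    averages the gradients and applies a single update to w."""
    x_shards = split(xs, num_devices)
    y_shards = split(ys, num_devices)
    grads = [shard_gradient(w, sx, sy) for sx, sy in zip(x_shards, y_shards)]
    avg_grad = sum(grads) / len(grads)
    return w - lr * avg_grad

xs = [1.0, 2.0, 3.0, 4.0]
ys = [2.0, 4.0, 6.0, 8.0]  # data generated with w = 2

w = 0.0
for _ in range(200):
    w = coordinator_step(w, xs, ys, num_devices=2, lr=0.01)
print(w)  # converges toward 2.0
```

With equal-sized shards, the average of the per-shard gradients equals the gradient over the whole batch, which is why the result matches single-device training on the combined batch.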

Also, TensorFlow is flexible enough to let you specify different operations for each device (or for different remote nodes; the mechanism is the same). You could do data augmentation on a single computational node and let the others process the data without applying this function. You can also execute certain operations only on a specific device or set of devices.

So it is possible with TensorFlow, but you have to change the code you wrote for a single training/inference device.

Blockchains that use PoW (Proof of Work) require solving a difficult problem with a brute-force-like approach: miners compute a large number of hashes with different inputs until they find a valid one.

That means that if a single GPU can guess 1000 hashes/s, two identical GPUs can guess 2 x 1000 hashes/s.

The computations the GPUs perform are completely independent: the data produced by GPU:0 is not used by GPU:1, and there are no synchronization points between the computations. This means that the task one GPU performs can be executed in parallel by another GPU (obviously with different inputs per GPU, so the devices compute hashes to solve different problems given by the network).
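To make the independence concrete, here is a toy brute-force search using Python's standard hashlib. It is a simplified sketch, not how any real blockchain mines: the block data, the "00" difficulty prefix, the nonce ranges, and the function name are all assumptions chosen for illustration. Each simulated worker scans its own nonce range and never needs anything from the other.

```python
import hashlib

def search_range(block_data, start, stop, prefix):
    """Try every nonce in [start, stop) and return the first one whose
    SHA-256 hex digest starts with the given prefix, or None if none does."""
    for nonce in range(start, stop):
        digest = hashlib.sha256(block_data + str(nonce).encode()).hexdigest()
        if digest.startswith(prefix):
            return nonce
    return None

block_data = b"block-data"  # hypothetical payload
prefix = "00"               # toy difficulty: roughly 1 hit per 256 tries

# Two simulated workers with disjoint inputs. Neither call reads the
# other's result, so they could run on separate devices at the same time.
found0 = search_range(block_data, 0, 10_000, prefix)
found1 = search_range(block_data, 10_000, 20_000, prefix)
print(found0, found1)
```

Because the two calls share no state, doubling the number of workers doubles the number of candidates tried per unit of time, which is the linear scaling described above.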

Back to TensorFlow: once you have modified your code to work with multiple GPUs, you can train your network faster (in short, because you are using bigger batches).
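The "bigger batches" point can be expressed as simple arithmetic. The sketch below assumes each GPU keeps the same per-GPU batch size and that all GPUs contribute to one synchronized update per step; the sample count and batch size are made-up numbers, and the calculation ignores communication overhead between devices.

```python
import math

def steps_per_epoch(num_samples, per_gpu_batch, num_gpus):
    """Number of synchronized updates needed to see every sample once."""
    global_batch = per_gpu_batch * num_gpus
    return math.ceil(num_samples / global_batch)

single = steps_per_epoch(10_000, 32, 1)  # one GPU, batch of 32
dual = steps_per_epoch(10_000, 32, 2)    # two GPUs, effective batch of 64
print(single, dual)  # 313 157
```

With two GPUs the effective batch doubles, so an epoch takes about half as many update steps.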

