How does TensorFlow support CUDA streams?

Question

Does TensorFlow automatically use CUDA streams to execute the computation graph concurrently on a single GPU, or do streams have to be assigned to ops/tensors manually?

Recommended answer

For now, TensorFlow uses only one compute stream, plus multiple copy streams. Some kernels may choose to use multiple streams internally for their computation while still maintaining single-stream semantics.

Our experiments showed that enabling multi-stream automatically does not bring much of a performance gain, since most of our kernels are large enough to utilize all the processors on the GPU. Enabling multi-stream would also disable our current design for recycling GPU memory aggressively.
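
To make the memory-recycling point concrete, here is a toy Python model (not TensorFlow's allocator, and not real CUDA). It assumes that each stream runs its kernels strictly in launch order and that different streams run fully concurrently. The names `Kernel`, `schedule`, and `reuse_hazards`, the stream ids, and the durations are all hypothetical and chosen only for illustration. The scenario: kernel A uses a temporary buffer, the host hands that buffer to kernel B as soon as A has been launched, and we check whether A and B can end up touching the buffer at the same time.

```python
from dataclasses import dataclass


@dataclass
class Kernel:
    name: str
    stream: int       # hypothetical stream id
    duration: float   # arbitrary time units
    buffer: str       # GPU buffer the kernel reads or writes


def schedule(kernels):
    """Return {name: (start, end)} for kernels given in launch order.

    Assumption: a stream executes its kernels FIFO, and separate
    streams never wait on each other (no events, no syncs).
    """
    stream_free_at = {}
    times = {}
    for k in kernels:
        start = stream_free_at.get(k.stream, 0.0)
        end = start + k.duration
        stream_free_at[k.stream] = end
        times[k.name] = (start, end)
    return times


def reuse_hazards(kernels):
    """List pairs of kernels that share a buffer and overlap in time."""
    times = schedule(kernels)
    hazards = []
    for i, a in enumerate(kernels):
        for b in kernels[i + 1:]:
            if a.buffer != b.buffer:
                continue
            a_start, a_end = times[a.name]
            b_start, b_end = times[b.name]
            if a_start < b_end and b_start < a_end:
                hazards.append((a.name, b.name))
    return hazards


# Buffer "buf0" is recycled from A to B right after A is launched.
single_stream = [Kernel("A", 0, 2.0, "buf0"), Kernel("B", 0, 1.0, "buf0")]
multi_stream = [Kernel("A", 0, 2.0, "buf0"), Kernel("B", 1, 1.0, "buf0")]

print(reuse_hazards(single_stream))  # []
print(reuse_hazards(multi_stream))   # [('A', 'B')]
```

In the single-stream case B cannot start before A finishes, so the buffer can be recycled the moment A is enqueued. With two streams, the same early reuse would let A and B run on the buffer concurrently, so an allocator would have to wait or synchronize before reusing memory, which is the cost the answer alludes to.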

This is a decision we might revisit in the future. If that happens, TensorFlow will most likely assign ops/kernels to different CUDA streams automatically, without exposing the streams to users.
