Problem description
I'm wondering why a for-loop over the samples of a tf.data.Dataset is so much slower than looping over the corresponding NumPy array.
import numpy as np
import tensorflow as tf
import time

a = np.ones(100000, dtype=np.float32)

# Loop over the NumPy array.
start_time = time.time()
for x in a:
    pass
print(time.time() - start_time)

# Loop over the same data wrapped in a tf.data.Dataset (eager mode).
start_time = time.time()
for x in tf.data.Dataset.from_tensor_slices(a):
    pass
print(time.time() - start_time)
Output (seconds):
0.05548405647277832
5.67711615562439
My TensorFlow version is 2.0.0.
Recommended answer
Yes, I have observed the same behavior. To improve performance, try wrapping the loop over the tf.data.Dataset in a function decorated with @tf.function; in the measurement below it then takes almost the same time as the NumPy loop.
AutoGraph is enabled by default in tf.function and transforms your eager Python code into graph-compatible TensorFlow ops. This includes control flow such as if, for, and while.
tf.function works best with TensorFlow ops; NumPy and Python calls are converted to constants.
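To illustrate what "converted to constants" means in practice, here is a minimal sketch. The names stamped and trace_count are hypothetical and exist only for this illustration. It assumes TensorFlow 2.x and shows that a Python call such as time.time() is evaluated once, while the function is traced, and its value is then reused on later calls with the same input signature.

```python
import time
import tensorflow as tf

trace_count = 0  # hypothetical counter, used only to observe tracing


@tf.function
def stamped(x):
    global trace_count
    trace_count += 1      # Python side effect: runs only while tracing
    stamp = time.time()   # Python call: evaluated once, at trace time
    # The Python float is embedded in the graph as a constant.
    return x, tf.constant(stamp, dtype=tf.float64)


x = tf.constant(1.0)
_, first = stamped(x)
time.sleep(0.01)
_, second = stamped(x)  # same input signature, so the traced graph is reused

print(trace_count)                      # number of times the Python body ran
print(first.numpy() == second.numpy())  # whether both calls saw the same stamp
```

The same mechanism applies to print() and to timers written inside a @tf.function, which is worth keeping in mind when benchmarking.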
Refer to the code below, which wraps the loop in @tf.function:
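@tf.function
def oper(a):
    # Note: time.time() and print() are Python calls, so they run while
    # the function is being traced.
    start_time = time.time()
    for x in tf.data.Dataset.from_tensor_slices(a):
        pass
    print(time.time() - start_time)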
The complete working code comparing NumPy and tf.data.Dataset performance is shown below:
import numpy as np
import tensorflow as tf
import time

a = np.ones(100000, dtype=np.float32)

# Loop over the NumPy array.
start_time = time.time()
for x in a:
    pass
print(time.time() - start_time)

@tf.function
def oper(a):
    # Note: time.time() and print() are Python calls, so they run while
    # the function is being traced.
    start_time = time.time()
    for x in tf.data.Dataset.from_tensor_slices(a):
        pass
    print(time.time() - start_time)

oper(a)
Output:
0.012496232986450195
0.017792224884033203
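One caveat about timing: Python calls placed inside a @tf.function run while the function is traced, so a timer written inside the function may reflect tracing rather than the actual iteration. The following is a minimal sketch of timing the call from outside instead, assuming TensorFlow 2.x. The function name sum_dataset is hypothetical, and the loop accumulates a sum (rather than using pass) so that the loop body clearly contributes to the returned value. No timings are quoted here because they depend on the machine.

```python
import time

import numpy as np
import tensorflow as tf

a = np.ones(100000, dtype=np.float32)
ds = tf.data.Dataset.from_tensor_slices(a)


@tf.function
def sum_dataset(dataset):
    # AutoGraph converts this Python for-loop into graph ops.
    total = tf.constant(0.0)
    for x in dataset:
        total += x
    return total


sum_dataset(ds)  # warm-up call: triggers tracing, excluded from the timing

start = time.perf_counter()
result = sum_dataset(ds)          # reuses the already traced graph
elapsed = time.perf_counter() - start

print(float(result), elapsed)
```

Separating the warm-up call from the timed call keeps the one-off tracing cost out of the measurement.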
To learn more about tf.function, refer to its documentation.