Problem description
I would like to compute how many FLOPs each layer of LeNet-5 (paper) needs. Some papers give total FLOPs for other architectures (1, 2, 3). However, those papers don't explain how to compute the number of FLOPs, and I have no idea how many FLOPs the non-linear activation functions need. For example, how many FLOPs are necessary to calculate tanh(x)?
I guess this is implementation-specific and probably also hardware-specific. However, I am mainly interested in an order of magnitude. Are we talking about 10 FLOPs? 100 FLOPs? 1000 FLOPs? So choose any architecture / implementation you want for your answer. (Although I'd appreciate answers that are close to "common" setups, like an Intel i5 / nvidia GPU / Tensorflow.)
Recommended answer
Note: This answer is not Python-specific, but I don't think that something like tanh is fundamentally different across languages.
Tanh is usually implemented by defining an upper and a lower bound, beyond which 1 and -1 are returned, respectively. The intermediate part is approximated with different functions as follows:
Interval | 0 <= x < x_small | x_small <= x < x_medium | x_medium <= x < x_large | x >= x_large
tanh(x)  | x                | polynomial approx.      | 1-(2/(1+exp(2x)))       | 1
There exist polynomials that are accurate up to single-precision floating point, and also for double precision. This algorithm is called the Cody-Waite algorithm.
Citing this description (you can find more information about the mathematics there as well, e.g. how to determine x_medium), Cody and Waite's rational form requires four multiplications, three additions, and one division in single precision, and seven multiplications, six additions, and one division in double precision.
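To make the single-precision count concrete, here is a minimal sketch of a rational form of the shape tanh(x) ≈ x + x·g·P(g)/Q(g) with g = x². The coefficients P0, P1 and Q0 are written down as I recall them from the single-precision Cody-Waite form, so treat them as an assumption and check them against the cited description before relying on them. The function name `tanh_rational` is made up for this example.

```python
import math

# Assumed single-precision coefficients (recalled, not taken from the text above).
P0 = -0.8237728127
P1 = -0.3831010665e-2
Q0 = 2.471319654


def tanh_rational(x):
    """Approximate tanh(x) for roughly |x| < ln(3)/2 (the x_medium range)."""
    g = x * x                # multiplication 1
    num = (P1 * g + P0) * g  # multiplications 2 and 3, addition 1
    den = g + Q0             # addition 2
    r = num / den            # division 1
    return x + x * r         # multiplication 4, addition 3


if __name__ == "__main__":
    for x in (0.1, 0.3, 0.5):
        print(x, tanh_rational(x), math.tanh(x))
```

Counting the commented operations gives four multiplications, three additions and one division, which matches the single-precision figure quoted above. Because the form is odd in x, it also works for negative inputs without a separate sign flip.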
For negative x, you can compute |x| and flip the sign. So you need comparisons to determine which interval x is in, and then evaluate the corresponding approximation. That's a total of:
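- taking the absolute value of x
- 3 comparisons for the interval
- depending on the interval and the float precision, 0 to a few FLOPs for the exponential; check this question on how to calculate the exponential
- one comparison to decide whether to flip the sign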
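The tally above can be sketched as a small per-branch operation counter. This is only a rough illustration under stated assumptions: the thresholds `X_SMALL`, `X_MEDIUM` and `X_LARGE` are illustrative single-precision values, `exp_cost` is a placeholder guess for the cost of the exponential, and the function name `tanh_op_count` is hypothetical. None of these numbers come from the answer itself.

```python
import math

# Illustrative thresholds (assumptions for this sketch only).
X_SMALL = 2.0 ** -12
X_MEDIUM = math.log(3.0) / 2.0
X_LARGE = 9.0


def tanh_op_count(x, exp_cost=20):
    """Rough operation tally for one single-precision tanh(x) call.

    exp_cost is a placeholder for the cost of exp(); it is a guess.
    """
    ax = abs(x)
    # Overhead on every call: |x|, 3 interval comparisons, 1 sign comparison.
    ops = {"abs": 1, "compare": 4, "mul": 0, "add": 0, "div": 0, "exp": 0}
    if ax < X_SMALL:
        pass  # tanh(x) is approximated by x itself
    elif ax < X_MEDIUM:
        # Rational form: 4 multiplications, 3 additions, 1 division.
        ops.update(mul=4, add=3, div=1)
    elif ax < X_LARGE:
        # 1 - 2 / (1 + exp(2x)): 1 multiplication, 2 add/sub, 1 division, exp.
        ops.update(mul=1, add=2, div=1, exp=exp_cost)
    # Otherwise the result saturates at 1 and no arithmetic is needed.
    ops["total"] = sum(ops.values())
    return ops


if __name__ == "__main__":
    for x in (1e-6, 0.3, 2.0, 50.0):
        print(x, tanh_op_count(x)["total"])
```

Under these assumptions the total stays between a handful and a few tens of operations per call, so the order of magnitude is closer to 10 than to 100 or 1000. Whether the absolute value and the comparisons should count as FLOPs at all is a matter of convention, which is why they are tallied separately.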
Now, this is a report from 1993, but I don't think much has changed here.