Summing an array of unsigned 8-bit integers using the Accelerate framework



Problem description


Can I use the Accelerate framework to sum an array of unsigned 8-bit integers without first converting it to an array of floats?

My current approach is:

vDSP_vfltu8(intArray, 1, floatArray, 1, size);
vDSP_sve(floatArray, 1, &result, size);


But vDSP_vfltu8 is quite slow.

Recommended answer



  1. If it is important to you that vDSP_vfltu8() be fast, please file a bug report. If there's any question, file a bug report. Inadequate performance is a bug, and will be treated as such if you report it. Library writers use this sort of feedback to determine how to prioritize their work; your bug report is the difference between a function being at the front of the queue for optimization and it being #1937 in the queue.

  2. As has been hinted, integer accumulation is complicated by overflow concerns, but if it would be useful to have an optimized function for a specific case provided by the vDSP library, please file a bug report to request such a function (noticing a pattern?). Library writers are not psychic, and do not write functions that are not requested. Be sure to explain how you would use such a function; given this information, they may come up with a slightly different function that is even more useful to you.

  3. If you decide to write some NEON code yourself, you will want to make use of the vaddw_u8() intrinsic.
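
To make points 2 and 3 concrete, here is a minimal sketch of how a hand-written sum might look. It is not part of Accelerate and not the answerer's code; the function name sum_u8 and the blocking scheme are assumptions chosen for illustration. The NEON path uses vaddw_u8() to add 8 bytes at a time into 16-bit lanes. Because 257 * 255 = 65535, a 16-bit lane cannot overflow within 256 such additions, so the sketch flushes the lanes into a 64-bit total at least that often. On non-NEON targets only the scalar loop is compiled.

```c
#include <stddef.h>
#include <stdint.h>

#if defined(__ARM_NEON) || defined(__ARM_NEON__)
#include <arm_neon.h>
#endif

/* Hypothetical helper (not an Accelerate API): sums n bytes into a 64-bit total. */
static uint64_t sum_u8(const uint8_t *data, size_t n)
{
    uint64_t total = 0;
    size_t i = 0;

#if defined(__ARM_NEON) || defined(__ARM_NEON__)
    /* Process the input in blocks of at most 256 vector steps (8 bytes each),
       so no 16-bit lane can exceed 256 * 255 = 65280. */
    while (n - i >= 8) {
        uint16x8_t acc16 = vdupq_n_u16(0);
        size_t steps = (n - i) / 8;
        if (steps > 256) {
            steps = 256;
        }
        for (size_t s = 0; s < steps; ++s, i += 8) {
            /* Widening add: each uint8 lane is added to a uint16 lane. */
            acc16 = vaddw_u8(acc16, vld1_u8(data + i));
        }
        /* Flush: pairwise widen 16 -> 32 -> 64 bits, then add both lanes. */
        uint32x4_t acc32 = vpaddlq_u16(acc16);
        uint64x2_t acc64 = vpaddlq_u32(acc32);
        total += vgetq_lane_u64(acc64, 0) + vgetq_lane_u64(acc64, 1);
    }
#endif

    /* Scalar tail (and the whole array when NEON is unavailable). */
    for (; i < n; ++i) {
        total += data[i];
    }
    return total;
}
```

A call such as sum_u8(intArray, size) would stand in for the vDSP_vfltu8 / vDSP_sve pair from the question, returning an exact integer total instead of a float. Whether it is actually faster than the vDSP route depends on the device and compiler, so it would need to be measured.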


07-31 11:15