Parameterization and "function template partial specialization is not allowed"

Problem description

This is a continuation of "What is the function parameter equivalent of constexpr?". In the original question, we are trying to speed up some code that performs shifts and rotates under Clang and VC++. Clang and VC++ do not optimize the code well because they treat the shift/rotate amount as a variable (i.e., not constexpr).

When I attempt to parameterize the shift amount and the word size, it results in:

$ g++ -std=c++11 -march=native test.cxx -o test.exe
test.cxx:13:10: error: function template partial specialization is not allowed
uint32_t LeftRotate<uint32_t, unsigned int>(uint32_t v)
         ^         ~~~~~~~~~~~~~~~~~~~~~~~~
test.cxx:21:10: error: function template partial specialization is not allowed
uint64_t LeftRotate<uint64_t, unsigned int>(uint64_t v)
         ^         ~~~~~~~~~~~~~~~~~~~~~~~~
2 errors generated.

Here's the test program. It's a tad larger than needed so folks can see we need to handle both uint32_t and uint64_t (not to mention uint8_t, uint16_t and other types).

$ cat test.cxx
#include <iostream>
#include <stdint.h>

template<typename T, unsigned int R>
inline T LeftRotate(unsigned int v)
{
  static const unsigned int THIS_SIZE = sizeof(T)*8;
  static const unsigned int MASK = THIS_SIZE-1;
  return T((v<<R)|(v>>(-R&MASK)));
};

template<uint32_t, unsigned int R>
uint32_t LeftRotate<uint32_t, unsigned int>(uint32_t v)
{
  __asm__ ("roll %1, %0" : "+mq" (v) : "I" ((unsigned char)R));
  return v;
}

#if __x86_64__
template<uint64_t, unsigned int R>
uint64_t LeftRotate<uint64_t, unsigned int>(uint64_t v)
{
  __asm__ ("rolq %1, %0" : "+mq" (v) : "J" ((unsigned char)R));
  return v;
}
#endif

int main(int argc, char* argv[])
{
  std::cout << "Rotated: " << LeftRotate<uint32_t, 2>((uint32_t)argc) << std::endl;
  return 0;
}

I've been through a number of iterations of error messages depending on how I attempt to implement the rotate. Other error messages include "no function template matches function template specialization...". Using template <> seems to produce the most incomprehensible one.

How do I parameterize the shift amount in hopes that Clang and VC++ will optimize the function call as expected?

Solution

Another way is to turn the templated constant into a constant argument which the compiler can optimise away.

Step 1: define the concept of a rotate_distance:

template<unsigned int R> using rotate_distance = std::integral_constant<unsigned int, R>;

Step 2: define the rotate functions in terms of overloads of a function which takes an argument of this type:

template<unsigned int R>
uint32_t LeftRotate(uint32_t v, rotate_distance<R>)

Now, if we wish we can simply call LeftRotate(x, rotate_distance<y>()), which seems to express intent nicely,

or we can now redefine the 2-argument template form in terms of this form:

template<unsigned int Dist, class T>
T LeftRotate(T t)
{
  return LeftRotate(t, rotate_distance<Dist>());
}
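
The two steps and the wrapper can be put together in a portable form. What follows is only a sketch under some assumptions: the inline assembly of the original is replaced by a plain shift expression so that it builds on any target, the functions are marked constexpr so the results can be checked at compile time, and bit_width_of is a hypothetical helper that does not appear in the original. It assumes unsigned integer types and a rotate distance smaller than the bit width.

```cpp
#include <climits>      // CHAR_BIT
#include <cstdint>      // std::uint32_t
#include <type_traits>  // std::integral_constant

// Tag type: the rotate distance R is part of the type, so it is a
// compile-time constant inside every overload that receives it.
template<unsigned int R>
using rotate_distance = std::integral_constant<unsigned int, R>;

// Hypothetical helper (not in the original): bit width of T.
template<typename T>
constexpr unsigned int bit_width_of()
{
    return static_cast<unsigned int>(sizeof(T) * CHAR_BIT);
}

// Generic overload for unsigned integer types; assumes R < bit width of T.
template<typename T, unsigned int R>
constexpr T LeftRotate(T v, rotate_distance<R>)
{
    return static_cast<T>((v << R) |
                          (v >> ((bit_width_of<T>() - R) & (bit_width_of<T>() - 1u))));
}

// More specialized overload for 32-bit words. The original places inline
// assembly here; a plain expression stands in to keep the sketch portable.
template<unsigned int R>
constexpr std::uint32_t LeftRotate(std::uint32_t v, rotate_distance<R>)
{
    return (v << R) | (v >> ((32u - R) & 31u));
}

// Convenience wrapper: LeftRotate<Dist>(x) forwards to the tagged overloads.
template<unsigned int Dist, class T>
constexpr T LeftRotate(T t)
{
    return LeftRotate(t, rotate_distance<Dist>());
}
```

Because the word type is an ordinary function parameter, overload resolution picks the 32-bit version for uint32_t arguments and the generic version otherwise, so no partial specialization is needed.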

Full Demo:

#include <iostream>
#include <stdint.h>
#include <type_traits>  // std::integral_constant

template<unsigned int R> using rotate_distance = std::integral_constant<unsigned int, R>;

// Generic fallback; v must have type T so that T can be deduced from the call.
template<typename T, unsigned int R>
inline T LeftRotate(T v, rotate_distance<R>)
{
  static const unsigned int THIS_SIZE = sizeof(T)*8;
  static const unsigned int MASK = THIS_SIZE-1;
  return T((v<<R)|(v>>(-R&MASK)));
}

template<unsigned int R>
uint32_t LeftRotate(uint32_t v, rotate_distance<R>)
{
  __asm__ ("roll %1, %0" : "+mq" (v) : "I" ((unsigned char)R));
  return v;
}

#if __x86_64__
template<unsigned int R>
uint64_t LeftRotate(uint64_t v, rotate_distance<R>)
{
  __asm__ ("rolq %1, %0" : "+mq" (v) : "J" ((unsigned char)R));
  return v;
}
#endif


template<unsigned int Dist, class T>
T LeftRotate(T t)
{
  return LeftRotate(t, rotate_distance<Dist>());
}

int main(int argc, char* argv[])
{
  std::cout << "Rotated: " << LeftRotate((uint32_t)argc, rotate_distance<2>()) << std::endl;
  std::cout << "Rotated: " << LeftRotate((uint64_t)argc, rotate_distance<2>()) << std::endl;
  std::cout << "Rotated: " << LeftRotate<2>((uint64_t)argc) << std::endl;
  return 0;
}

Pre-C++11 compilers

Prior to C++11 we had neither std::integral_constant nor alias templates, so we have to make our own version.

For our purposes, this is sufficient:

template<unsigned int R> struct rotate_distance {};
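
As a rough illustration of how the home-grown tag slots in, here is a minimal sketch written to avoid C++11 features. It assumes <stdint.h> is available (it is a C99 header, though most older toolchains ship it), keeps only the generic overload, and adds a hypothetical rotate_smoke_test function that is not part of the original answer.

```cpp
#include <assert.h>
#include <limits.h>  // CHAR_BIT
#include <stdint.h>  // uint32_t

// Home-grown tag: an empty struct is enough because only the type matters.
template<unsigned int R> struct rotate_distance {};

// Generic overload for unsigned integer types; assumes R < bit width of T.
template<typename T, unsigned int R>
inline T LeftRotate(T v, rotate_distance<R>)
{
    const unsigned int bits = static_cast<unsigned int>(sizeof(T) * CHAR_BIT);
    return T((v << R) | (v >> ((bits - R) & (bits - 1u))));
}

// Wrapper so callers can write LeftRotate<Dist>(x).
template<unsigned int Dist, class T>
inline T LeftRotate(T t)
{
    return LeftRotate(t, rotate_distance<Dist>());
}

// Hypothetical smoke test (not part of the original answer).
inline void rotate_smoke_test()
{
    assert(LeftRotate(uint32_t(1), rotate_distance<2>()) == 4u);
    assert(LeftRotate<4>(uint32_t(0x12345678u)) == 0x23456781u);
}
```

The call sites look exactly like the C++11 version; only the definition of rotate_distance changes.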

Full proof (note the effect of optimisations):

https://godbolt.org/g/p4tsQ5
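
For completeness, a different workaround that the answer above does not use: class templates, unlike function templates, may be partially specialized, so the function can forward to a static member of a helper struct. The sketch below is an illustration only. LeftRotateImpl is a hypothetical name, the bodies use plain shifts instead of inline assembly, and unsigned types with a distance smaller than the bit width are assumed.

```cpp
#include <climits>  // CHAR_BIT
#include <cstdint>  // std::uint32_t

// Primary template: generic rotate for unsigned integer types.
template<typename T, unsigned int R>
struct LeftRotateImpl
{
    static constexpr T apply(T v)
    {
        return static_cast<T>((v << R) |
                              (v >> ((sizeof(T) * CHAR_BIT - R) & (sizeof(T) * CHAR_BIT - 1))));
    }
};

// Partial specialization: T is fixed to std::uint32_t while R stays a
// parameter. This is legal for class templates.
template<unsigned int R>
struct LeftRotateImpl<std::uint32_t, R>
{
    static constexpr std::uint32_t apply(std::uint32_t v)
    {
        return (v << R) | (v >> ((32u - R) & 31u));
    }
};

// A single function template that forwards to the helper, so the call
// syntax from the question, LeftRotate<uint32_t, 2>(x), keeps working.
template<typename T, unsigned int R>
constexpr T LeftRotate(T v)
{
    return LeftRotateImpl<T, R>::apply(v);
}
```

The trade-off is an extra helper type in exchange for keeping both template arguments at the call site.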

