本文介绍了二进制字符串为十六进制的c ++的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

在改变一个二进制字符串为十六进制,我只能这样做是为了根据离我找到了答案一定规模。但我想改变MASSIVE二进制字符串到他们的完整的十六进制对应在比这更有效的方式是我遇到的唯一方法做它完全:

When changing a Binary string to Hex, I could only do it to a certain size based off of the answers I found. But I want to change MASSIVE Binary strings into their complete Hex counterpart in a more efficient way than this which is the only way I've come across that does it completely:

for(size_t i = 0; i < (binarySubVec.size() - 1); i++){
    string binToHex, tmp = "0000";
    for (size_t j = 0; j < binaryVecStr[i].size(); j += 4){
        tmp = binaryVecStr[i].substr(j, 4);
        if      (!tmp.compare("0000")) binToHex += "0";
        else if (!tmp.compare("0001")) binToHex += "1";
        else if (!tmp.compare("0010")) binToHex += "2";
        else if (!tmp.compare("0011")) binToHex += "3";
        else if (!tmp.compare("0100")) binToHex += "4";
        else if (!tmp.compare("0101")) binToHex += "5";
        else if (!tmp.compare("0110")) binToHex += "6";
        else if (!tmp.compare("0111")) binToHex += "7";
        else if (!tmp.compare("1000")) binToHex += "8";
        else if (!tmp.compare("1001")) binToHex += "9";
        else if (!tmp.compare("1010")) binToHex += "A";
        else if (!tmp.compare("1011")) binToHex += "B";
        else if (!tmp.compare("1100")) binToHex += "C";
        else if (!tmp.compare("1101")) binToHex += "D";
        else if (!tmp.compare("1110")) binToHex += "E";
        else if (!tmp.compare("1111")) binToHex += "F";
        else continue;
    }
    hexOStr << binToHex;
    hexOStr << " ";
}

其彻底的,绝对的,但是进展缓慢。

Its thorough and absolute, but slow.

是否有这样做的一个简单的方法?

Is there a simpler way of doing this?

推荐答案

更新末添加比较和基准

下面是另一个取的基础上,完美的哈希值。使用生成完美的哈希的gperf (如描述如下:<一href=\"http://stackoverflow.com/questions/16141178/is-it-possible-to-map-string-to-int-faster-than-using-hashmap/16141214?s=5|Is可以映射字符串为int比使用HashMap的快?)。

Here's another take, based on a perfect hash. The perfect hash was generated using gperf (like described here: Is it possible to map string to int faster than using hashmap?).

我进一步通过移动功能的本地静态出的方式和标记 hexdigit()散列() constexpr 。这消除任何不必要的初始化开销使编译器全面优化空间/

I've further optimized by moving function local statics out of the way and marking hexdigit() and hash() as constexpr. This removes unnecessary any initialization overhead and gives the compiler full room for optimization/

您的可能的尝试阅读例如1024半位如果可能的话一时间,并给编译器有机会向量化使用AVX / SSE指令集的操作。的(我没有检查生成的code,看是否会发生这种事。)

You could try reading e.g. 1024 nibbles at a time if possible, and give the compiler a chance to vectorize the operations using AVX/SSE instruction sets. (I have not inspected the generated code to see whether this would happen.)

全样本code翻译流模式的std :: CIN 的std :: COUT 是:

The full sample code to translate std::cin to std::cout in streaming mode is:

#include <iostream>

int main()
{
    char buffer[4096];
    while (std::cin.read(buffer, sizeof(buffer)), std::cin.gcount())
    {
        size_t got = std::cin.gcount();
        char* out = buffer;

        for (auto it = buffer; it < buffer+got; it += 4)
            *out++ = Perfect_Hash::hexchar(it);

        std::cout.write(buffer, got/4);
    }
}

这里的 Perfect_Hash 类,稍有删节,并与 hexchar 查找延长。请注意,它验证输入 DEBUG 建立使用断言

Here's the Perfect_Hash class, slightly redacted and extended with the hexchar lookup. Note that it does validate input in DEBUG builds using the assert:

#include <array>
#include <algorithm>
#include <cassert>

class Perfect_Hash {
    /* C++ code produced by gperf version 3.0.4 */
    /* Command-line: gperf -L C++ -7 -C -E -m 100 table  */
    /* Computed positions: -k'1-4' */

    /* maximum key range = 16, duplicates = 0 */
  private:
      static constexpr unsigned char asso_values[] = {
          27, 27, 27, 27, 27, 27, 27, 27, 27, 27, 27, 27, 27, 27, 27, 27, 27, 27, 27, 27, 27, 27, 27, 27, 27, 27, 27,
          27, 27, 27, 27, 27, 27, 27, 27, 27, 27, 27, 27, 27, 27, 27, 27, 27, 27, 27, 27, 27, 15, 7,  3,  1,  0,  27,
          27, 27, 27, 27, 27, 27, 27, 27, 27, 27, 27, 27, 27, 27, 27, 27, 27, 27, 27, 27, 27, 27, 27, 27, 27, 27, 27,
          27, 27, 27, 27, 27, 27, 27, 27, 27, 27, 27, 27, 27, 27, 27, 27, 27, 27, 27, 27, 27, 27, 27, 27, 27, 27, 27,
          27, 27, 27, 27, 27, 27, 27, 27, 27, 27, 27, 27, 27, 27, 27, 27, 27, 27, 27, 27, 27, 27, 27};
      template <typename It>
      static constexpr unsigned int hash(It str)
      {
          return
              asso_values[(unsigned char)str[3] + 2] + asso_values[(unsigned char)str[2] + 1] +
              asso_values[(unsigned char)str[1] + 3] + asso_values[(unsigned char)str[0]];
      }

      static constexpr char hex_lut[] = "???????????fbead9c873625140";
  public:
#ifdef DEBUG
    template <typename It>
    static char hexchar(It binary_nibble)
    {
        assert(Perfect_Hash::validate(binary_nibble)); // for DEBUG only
        return hex_lut[hash(binary_nibble)]; // no validation!
    }
#else
    template <typename It>
    static constexpr char hexchar(It binary_nibble)
    {
        return hex_lut[hash(binary_nibble)]; // no validation!
    }
#endif
    template <typename It>
    static bool validate(It str)
    {
        static constexpr std::array<char, 4> vocab[] = {
            {{'?', '?', '?', '?'}}, {{'?', '?', '?', '?'}}, {{'?', '?', '?', '?'}},
            {{'?', '?', '?', '?'}}, {{'?', '?', '?', '?'}}, {{'?', '?', '?', '?'}},
            {{'?', '?', '?', '?'}}, {{'?', '?', '?', '?'}}, {{'?', '?', '?', '?'}},
            {{'?', '?', '?', '?'}}, {{'?', '?', '?', '?'}},
            {{'1', '1', '1', '1'}}, {{'1', '0', '1', '1'}},
            {{'1', '1', '1', '0'}}, {{'1', '0', '1', '0'}},
            {{'1', '1', '0', '1'}}, {{'1', '0', '0', '1'}},
            {{'1', '1', '0', '0'}}, {{'1', '0', '0', '0'}},
            {{'0', '1', '1', '1'}}, {{'0', '0', '1', '1'}},
            {{'0', '1', '1', '0'}}, {{'0', '0', '1', '0'}},
            {{'0', '1', '0', '1'}}, {{'0', '0', '0', '1'}},
            {{'0', '1', '0', '0'}}, {{'0', '0', '0', '0'}},
        };
        int key = hash(str);

        if (key <= 26 && key >= 0)
            return std::equal(str, str+4, vocab[key].begin());
        else
            return false;
    }
};

constexpr unsigned char Perfect_Hash::asso_values[];
constexpr char Perfect_Hash::hex_lut[];

#include <iostream>

int main()
{
    char buffer[4096];
    while (std::cin.read(buffer, sizeof(buffer)), std::cin.gcount())
    {
        size_t got = std::cin.gcount();
        char* out = buffer;

        for (auto it = buffer; it < buffer+got; it += 4)
            *out++ = Perfect_Hash::hexchar(it);

        std::cout.write(buffer, got/4);
    }
}

例如用于演示输出 OD -A -t没有O /开发/ urandom的| TR-CD'01'| DD BS = 1数= 4096 | ./test

03bef5fb79c7da917e3ebffdd8c41488d2b841dac86572cf7672d22f1f727627a2c4a48b15ef27eb0854dd99756b24c678e3b50022d695cc5f5c8aefaced2a39241bfd5deedcfa0a89060598c6b056d934719eba9ccf29e430d2def5751640ff17860dcb287df8a94089ade0283ee3d76b9fefcce3f3006b8c71399119423e780cef81e9752657e97c7629a9644be1e7c96b5d0324ab16d20902b55bb142c0451e675973489ae4891ec170663823f9c1c9b2a11fcb1c39452aff76120b21421069af337d14e89e48ee802b1cecd8d0886a9a0e90dea5437198d8d0d7ef59c46f9a069a83835286a9a8292d2d7adb4e7fb0ef42ad4734467063d181745aaa6694215af7430f95e854b7cad813efbbae0d2eb099523f215cff6d9c45e3edcaf63f78a485af8f2bfc2e27d46d61561b155d619450623b7aa8ca085c6eedfcc19209066033180d8ce1715e8ec9086a7c28df6e4202ee29705802f0c2872fbf06323366cf64ecfc5ea6f15ba6467730a8856a1c9ebf8cc188e889e783c50b85824803ed7d7505152b891cb2ac2d6f4d1329e100a2e3b2bdd50809b48f0024af1b5092b35779c863cd9c6b0b8e278f5bec966dd0e5c4756064cca010130acf24071d02de39ef8ba8bd1b6e9681066be3804d36ca83e7032274e4c8e8cacf520e8078f8fa80eb8e70af40367f53e53a7d7f7afe8704c46f58339d660b8151c91bddf82b4096

我想出了三种不同的方法:

BENCHMARKS

I came up with three different approaches:


  1. ;现场拆装

  2. ; 引擎收录
  3. rel=\"nofollow\">
  4. 和的基础;现场拆装

  1. naive.cpp (no hacks, no libraries); Live disassembly on Godbolt
  2. spirit.cpp (Trie); disassembly on pastebin
  3. and this answer: perfect.cpp hash based; Live disassembly on Godbolt

为了做一些比较,我有


  • 与相同的编译器(GCC 4.9)和标志编译所有这些( -O3 -march =本地-g0 -DNDEBUG

  • 优化的输入/输出,因此它不会被4字符/读写单个字符

  • 创建一个大的输入文件(1千兆字节)

下面是结果:


  • 出人意料的是,从第一个答案幼稚办法做的比较好

  • 精神确实实在太差了这里;它网3.4MB /秒,使整个文件将需要294秒(!!!)。我们已经把它关闭的图表

  • 的平均吞吐量为720MB〜/ s的 naive.cpp 并〜1.14GB / s的 perfect.cpp

  • 这使得完美的哈希方法不是天真的方法快50%左右。

  • Surprisingly, the naive approach from the first answer does rather well
  • Spirit does really badly here; it nets 3.4MB/s so that the whole file would take at 294 seconds (!!!). We've left it off the charts
  • The average throughputs are ~720MB/s for naive.cpp and ~1.14GB/s forperfect.cpp
  • This makes the perfect hash approach roughly 50% faster than the naive approach.

*摘要我说天真的方法是很多不错的突发奇想7小时前。如果你真的想要高吞吐量,完美哈希是一个很好的开端,但考虑手卷基于SIMD的解决方案

这篇关于二进制字符串为十六进制的c ++的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持!

09-05 09:29