Representing a long in the fewest characters

Question

I need to represent both very large and very small numbers in the shortest string possible. The numbers are unsigned. I have tried a straight Base64 encode, but for some smaller numbers the encoded string is longer than the number simply written out as a decimal string. What is the best way to store a very large or very small number in the shortest possible string while keeping it URL safe?

Recommended answer

Base64 encoding of binary byte data makes it longer, by about a third. It is not meant to make data shorter, but to allow binary data to be transported safely in formats that are not binary safe.
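As a rough illustration of why small values come out longer, the sketch below assumes the number is stored as its 8 raw bytes and then passed to the JDK's URL-safe java.util.Base64 encoder. The class and method names are made up for this example, and the question does not say exactly how the original Base64 attempt was done.

```java
import java.nio.ByteBuffer;
import java.util.Base64;

class Base64LengthDemo {
    // Assumption: the long is serialized as 8 big-endian bytes, then encoded
    // with the JDK's URL-safe Base64 alphabet, without '=' padding.
    static String base64OfRawBytes(long x) {
        byte[] bytes = ByteBuffer.allocate(Long.BYTES).putLong(x).array();
        return Base64.getUrlEncoder().withoutPadding().encodeToString(bytes);
    }

    public static void main(String[] args) {
        long small = 42L;
        // The decimal form needs 2 characters.
        System.out.println(Long.toString(small));
        // The Base64 form always needs 11 characters, because all 8 bytes
        // are encoded, including the leading zero bytes.
        System.out.println(base64OfRawBytes(small));
    }
}
```

Under these assumptions every long costs 11 characters regardless of its magnitude, which is why a small number ends up longer than its decimal form.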

However, base 64 is a more compact representation of a number (or of byte data) than decimal, even though it is less compact than base 256 (the raw byte data). Encoding your numbers directly in base 64, digit by digit, makes them shorter than decimal. This will do it:

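private static final String base64Chars =
    "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789-_";

// Encodes x, treated as an unsigned 64-bit value, as 1 to 11 base-64 digits.
static String encodeNumber(long x) {
    char[] buf = new char[11];  // 11 digits * 6 bits cover all 64 bits
    int p = buf.length;
    do {
        // Mask and unsigned shift rather than % and /: for unsigned values
        // above Long.MAX_VALUE the signed long is negative, and x % 64 would
        // give a negative index.
        buf[--p] = base64Chars.charAt((int) (x & 63));
        x >>>= 6;
    } while (x != 0);
    return new String(buf, p, buf.length - p);
}

// Decodes a string produced by encodeNumber. Overflow is not checked, so
// strings that encode more than 64 bits silently wrap around.
static long decodeNumber(String s) {
    long x = 0;
    for (char c : s.toCharArray()) {
        int charValue = base64Chars.indexOf(c);
        if (charValue == -1) throw new NumberFormatException(s);
        x *= 64;
        x += charValue;
    }
    return x;
}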

With this encoding scheme, Long.MAX_VALUE becomes the string H__________, which is 11 characters long, compared with 19 characters for its decimal representation 9223372036854775807. Numbers below 64^4 = 16,777,216 (about 16 million) fit in just 4 characters. That's about as short as you'll get. (Technically, there are two other characters that do not need to be encoded in URLs: . and ~. You can add those to get base 66, which would be a smidgen shorter for some numbers, although that seems a bit pedantic.)
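For readers curious about the base 66 variant mentioned in passing, here is a minimal sketch of how it might look. The class name Base66 and its method names are hypothetical, and the use of Long.divideUnsigned and Long.remainderUnsigned (Java 8 and later) to treat the long as unsigned is an assumption of this sketch rather than part of the answer.

```java
class Base66 {
    // The answer's 64 characters plus '.' and '~', the two remaining
    // characters that need no escaping in a URL.
    static final String ALPHABET =
        "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789-_.~";

    // Encodes x, treated as an unsigned 64-bit value, in base 66.
    static String encode(long x) {
        char[] buf = new char[11];  // 66^11 > 2^64, so 11 digits always suffice
        int p = buf.length;
        do {
            // 66 is not a power of two, so unsigned division replaces bit masking.
            buf[--p] = ALPHABET.charAt((int) Long.remainderUnsigned(x, 66));
            x = Long.divideUnsigned(x, 66);
        } while (x != 0);
        return new String(buf, p, buf.length - p);
    }

    // Decodes a string produced by encode. Overflow is not checked.
    static long decode(String s) {
        long x = 0;
        for (char c : s.toCharArray()) {
            int v = ALPHABET.indexOf(c);
            if (v == -1) throw new NumberFormatException(s);
            x = x * 66 + v;
        }
        return x;
    }
}
```

The gain is small: 4 characters now cover values below 66^4 = 18,974,736 instead of 16,777,216, and the largest 64-bit value still needs 11 characters, which supports the answer's view that the extra two characters are rarely worth it.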
