Shortest possible encoded string with decode possibility (shorten URL) using only PHP

Question

I'm looking for a method that encodes a string to the shortest possible length and keeps it decodable (pure PHP, no SQL). I have a working script, but I'm unsatisfied with the length of the encoded string.

Scenario:

Link to an image (depends on the file resolution I want to show to the user):


  • www.mysite.com/share/index.php?img=/dir/dir/hi-res-img.jpg&w=700&h=500

Encoded link (so the user can't guess how to get the larger image):


  • www.mysite.com/share/encodedQUERYstring

So, basically, I'd like to encode only the search query part of the URL:


  • img=/dir/dir/hi-res-img.jpg&w=700&h=500

The method I use right now will encode the above query string to:


  • y8xNt9VPySwC44xM3aLUYt3M3HS9rIJ0tXJbcwMDtQxbUwMDAA

This is what I use:

 $raw_query_string = 'img=/dir/dir/hi-res-img.jpg&w=700&h=500';

 $encoded_query_string = base64_encode(gzdeflate($raw_query_string));
 $decoded_query_string = gzinflate(base64_decode($encoded_query_string));

How do I shorten the encoded result and still have the possibility to decode it using only PHP?

Recommended answer

I suspect that you will need to think more about your method of hashing if you don't want it to be decodable by the user. The issue with base64 is that a base64 string looks like a base64 string. There's a good chance that someone savvy enough to be looking at your page source will recognise it too.

Part one:

If you're flexible on your URL vocab/characters, this will be a good starting place. Since gzip makes most of its gains from back-references to repeated data, there is little point in using it on a string this short.

Consider your example - you've only saved 2 bytes in the compression, which are lost again in base64 padding:

Non-gzipped: string(52) "aW1nPS9kaXIvZGlyL2hpLXJlcy1pbWcuanBnJnc9NzAwJmg9NTAw"

Gzipped: string(52) "y8xNt9VPySwC44xM3aLUYt3M3HS9rIJ0tXJbcwMDtQxbUwMDAA=="
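
The comparison above can be reproduced with a few lines. This is only a minimal sketch: the variable names are illustrative, and the exact deflate output may vary with the zlib build, so the compressed length should be read as indicative rather than guaranteed.

```php
<?php
// Compare the Base64 length of the raw query string with the Base64
// length of its deflated form.
$raw = 'img=/dir/dir/hi-res-img.jpg&w=700&h=500';

$plain    = base64_encode($raw);
$deflated = base64_encode(gzdeflate($raw));

printf("raw bytes:      %d\n", strlen($raw));
printf("base64 only:    %d chars\n", strlen($plain));
printf("deflate+base64: %d chars\n", strlen($deflated));

// The round trip still has to give back the original string.
$restored = gzinflate(base64_decode($deflated));
var_dump($restored === $raw);
```

Base64 emits 4 characters per 3 input bytes and pads the last group, so saving a byte or two before encoding often changes nothing in the final length.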

If you reduce your vocab size, this will naturally allow you better compression. Let's say we remove some redundant information.

Take a look at the following functions:

function compress($input, $ascii_offset = 38){
    $input = strtoupper($input);
    $output = '';
    //We can try for a 4:3 (8:6) compression (roughly), 24 bits for 4 chars
    foreach(str_split($input, 4) as $chunk) {
        $chunk = str_pad($chunk, 4, '=');

        $int_24 = 0;
        for($i=0; $i<4; $i++){
            //Shift the output to the left 6 bits
            $int_24 <<= 6;

            //Add the next 6 bits
            //Discard the leading ascii chars, i.e make
            $int_24 |= (ord($chunk[$i]) - $ascii_offset) & 0b111111;
        }

        //Here we take the 4 sets of 6 apart in 3 sets of 8
        for($i=0; $i<3; $i++) {
            $output = pack('C', $int_24) . $output;
            $int_24 >>= 8;
        }
    }

    return $output;
}

And

function decompress($input, $ascii_offset = 38) {

    $output = '';
    foreach(str_split($input, 3) as $chunk) {

        //Reassemble the 24 bit ints from 3 bytes
        $int_24 = 0;
        foreach(unpack('C*', $chunk) as $char) {
            $int_24 <<= 8;
            $int_24 |= $char & 0b11111111;
        }

        //Expand the 24 bits to 4 sets of 6, and take their character values
        for($i = 0; $i < 4; $i++) {
            $output = chr($ascii_offset + ($int_24 & 0b111111)) . $output;
            $int_24 >>= 6;
        }
    }

    //Make lowercase again and trim off the padding.
    return strtolower(rtrim($output, '='));
}

What's going on there is basically a removal of redundant information, followed by the compression of 4 bytes into 3. This is achieved by effectively having a 6-bit subset of the ascii table. This window is moved so that the offset starts at useful characters and includes all the characters you're currently using.
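
Before packing, it can help to confirm that every character of the input actually falls inside that 6-bit window. The helper below is a hypothetical sketch (the name `chars_outside_window` is not part of the answer); it assumes the same upper-casing and the same default offset of 38 as `compress()`.

```php
<?php
// Return the characters of $input that fall outside the 64-character
// window [$ascii_offset, $ascii_offset + 63] after upper-casing.
// The result maps each offending character to its ASCII code.
function chars_outside_window(string $input, int $ascii_offset = 38): array
{
    $outside = [];
    if ($input === '') {
        return $outside;
    }
    foreach (str_split(strtoupper($input)) as $char) {
        $code = ord($char);
        if ($code < $ascii_offset || $code > $ascii_offset + 63) {
            $outside[$char] = $code;
        }
    }
    return $outside;
}

// Every character of the example query string fits the window.
var_dump(chars_outside_window('img=/dir/dir/hi-res-img.jpg&w=700&h=500'));

// A space (ASCII 32) sits below the offset and would not survive packing.
var_dump(chars_outside_window('img=/my dir/a_b.jpg'));
```

Because the functions upper-case the input and lower-case the result, letter case is not preserved, and a trailing `=` in the input would be trimmed along with the padding.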

With the offset I've used, you can use anything from ASCII 38 to 101 (a 6-bit window holds 64 values). This gives you a resulting string of 30 bytes, that's a 9-byte (about 23%) compression! Unfortunately, you'll need to make it URL-safe (probably with base64), which brings it back up to 40 bytes.
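
One common way to make the packed bytes URL-safe is the "base64url" variant: swap `+` and `/` for `-` and `_`, and drop the `=` padding. The helpers below are a sketch under that assumption; `base64url_encode` and `base64url_decode` are hypothetical names, not PHP built-ins, and `random_bytes(30)` merely stands in for the 30-byte output of `compress()`.

```php
<?php
// URL-safe Base64: swap '+' and '/' for '-' and '_', and drop '=' padding.
function base64url_encode(string $bytes): string
{
    return rtrim(strtr(base64_encode($bytes), '+/', '-_'), '=');
}

// Reverse of base64url_encode(); returns null if the text is not valid.
function base64url_decode(string $text): ?string
{
    $b64 = strtr($text, '-_', '+/');
    // Restore the padding that base64url_encode() removed.
    $b64 .= str_repeat('=', (4 - strlen($b64) % 4) % 4);
    $bytes = base64_decode($b64, true);
    return $bytes === false ? null : $bytes;
}

$packed = random_bytes(30);   // stand-in for the 30-byte compress() output
$token  = base64url_encode($packed);

echo strlen($token), "\n";    // 40
var_dump(base64url_decode($token) === $packed);
```

Since 30 is a multiple of 3, the token needs no padding at all and comes out at exactly 40 characters.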

I think at this point, you're pretty safe to assume that you've reached the "security through obscurity" level required to stop 99.9% of people. Let's continue, though, to the second part of your question.

It's arguable that this is already solved with the above, but what you need to do is pass this through a secret on the server, preferably with PHP's OpenSSL extension. The following code shows the complete usage flow of the functions above and the encryption:

$method = 'AES-256-CBC';
$secret = base64_decode('tvFD4Vl6Pu2CmqdKYOhIkEQ8ZO4XA4D8CLowBpLSCvA=');
$iv = base64_decode('AVoIW0Zs2YY2zFm5fazLfg==');

$input = 'img=/dir/dir/hi-res-img.jpg&w=700&h=500';
var_dump($input);

$compressed = compress($input);
var_dump($compressed);

$encrypted = openssl_encrypt($compressed, $method, $secret, 0, $iv);
var_dump($encrypted);

$decrypted = openssl_decrypt($encrypted, $method, $secret, 0, $iv);
var_dump($decrypted);

// Decompress the decrypted bytes so the whole round trip is exercised.
$decompressed = decompress($decrypted);
var_dump($decompressed);

The output of this script is as follows:

string(39) "img=/dir/dir/hi-res-img.jpg&w=700&h=500"
string(30) "<��(��tJ��@�xH��G&(�%��%��xW"
string(44) "xozYGselci9i70cTdmpvWkrYvGN9AmA7djc5eOcFoAM="
string(30) "<��(��tJ��@�xH��G&(�%��%��xW"
string(39) "img=/dir/dir/hi-res-img.jpg&w=700&h=500"

You'll see the whole cycle: compression > encryption > base64 encode/decode > decryption > decompression. The result is about as short as you can realistically get.
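
Once the cycle has produced the original query string again, the script still has to turn it back into parameters. The sketch below shows one way to do that with `parse_str`; the function name `read_image_params` and the validation rules are assumptions made for illustration, not part of the answer.

```php
<?php
// Turn a decoded query string back into validated image parameters.
// Returns null when a field is missing or the sizes are not plain integers.
function read_image_params(string $query): ?array
{
    parse_str($query, $params);

    foreach (['img', 'w', 'h'] as $key) {
        if (!isset($params[$key]) || !is_string($params[$key])) {
            return null;
        }
    }
    if (!ctype_digit($params['w']) || !ctype_digit($params['h'])) {
        return null;
    }

    return [
        'img' => $params['img'],
        'w'   => (int) $params['w'],
        'h'   => (int) $params['h'],
    ];
}

var_dump(read_image_params('img=/dir/dir/hi-res-img.jpg&w=700&h=500'));
var_dump(read_image_params('img=/a.jpg&w=big&h=500')); // NULL
```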

Everything aside, I feel obliged to conclude this with the fact that it is theoretical only, and this was a nice challenge to think about. There are definitely better ways to achieve your desired result - I'll be the first to admit that my solution is a little bit absurd!
