Problem description
I'm looking for a method that encodes a string to the shortest possible length while keeping it decodable (pure PHP, no SQL). I have a working script, but I'm unsatisfied with the length of the encoded string.
Scenario:
Link to an image (depends on the file resolution I want to show to the user):
- www.mysite.com/share/index.php?img=/dir/dir/hi-res-img.jpg&w=700&h=500
Encoded link (so the user can't guess how to get the larger image):
- www.mysite.com/share/encodedQUERYstring
So, basically I'd like to encode only the query string part of the URL:
- img=/dir/dir/hi-res-img.jpg&w=700&h=500
The method I use right now will encode the above query string to:
- y8xNt9VPySwC44xM3aLUYt3M3HS9rIJ0tXJbcwMDtQxbUwMDAA
What I'm using:
$raw_query_string = 'img=/dir/dir/hi-res-img.jpg&w=700&h=500';
$encoded_query_string = base64_encode(gzdeflate($raw_query_string));
$decoded_query_string = gzinflate(base64_decode($encoded_query_string));
How do I shorten the encoded result and still have the possibility to decode it using only PHP?
Recommended answer
I suspect that you will need to think more about your method of hashing if you don't want it to be decodable by the user. The issue with base64 is that a base64 string looks like a base64 string. There's a good chance that someone savvy enough to be looking at your page source will recognise it too.
Part one:
If you're flexible on your URL vocab/characters, this will be a good starting place. Since gzip makes a lot of its gains using back references, there is little point in using it on a string this short.
Consider your example - you've only saved 2 bytes in the compression, which are lost again in base64 padding:
Non-gzipped: string(52) "aW1nPS9kaXIvZGlyL2hpLXJlcy1pbWcuanBnJnc9NzAwJmg9NTAw"
Gzipped: string(52) "y8xNt9VPySwC44xM3aLUYt3M3HS9rIJ0tXJbcwMDtQxbUwMDAA=="
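The comparison above can be reproduced with a few lines. This is a minimal sketch that assumes the zlib extension is available; the exact deflated length can vary with the zlib version and compression level, so it is printed rather than assumed.

```php
<?php
$raw = 'img=/dir/dir/hi-res-img.jpg&w=700&h=500';

// Plain base64 versus deflate followed by base64.
$plain    = base64_encode($raw);
$deflated = base64_encode(gzdeflate($raw));

// Print the three lengths side by side for comparison.
printf(
    "raw: %d, base64: %d, deflate+base64: %d\n",
    strlen($raw),
    strlen($plain),
    strlen($deflated)
);
```

Because base64 emits 4 characters for every 3 input bytes and pads to a multiple of 4, saving one or two bytes before encoding often makes no difference to the final length.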
If you reduce your vocab size, this will naturally allow you better compression. Let's say we remove some redundant information.
Take a look at the following functions:
function compress($input, $ascii_offset = 38){
    $input = strtoupper($input);
    $output = '';
    //We can try for a 4:3 (8:6) compression (roughly), 24 bits for 4 chars
    foreach(str_split($input, 4) as $chunk) {
        $chunk = str_pad($chunk, 4, '=');
        $int_24 = 0;
        for($i=0; $i<4; $i++){
            //Shift the output to the left 6 bits
            $int_24 <<= 6;
            //Add the next 6 bits
            //Discard the leading ASCII chars, i.e. rebase so that $ascii_offset maps to 0
            $int_24 |= (ord($chunk[$i]) - $ascii_offset) & 0b111111;
        }
        //Here we take the 4 sets of 6 apart in 3 sets of 8
        for($i=0; $i<3; $i++) {
            //Prepend the lowest byte; decompress() relies on this ordering
            $output = pack('C', $int_24 & 0xFF) . $output;
            $int_24 >>= 8;
        }
    }
    return $output;
}
And
function decompress($input, $ascii_offset = 38) {
    $output = '';
    foreach(str_split($input, 3) as $chunk) {
        //Reassemble the 24 bit ints from 3 bytes
        $int_24 = 0;
        foreach(unpack('C*', $chunk) as $char) {
            $int_24 <<= 8;
            $int_24 |= $char & 0b11111111;
        }
        //Expand the 24 bits to 4 sets of 6, and take their character values
        for($i = 0; $i < 4; $i++) {
            $output = chr($ascii_offset + ($int_24 & 0b111111)) . $output;
            $int_24 >>= 6;
        }
    }
    //Make lowercase again and trim off the padding.
    return strtolower(rtrim($output, '='));
}
What's going on there is basically a removal of redundant information, followed by the compression of 4 bytes into 3. This is achieved by effectively having a 6-bit subset of the ascii table. This window is moved so that the offset starts at useful characters and includes all the characters you're currently using.
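Since a 6-bit code can only hold 64 values, any character outside the window is silently mangled by the masking step. A small guard like the one below could be run before compressing. This is only a sketch: `fits_window` is a hypothetical helper name, not part of the answer's code, and it assumes the same default offset of 38 and the same uppercasing step.

```php
<?php
// Hypothetical helper: checks that every character of $input, once
// uppercased, falls inside the 64-value window starting at $ascii_offset.
function fits_window(string $input, int $ascii_offset = 38): bool
{
    if ($input === '') {
        return true; // nothing to check
    }
    foreach (str_split(strtoupper($input)) as $char) {
        $code = ord($char) - $ascii_offset;
        if ($code < 0 || $code > 63) {
            return false; // would not survive the 6-bit mask
        }
    }
    return true;
}

var_dump(fits_window('img=/dir/dir/hi-res-img.jpg&w=700&h=500')); // bool(true)
var_dump(fits_window('img=/dir/my file.jpg')); // bool(false), space is ASCII 32
```

Note that the scheme is also lossy with respect to letter case, and a genuine trailing `=` in the input would be stripped as padding on the way back.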
With the offset I've used, you can use anything from ASCII 38 to 101 (64 values). This gives you a resulting string of 30 bytes, which is a 9-byte (roughly 23%) compression! Unfortunately, you'll need to make it URL-safe (probably with base64), which brings it back up to 40 bytes.
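One common way to make binary output URL-safe is the "base64url" variant, which swaps `+` and `/` for `-` and `_` and drops the `=` padding. The sketch below assumes that variant; the helper names `base64url_encode` and `base64url_decode` are hypothetical and not built into PHP, and `random_bytes(30)` merely stands in for a 30-byte compressed string.

```php
<?php
// Hypothetical helpers, not part of the original answer.
function base64url_encode(string $bytes): string
{
    // Swap the two URL-unsafe characters and drop the padding.
    return rtrim(strtr(base64_encode($bytes), '+/', '-_'), '=');
}

function base64url_decode(string $text): string
{
    // Restore the padding that was trimmed during encoding.
    $padded = str_pad($text, strlen($text) + (4 - strlen($text) % 4) % 4, '=');
    return base64_decode(strtr($padded, '-_', '+/'));
}

$bytes = random_bytes(30); // stand-in for the 30-byte compressed string
$token = base64url_encode($bytes);

var_dump(strlen($token));                      // int(40)
var_dump(base64url_decode($token) === $bytes); // bool(true)
```

Thirty bytes is a multiple of three, so no padding is needed here and the token is exactly 40 characters.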
I think at this point, you're pretty safe to assume that you've reached the "security through obscurity" level required to stop 99.9% of people. Let's continue though, to the second part of your question.
It's arguable that this is already solved with the above, but what you need to do is pass this through a secret on the server, preferably with PHP's OpenSSL extension. The following code shows the complete usage flow of the functions above and the encryption:
$method = 'AES-256-CBC';
$secret = base64_decode('tvFD4Vl6Pu2CmqdKYOhIkEQ8ZO4XA4D8CLowBpLSCvA=');
$iv = base64_decode('AVoIW0Zs2YY2zFm5fazLfg==');
$input = 'img=/dir/dir/hi-res-img.jpg&w=700&h=500';
var_dump($input);
$compressed = compress($input);
var_dump($compressed);
// With $options = 0, openssl_encrypt() returns base64-encoded ciphertext
$encrypted = openssl_encrypt($compressed, $method, $secret, 0, $iv);
var_dump($encrypted);
$decrypted = openssl_decrypt($encrypted, $method, $secret, 0, $iv);
var_dump($decrypted);
// Decompress the decrypted bytes (not $compressed) to complete the round trip
$decompressed = decompress($decrypted);
var_dump($decompressed);
The output of this script is as follows:
string(39) "img=/dir/dir/hi-res-img.jpg&w=700&h=500"
string(30) "<��(��tJ��@�xH��G&(�%��%��xW"
string(44) "xozYGselci9i70cTdmpvWkrYvGN9AmA7djc5eOcFoAM="
string(30) "<��(��tJ��@�xH��G&(�%��%��xW"
string(39) "img=/dir/dir/hi-res-img.jpg&w=700&h=500"
You'll see the whole cycle: compression > encryption > base64 encode/decode > decryption > decompression. The resulting string is about as short as you can realistically get.
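The `$secret` and `$iv` values in the script are base64-encoded constants. As a rough sketch, values of the right size for AES-256-CBC could be generated once and then stored in server-side configuration, assuming PHP 7 or later with the OpenSSL extension loaded:

```php
<?php
$method = 'AES-256-CBC';

// AES-256 takes a 32-byte key; the IV length is looked up for the cipher.
$secret = random_bytes(32);
$iv     = random_bytes(openssl_cipher_iv_length($method));

// Print them base64-encoded so they can be pasted into configuration.
echo base64_encode($secret), PHP_EOL;
echo base64_encode($iv), PHP_EOL;
```

Reusing a single fixed IV keeps the URL short because nothing extra has to be transmitted, but it also means identical inputs always produce identical ciphertext.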
Everything aside, I feel obliged to conclude this with the fact that it is theoretical only, and this was a nice challenge to think about. There are definitely better ways to achieve your desired result - I'll be the first to admit that my solution is a little bit absurd!