Problem description
Support for gzip in JavaScript is surprisingly weak. All browsers implement it in order to support the Content-Encoding: gzip header, but there is no standard way for scripts to access the browser's gzip / gunzip functions. So one must use a JavaScript-only approach. There are some old gzip-js libraries around, but they don't seem to be stream-enabled and have been out of maintenance for 6 years.
Then there is pako, which is more actively maintained, but it also doesn't seem to be stream-enabled if you use its own distribution, so you need to hold both the entire binary array and the gzip output in memory. I might be wrong, but that is what I am gathering.
JSZip is a well-designed tool and supports streaming through its "Workers". JSZip uses pako. ZIP entries are DEFLATEd and carry a CRC32 checksum just like gzip, only organized slightly differently of course. Just from reading the JSZip sources, it looks like it could be easy to expose pako's gzip compression option through JSZip's stream support. And if I use JSZip and also need gzip, why would I want to load pako twice?
I was hoping I could just hack my way into the internals of JSZip, reach the underlying Workers, and use the pako-based "Flate" (i.e., in-flate / de-flate) implementation with the gzip option that pako recognizes. I explored it with the Chrome JavaScript console, but I can't get through. The distributable jszip.js and jszip-min.js hide all the internals from scripts. I cannot open that box.
So I have been looking at the GitHub source code to see whether I could build my own loadable jszip.js or jszip-min.js that exports more of the internal resources for use in my page. But despite having been in this business for 20 years (UNIX makefiles, Ant, everything), I feel like a complete novice when it comes to packaging JavaScript modules. I see Bower and "gruntfiles", which all seem to be related to node.js, which I don't need (I only target the client-side browser) and have never worked with, so I have no idea where to start.
Recommended answer
As Evert was saying, I should have checked first for the build instructions in the documentation: https://stuk.github.io/jszip/documentation/contributing.html
From that it is clear that one first needs git and a local clone. Then one needs to set up the grunt command line, which requires npm, which comes with nodejs. Once grunt runs, there are other dependencies that need to be installed with npm install. As usual, a few little things were off and did not work at first, but enough Googling and brute-force retrying got it done.
Now, jszip/lib/index.js contains the resource that is finally exported: the JSZip object. So, just to play with the internal stuff, I could attach it to the JSZip object. For example, the file already contains:
JSZip.external = require("./external");
module.exports = JSZip;
and so we can easily add the other resources we want to play with (these lines must go before the module.exports line):
JSZip.flate = require("./flate");
JSZip.DataWorker = require('./stream/DataWorker');
JSZip.DataLengthProbe = require('./stream/DataLengthProbe');
JSZip.Crc32Probe = require('./stream/Crc32Probe');
JSZip.StreamHelper = require('./stream/StreamHelper');
JSZip.pako = require("pako");
With that, I can create a proof of concept in the Chrome debugger:
(new JSZip.StreamHelper(
    (new JSZip.DataWorker(Promise.resolve("Hello World! Hello World! Hello World! Hello World! Hello World! Hello World!")))
        .pipe(new JSZip.DataLengthProbe("uncompressedSize"))
        .pipe(new JSZip.Crc32Probe())
        .pipe(JSZip.flate.compressWorker({}))
        .pipe(new JSZip.DataLengthProbe("compressedSize"))
        .on("end", function(event) { console.log("onEnd: ", this.streamInfo) }),
    "uint8array", "")
).accumulate(function(data) { console.log("acc: ", data); })
 .then(function(data) { console.log("then: ", data); })
and this works. Building on it, I wrote a GZipFileWorker that wraps the deflated stream in a gzip header and trailer, so that the output is a correct gzip file. I put it in jszip/lib/generate/GZipFileWorker.js as follows:
'use strict';
var external = require('../external');
var utils = require('../utils');
var flate = require('../flate');
var GenericWorker = require('../stream/GenericWorker');
var DataWorker = require('../stream/DataWorker');
var StreamHelper = require('../stream/StreamHelper');
var DataLengthProbe = require('../stream/DataLengthProbe');
var Crc32Probe = require('../stream/Crc32Probe');
function GZipFileWorker() {
    GenericWorker.call(this, "GZipFileWorker");
    this.virgin = true; // true until the header has been emitted
}
utils.inherits(GZipFileWorker, GenericWorker);

GZipFileWorker.prototype.processChunk = function(chunk) {
    if(this.virgin) {
        // Emit the fixed 10-byte gzip header ahead of the first compressed chunk.
        this.virgin = false;
        var headerBuffer = new ArrayBuffer(10);
        var headerView = new DataView(headerBuffer);
        headerView.setUint16(0, 0x8b1f, true); // GZip magic (bytes 0x1f 0x8b)
        headerView.setUint8(2, 0x08); // compression algorithm DEFLATE
        headerView.setUint8(3, 0x00); // flags, none set:
        // bit 0 FTEXT
        // bit 1 FHCRC
        // bit 2 FEXTRA
        // bit 3 FNAME
        // bit 4 FCOMMENT
        headerView.setUint32(4, (new Date()).getTime()/1000>>>0, true); // MTIME in seconds
        headerView.setUint8(8, 0x00); // XFL (extra flags), none set
        headerView.setUint8(9, 0x03); // OS type UNIX
        this.push({data: new Uint8Array(headerBuffer)});
    }
    this.push(chunk);
};

GZipFileWorker.prototype.flush = function() {
    // Emit the 8-byte trailer: CRC32 and size of the uncompressed data,
    // both collected by the probes upstream in the pipeline.
    var trailerBuffer = new ArrayBuffer(8);
    var trailerView = new DataView(trailerBuffer);
    trailerView.setUint32(0, this.streamInfo["crc32"]>>>0, true);
    trailerView.setUint32(4, this.streamInfo["originalSize"]>>>0 & 0xffffffff, true);
    this.push({data: new Uint8Array(trailerBuffer)});
};
exports.gzip = function(data, inputFormat, outputFormat, compressionOptions, onUpdate) {
    var mimeType = data.contentType || data.mimeType || "";
    if(! (data instanceof GenericWorker)) {
        inputFormat = (inputFormat || "").toLowerCase();
        data = new DataWorker(
            utils.prepareContent(data.name || "gzip source",
                                 data,
                                 inputFormat !== "string",
                                 inputFormat === "binarystring",
                                 inputFormat === "base64"));
    }
    return new StreamHelper(
        data
            .pipe(new DataLengthProbe("originalSize"))
            .pipe(new Crc32Probe())
            .pipe(flate.compressWorker( compressionOptions || {} ))
            .pipe(new GZipFileWorker()),
        outputFormat.toLowerCase(), mimeType).accumulate(onUpdate);
};
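Independently of JSZip, the framing that GZipFileWorker adds can be sketched in a few lines: a 10-byte header, the raw DEFLATE data, then the CRC32 and the uncompressed size. This is only a minimal sketch under stated assumptions: it runs in Node and uses Node's built-in zlib as a stand-in for pako's raw deflate, and the names crc32 and gzipFrame are made up for illustration and are not part of JSZip or pako.

```javascript
// Sketch only: Node's built-in zlib stands in for pako's raw DEFLATE here.
const zlib = require("zlib");

// Table-driven CRC-32 (reflected polynomial 0xEDB88320), as used by gzip and ZIP.
const CRC_TABLE = new Uint32Array(256).map((_, n) => {
  let c = n;
  for (let k = 0; k < 8; k++) c = c & 1 ? 0xedb88320 ^ (c >>> 1) : c >>> 1;
  return c >>> 0;
});

function crc32(bytes) {
  let crc = 0xffffffff;
  for (const b of bytes) crc = CRC_TABLE[(crc ^ b) & 0xff] ^ (crc >>> 8);
  return (crc ^ 0xffffffff) >>> 0;
}

// Hypothetical helper: wraps raw DEFLATE output in a gzip header and trailer.
function gzipFrame(input) {
  const header = new DataView(new ArrayBuffer(10));
  header.setUint16(0, 0x8b1f, true);                        // magic bytes 0x1f 0x8b
  header.setUint8(2, 8);                                    // CM: DEFLATE
  header.setUint8(3, 0);                                    // FLG: no optional fields
  header.setUint32(4, Math.floor(Date.now() / 1000), true); // MTIME in seconds
  header.setUint8(8, 0);                                    // XFL: none
  header.setUint8(9, 3);                                    // OS: UNIX

  const trailer = new DataView(new ArrayBuffer(8));
  trailer.setUint32(0, crc32(input), true);                 // CRC32 of the uncompressed data
  trailer.setUint32(4, input.length >>> 0, true);           // ISIZE: length modulo 2^32

  return Buffer.concat([
    Buffer.from(header.buffer),
    zlib.deflateRawSync(input),
    Buffer.from(trailer.buffer),
  ]);
}

const text = "Hello World! ".repeat(6).trim();
const gz = gzipFrame(Buffer.from(text, "utf8"));
// A standard gunzip accepts the hand-built frame and restores the input.
console.log(zlib.gunzipSync(gz).toString("utf8") === text);
```

Because gunzip verifies both trailer fields, a successful round trip is a reasonable check that the header and trailer are laid out correctly.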
In jszip/lib/index.js, I need just this:
var gzip = require("./generate/GZipFileWorker");
JSZip.gzip = gzip.gzip;
It is used like this:
JSZip.gzip("Hello World! Hello World! Hello World! Hello World! Hello World! Hello World!", "string", "base64", {level: 3}).then(function(result) { console.log(result); })
I can paste the result into a UNIX pipe like this:
$ echo -n "H4sIAOyR/VsAA/NIzcnJVwjPL8pJUVTwoJADAPCORolNAAAA" |base64 -d |zcat
and it correctly returns:
Hello World! Hello World! Hello World! Hello World! Hello World! Hello World!
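The same base64 result can also be inspected field by field, which shows where each value written by GZipFileWorker ends up. The following is a small sketch, assuming an environment with a global atob (browsers and recent Node versions); describeGzip is a hypothetical helper name, not a JSZip function, and it only handles the simple case used here, with no optional header fields.

```javascript
// Hypothetical helper: decodes the fixed gzip header and the 8-byte trailer.
// Assumes FLG is 0, i.e. no FEXTRA/FNAME/FCOMMENT/FHCRC fields follow the header.
function describeGzip(base64) {
  const bytes = Uint8Array.from(atob(base64), (ch) => ch.charCodeAt(0));
  const view = new DataView(bytes.buffer);
  return {
    magic: view.getUint16(0, true).toString(16),                   // "8b1f" for gzip
    method: view.getUint8(2),                                      // 8 = DEFLATE
    flags: view.getUint8(3),
    mtime: new Date(view.getUint32(4, true) * 1000).toISOString(), // MTIME is in seconds
    os: view.getUint8(9),                                          // 3 = UNIX
    crc32: view.getUint32(bytes.length - 8, true).toString(16),
    isize: view.getUint32(bytes.length - 4, true),                 // uncompressed length
  };
}

const info = describeGzip("H4sIAOyR/VsAA/NIzcnJVwjPL8pJUVTwoJADAPCORolNAAAA");
console.log(info);
```

For the output above, isize comes out as 77, which matches the length of the test string.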
It can also be used with files:
JSZip.gzip(file, "", "Blob").then(function(blob) {
    // xhr is an XMLHttpRequest on which open() has already been called
    xhr.setRequestHeader("Content-Encoding", "gzip");
    xhr.send(blob);
});
and I can send the Blob to my web server. I have checked that a large file is indeed processed in chunks.
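Chunk-wise processing is possible because both trailer fields can be accumulated as the data streams through: the length is a running sum and the CRC32 is a running value. The sketch below illustrates the idea for the CRC32 only; it is not JSZip's Crc32Probe, and the names TABLE and crc32Update are assumptions made for this example.

```javascript
// CRC-32 lookup table (reflected polynomial 0xEDB88320).
const TABLE = new Uint32Array(256).map((_, n) => {
  let c = n;
  for (let k = 0; k < 8; k++) c = c & 1 ? 0xedb88320 ^ (c >>> 1) : c >>> 1;
  return c >>> 0;
});

// Hypothetical helper: folds one more chunk into a running CRC-32.
// Start with crc = 0 and pass the previous result for each following chunk.
function crc32Update(crc, chunk) {
  let c = (crc ^ 0xffffffff) >>> 0;
  for (const b of chunk) c = TABLE[(c ^ b) & 0xff] ^ (c >>> 8);
  return (c ^ 0xffffffff) >>> 0;
}

const bytes = new TextEncoder().encode("123456789");

// CRC over the whole input in one call ...
const whole = crc32Update(0, bytes);

// ... and over the same input fed in 4-byte chunks.
let running = 0;
for (let i = 0; i < bytes.length; i += 4) {
  running = crc32Update(running, bytes.subarray(i, i + 4));
}

console.log(whole.toString(16), running === whole);
```

Only the running value has to be kept between chunks, so the checksum never requires the full input in memory.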
The only thing I don't like about this is that the result is still assembled into one big Blob, so I am assuming it holds all of the compressed data in memory. It would be better if that Blob were an end-point of the Worker pipeline, so that chunks are pulled through the pipeline only when xhr.send reads the data chunk-wise from the Blob. However, the impact is lessened a lot given that the Blob only holds compressed content, and (for me at least) large files are likely to be multimedia files that don't need to be gzip compressed anyway.
I did not write a gunzip function because, frankly, I don't need one, and I don't want to make one that fails to properly parse the optional extension fields in the gzip header. Once I have uploaded the compressed content to the server (S3 in my case), I assume the browser will do the decompressing for me when I fetch it again. I haven't checked that, though. If it becomes a problem, I'll come back and edit this answer.
Here is my fork on GitHub: https://github.com/gschadow/jszip. A pull request has already been submitted.