How can a PDF file be partially displayed while it is still downloading?

Question

According to the PDF 1.7 specification, Section 3.4 (http://www.adobe.com/content/dam/Adobe/en/devnet/acrobat/pdfs/pdf_reference_1-7.pdf, page 90), a canonical PDF file initially consists of four elements:

  • A one-line header identifying the version of the PDF specification to which the file conforms

  • A body containing the objects that make up the document contained in the file

  • A cross-reference table containing information about the indirect objects in the file

  • A trailer giving the location of the cross-reference table and of certain special objects within the body of the file

Basically, the structure has the header, followed by the body content, then the cross reference table, and finally the trailer which gives the location of the xref table. The key part here is that the trailer and xref tables are at the end of the file, and the xref table contains the pertinent metadata of the body content (mainly the 10-digit byte offset).
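
The layout described above can be made concrete with a small Python sketch. It assembles a tiny classic (non-linearized) PDF in memory and then reads it the way a viewer would: from the end, following `startxref` to the xref table and decoding the 10-digit byte offsets. The helper names `build_minimal_pdf` and `read_xref_offsets` are made up for this illustration, and the parser assumes a single xref subsection with LF line endings, which real-world files do not guarantee.

```python
import re

def build_minimal_pdf() -> bytes:
    """Assemble a tiny classic PDF and record the real byte offset of each object."""
    header = b"%PDF-1.7\n"
    objects = [
        b"1 0 obj\n<< /Type /Catalog /Pages 2 0 R >>\nendobj\n",
        b"2 0 obj\n<< /Type /Pages /Kids [] /Count 0 >>\nendobj\n",
    ]
    body = b""
    offsets = []
    for obj in objects:
        offsets.append(len(header) + len(body))
        body += obj
    xref_pos = len(header) + len(body)
    # Each xref entry is 20 bytes: 10-digit offset, 5-digit generation, flag, 2-byte EOL.
    xref = b"xref\n0 3\n" + b"0000000000 65535 f \n"
    for off in offsets:
        xref += b"%010d 00000 n \n" % off
    trailer = (
        b"trailer\n<< /Size 3 /Root 1 0 R >>\nstartxref\n%d\n%%%%EOF\n" % xref_pos
    )
    return header + body + xref + trailer

def read_xref_offsets(pdf: bytes) -> dict:
    """Return {object number: byte offset} for the in-use objects of a classic xref table.

    Simplifying assumptions: one xref subsection, LF line endings.
    """
    # The pointer to the xref table sits at the very end of the file.
    match = re.search(rb"startxref\s+(\d+)\s+%%EOF", pdf[-1024:])
    if match is None:
        raise ValueError("no startxref found near the end of the file")
    lines = pdf[int(match.group(1)):].split(b"\n")
    if lines[0].strip() != b"xref":
        raise ValueError("startxref does not point at an xref table")
    first, count = (int(n) for n in lines[1].split())
    offsets = {}
    for i in range(count):
        offset, _generation, flag = lines[2 + i].split()
        if flag == b"n":  # "n" marks an in-use object, "f" a free entry
            offsets[first + i] = int(offset)
    return offsets

pdf = build_minimal_pdf()
for number, offset in read_xref_offsets(pdf).items():
    print(number, offset, pdf[offset:offset + 7])
```

Because the reader has to start from the last bytes of the file, nothing in a file laid out this way can be located until the download has finished.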

Given that the xref table itself is located at the very end of a PDF file:

  • How is my browser (Google Chrome) able to partially display the PDF file (the first hundred pages or so) before the whole file has finished downloading?

See screenshot of my partially downloaded PDF file:

Recommended answer

The type of PDF files the OP describes is also known as "web optimized" (marketing term) or "linearized" (technical term in PDF parlance).

Note that partial display only works if two extra conditions (on top of the files being linearized) are met:

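  1. The PDF viewer needs to be able to handle this type of PDF and take advantage of the linearization feature.
  2. The (remote) host serving the linearized PDF needs to support "byte streaming", that is, byte range requests.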

If the server does not support byte streaming, or if the PDF file is not linearized, the entire file still needs to be downloaded completely before the viewer can display any page.
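
A rough way to check the server-side condition is to look at the response headers: an HTTP server that handles byte ranges normally advertises this with `Accept-Ranges: bytes`. The Python sketch below inspects a plain dictionary of headers; the function name `advertises_byte_ranges` is made up for this illustration, and a missing header is treated as "not advertised" even though some servers honor range requests without announcing it.

```python
def advertises_byte_ranges(headers: dict) -> bool:
    """Return True if the response headers advertise support for byte-range requests."""
    # Header names are case-insensitive, so normalize them first.
    normalized = {name.lower(): value for name, value in headers.items()}
    units = normalized.get("accept-ranges", "none")
    return "bytes" in [unit.strip().lower() for unit in units.split(",")]

print(advertises_byte_ranges({"Accept-Ranges": "bytes", "Content-Type": "application/pdf"}))
print(advertises_byte_ranges({"Content-Type": "application/pdf"}))
```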

The description about the PDF file structure quoted by the OP does not apply to linearized PDF files. These are organized in a slightly different way:

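  1. Special rules apply to the ordering of the PDF objects ("standard" PDFs may have their objects in arbitrary order).
  2. The PDF document needs to contain some additional structures called "hint tables", which ensure efficient navigation within it (even when it has not yet been completely downloaded).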

Regarding the additional structures, a linearized PDF contains its objects in two groups:

  1. The first group contains the document catalog, all document-level objects, and all objects belonging to the first page to be displayed (not necessarily "page 0"!). These objects shall be numbered sequentially.
  2. The second group holds all the other objects.

These groups shall be indexed by two xref table sections.

  1. The first group's xref section appears immediately after the first indirect object, very close to the beginning of the file.
  2. The second group's xref section is positioned at the end of the file (just as in standard, non-linearized PDFs).

The first object immediately after the %PDF-1.x header line shall contain a dictionary key indicating the /Linearized property of the file.
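
As an illustration of that first object, the following Python sketch looks for a `/Linearized` key in the first dictionary after the header, using only the first 1024 bytes of the file. The function name `linearization_info` and the sample bytes are made up for this example. The `/L` (total file length) and `/N` (page count) keys are standard entries of the linearization dictionary, but the regex-based parsing is a deliberate simplification and not a real PDF parser.

```python
import re

def linearization_info(head: bytes):
    """Return basic linearization parameters found in the first bytes of a PDF, or None."""
    if not head.startswith(b"%PDF-"):
        return None
    # Only the start of the file is inspected; no need to wait for the rest.
    first_obj = re.search(rb"\d+\s+\d+\s+obj\s*<<(.*?)>>", head[:1024], re.DOTALL)
    if first_obj is None:
        return None
    dictionary = first_obj.group(1)
    version = re.search(rb"/Linearized\s+([\d.]+)", dictionary)
    if version is None:
        return None
    length = re.search(rb"/L\s+(\d+)", dictionary)
    pages = re.search(rb"/N\s+(\d+)", dictionary)
    return {
        "version": version.group(1).decode("ascii"),
        "file_length": int(length.group(1)) if length else None,
        "page_count": int(pages.group(1)) if pages else None,
    }

# Hypothetical file starts; the numeric values are invented for illustration.
linearized_head = (
    b"%PDF-1.7\n%\xe2\xe3\xcf\xd3\n"
    b"1 0 obj\n<< /Linearized 1 /L 11000000 /H [ 500 200 ] /O 5 /E 9000"
    b" /N 1310 /T 10990000 >>\nendobj\n"
)
plain_head = b"%PDF-1.7\n1 0 obj\n<< /Type /Catalog /Pages 2 0 R >>\nendobj\n"

print(linearization_info(linearized_head))
print(linearization_info(plain_head))
```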

This overall structure allows a conforming reader to learn the complete list of object addresses very quickly, without needing to download the complete file from beginning to end:

  • The viewer can display the first page(s) very fast, before the complete file is downloaded.

  • The user can click a thumbnail page preview (or a link in the file's ToC) to jump to, say, page 445 immediately after the first page(s) have been displayed. The viewer can then ask the remote server, via byte range requests, to deliver all the objects required for page 445 "out of order", so that it can display this page faster. (While the user reads pages out of order, the download of the complete document still continues in the background...)
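
The out-of-order delivery can be sketched in Python without any network access. Here `serve_range` simulates how a server might answer a single-range request, and `fetch_page` plays the viewer asking for just the bytes of one page. Both function names and the `page_spans` mapping are made up for this illustration; in a real linearized file the viewer would derive the byte spans from the hint tables, and a page's objects need not be contiguous.

```python
import re

def serve_range(document: bytes, range_value: str):
    """Simulate a server answering a single-range request such as 'bytes=100-199'."""
    match = re.fullmatch(r"bytes=(\d+)-(\d+)", range_value)
    if match is None or int(match.group(1)) > int(match.group(2)):
        return 200, {}, document  # unusable range: fall back to sending the whole file
    start, end = int(match.group(1)), int(match.group(2))
    if start >= len(document):
        return 416, {"Content-Range": f"bytes */{len(document)}"}, b""
    end = min(end, len(document) - 1)  # the end position is inclusive
    headers = {"Content-Range": f"bytes {start}-{end}/{len(document)}"}
    return 206, headers, document[start:end + 1]

def fetch_page(document: bytes, page_spans: dict, page_number: int) -> bytes:
    """Request only the bytes of one page, given a hypothetical page -> (offset, length) map."""
    start, length = page_spans[page_number]
    status, _headers, body = serve_range(document, f"bytes={start}-{start + length - 1}")
    if status != 206:
        raise RuntimeError(f"server did not return partial content (status {status})")
    return body

# Stand-in for a large PDF and for spans a viewer would read from the hint tables.
document = bytes(range(256)) * 4
page_spans = {1: (0, 100), 445: (900, 50)}

print(len(fetch_page(document, page_spans, 445)))
print(serve_range(document, "bytes=1000-5000")[1])
```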

The technical details of PDF "linearization" can be found in the 'normative' Appendix F of Adobe's original PDF 1.7 Specification (ca. 11 MByte -- which in itself is an example of such a linearized PDF file!)

