Assume a PDF document "doc.pdf" contains the simple string "hello world".
Consider the following code:
//read the document
PDDocument doc;
doc= PDDocument.load("doc.pdf");
//extract all the pages from the document and put them in a list
List pages = doc.getDocumentCatalog().getAllPages();
//extract the page number 0
PDPage page = (PDPage) pages.get(0);
//analyse the content stream
PDStream contents = page.getContents();
PDFStreamParser parser = new PDFStreamParser(contents.getStream());
//parsing the extracted contents
parser.parse();
List tokens = parser.getTokens();
for (int o = 0; o < tokens.size(); o++)
{
Object next = tokens.get(o);
//if this content is an operator
if (next instanceof PDFOperator) {
PDFOperator op = (PDFOperator) next;
//and if this operator is a Tj
if (op.getOperation().equals("Tj"))
{
//now i want to access to this string
COSString previous = (COSString) tokens.get(o - 1);
String string = previous.getString();
//rendering mode invisible the string in the document
tokens.set(o-1, COSInteger.get(3));
tokens.set(o, PDFOperator.getOperator("Tr"));
tokens.add(++o, new COSString(string));
tokens.add(++o, PDFOperator.getOperator("Tj"));
tokens.add(++o, COSInteger.get(0));
tokens.add(++o, PDFOperator.getOperator("Tr"));
tokens.add(++o, new COSString(""));
tokens.add(++o, PDFOperator.getOperator("Tj"));
}
//update the modified stream
PDStream updatedStream = new PDStream(doc);
OutputStream out = updatedStream.createOutputStream();
ContentStreamWriter tokenWriter = new ContentStreamWriter(out);
tokenWriter.writeTokens(tokens);
page.setContents(updatedStream);
}
//construct a new object that contains the string "My name is Liszt" and take (15 31) as a specific position
PDPageContentStream content = new PDPageContentStream(doc, page, true, false);
PDFont font= PDType1Font.HELVETICA;
content.setFont(font, 12);
content.beginText();
content.appendRawCommands("15 31 Td");
content.appendRawCommands("(My name is Liszt)Tj\n");
content.close();
content.endText();
doc.save("modified_doc.pdf");
}
}
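Now consider the same document "doc.pdf", but this time I want to write another piece of code that also checks whether the document contains the TJ operator, not only Tj.
So I tried to write a second version, but I would like help editing it so that all errors are fixed and it gives the same result as the first code.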
PDDocument doc;
doc= PDDocument.load("doc.pdf");
List pages = doc.getDocumentCatalog().getAllPages();
PDPage page = (PDPage) pages.get(0);
PDStream contents = page.getContents();
PDFStreamParser parser = new PDFStreamParser(contents.getStream());
parser.parse();
List tokens = parser.getTokens();
for (int o = 0; o < tokens.size(); o++)
{
Object next = tokens.get(o);
if (next instanceof PDFOperator) {
PDFOperator op = (PDFOperator) next;
if (op.getOperation().equals("Tj"))
{
COSString previous = (COSString) tokens.get(o - 1);
String string = previous.getString();
tokens.set(o-1, COSInteger.get(3));
tokens.set(o, PDFOperator.getOperator("Tr"));
tokens.add(++o, new COSString(string));
tokens.add(++o, PDFOperator.getOperator("Tj"));
tokens.add(++o, COSInteger.get(0));
tokens.add(++o, PDFOperator.getOperator("Tr"));
tokens.add(++o, new COSString(""));
tokens.add(++o, PDFOperator.getOperator("Tj"));
}else if(op.getOperation().equals("TJ")){
COSArray previous = (COSArray) tokens.get(o - 1);
for (int k = 0; k < previous.size(); k++)
{
Object arrElement = previous.getObject(k);
if (arrElement instanceof COSString)
{
COSString cosString = (COSString) arrElement;
String string = cosString.getString();
// i get errors in the instructions below
tokens.set(o-1, COSInteger.get(3));
tokens.set(o, PDFOperator.getOperator("Tr"));
tokens.add(++o, new COSString(string));
tokens.add(++o, PDFOperator.getOperator("TJ"));
tokens.add(++o, COSInteger.get(0));
tokens.add(++o, PDFOperator.getOperator("Tr"));
tokens.add(++o, new COSString(""));
tokens.add(++o, PDFOperator.getOperator("TJ"));
}
}
}
}
PDStream updatedStream = new PDStream(doc);
OutputStream out = updatedStream.createOutputStream();
ContentStreamWriter tokenWriter = new ContentStreamWriter(out);
tokenWriter.writeTokens(tokens);
page.setContents(updatedStream);
}
//how to write this object for both Tj and TJ ?
PDPageContentStream content = new PDPageContentStream(doc, page, true, false);
PDFont font= PDType1Font.HELVETICA;
content.setFont(font, 12);
content.beginText();
content.appendRawCommands("15 31 Td");
content.appendRawCommands("(My name is Liszt)TJ\n");
content.close();
content.endText();
doc.save("modified_doc.pdf");
}
}
Best regards,
Liszt
Best answer
There are a number of issues in your code.
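In your first code block you have
content.close();
content.endText();
You should call endText before close, i.e. close the content stream only after ending the text object.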
Later on, your TJ-specific code looks like this (after formatting):
else if (op.getOperation().equals("TJ"))
{
COSArray previous = (COSArray) tokens.get(o - 1);
for (int k = 0; k < previous.size(); k++)
{
Object arrElement = previous.getObject(k);
if (arrElement instanceof COSString)
{
COSString cosString = (COSString) arrElement;
String string = cosString.getString();
// i get errors in the instructions below
tokens.set(o-1, COSInteger.get(3));
tokens.set(o, PDFOperator.getOperator("Tr"));
tokens.add(++o, new COSString(string));
tokens.add(++o, PDFOperator.getOperator("TJ"));
tokens.add(++o, COSInteger.get(0));
tokens.add(++o, PDFOperator.getOperator("Tr"));
tokens.add(++o, new COSString(""));
tokens.add(++o, PDFOperator.getOperator("TJ"));
}
}
}
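In your k loop you overwrite the o-1 and o positions of the tokens list. Doing so the first time (removing the original TJ operation) makes sense, but afterwards it does not. I would suggest explicitly removing them after reading the array argument and only using add from then on.
You add TJ operations preceded only by a COSString instead of the COSArray that TJ expects. I would suggest either using Tj for the COSString argument or keeping TJ and wrapping the string into an array.
You ignore the numeric contents of the original previous array.
By the way, instead of that loop, why don't you simply do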
else if (op.getOperation().equals("TJ"))
{
COSArray previous = (COSArray) tokens.get(o - 1);
tokens.set(o-1, COSInteger.get(3));
tokens.set(o, PDFOperator.getOperator("Tr"));
tokens.add(++o, previous);
tokens.add(++o, PDFOperator.getOperator("TJ"));
tokens.add(++o, COSInteger.get(0));
tokens.add(++o, PDFOperator.getOperator("Tr"));
}
You will have to say what you want to achieve in this case.
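The "remove explicitly, then only add" idea can be sketched without PDFBox, as a rough model of the index handling only. The class TokenRewriteSketch, its nested Op type and the method hideShownText are hypothetical names invented for this illustration. Op stands in for PDFOperator, and plain Java objects stand in for the COSString, COSArray and COSInteger operands. The sketch treats Tj and TJ the same way: it keeps the original operand untouched and wraps the operation in "3 Tr ... 0 Tr".

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Illustrative model only: this is NOT the PDFBox API.
final class TokenRewriteSketch {

    // Hypothetical stand-in for PDFOperator: just an operator name.
    static final class Op {
        final String name;
        Op(String name) { this.name = name; }
        @Override public String toString() { return name; }
    }

    // Returns a copy of the token list in which every "operand Tj" or
    // "operand TJ" pair is replaced by "3 Tr operand Tj|TJ 0 Tr".
    static List<Object> hideShownText(List<Object> tokens) {
        List<Object> out = new ArrayList<>(tokens);
        for (int i = 0; i < out.size(); i++) {
            Object next = out.get(i);
            if (!(next instanceof Op)) {
                continue; // operands are handled together with their operator
            }
            String name = ((Op) next).name;
            boolean showsText = name.equals("Tj") || name.equals("TJ");
            if (!showsText || i == 0) {
                continue; // not a text-showing operator, or no operand before it
            }
            // Read the operand (a string for Tj, an array for TJ) and keep it whole,
            // so the numeric adjustments inside a TJ array are not lost.
            Object operand = out.get(i - 1);
            // Remove the original operator and operand explicitly ...
            out.remove(i);
            out.remove(i - 1);
            // ... and from here on only add, tracking the insertion position.
            int at = i - 1;
            out.add(at++, 3);             // rendering mode 3: invisible
            out.add(at++, new Op("Tr"));
            out.add(at++, operand);
            out.add(at++, new Op(name));  // same operator as before
            out.add(at++, 0);             // back to rendering mode 0: fill
            out.add(at++, new Op("Tr"));
            i = at - 1;                   // resume after the inserted tokens
        }
        return out;
    }
}
```

Calling hideShownText on a list modelling `BT (hello) Tj ET` gives a list modelling `BT 3 Tr (hello) Tj 0 Tr ET`. Because the operand is reused as-is, a TJ array never has to be taken apart, and no TJ operator ends up with a bare string in front of it.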