How do I correctly generate RSID attributes in a Word .docx file using Apache POI?

Question

I have been using Apache POI to manipulate Microsoft Word .docx files, i.e. open a document that was originally created in Microsoft Word, modify it, and save it as a new document.

I notice that new paragraphs created by Apache POI are missing a Revision Save ID, often known as an RSID or rsidR. Word uses it to identify changes made to a document in one session, say between saves. It is optional (users can turn it off in Microsoft Word if they want), but in practice almost everyone has it on, so almost every document is full of RSIDs. Read this excellent explanation of RSIDs for more about that.

In a document created by Microsoft Word, word/document.xml contains paragraphs like this:

<w:p w:rsidR="007809A1" w:rsidRDefault="007809A1" w:rsidP="00191825">
  <w:r>
    <w:t>Paragraph of text here.</w:t>
  </w:r>
</w:p>

However, the same paragraph created by POI looks like this in word/document.xml:

<w:p>
  <w:r>
    <w:t>Paragraph of text here.</w:t>
  </w:r>
</w:p>

I've figured out that I can force POI to add an RSID to each paragraph using code like this:

    byte[] rsid = ???;
    XWPFParagraph paragraph = document.createParagraph();
    paragraph.getCTP().setRsidR(rsid);
    paragraph.getCTP().setRsidRDefault(rsid);

However, I don't know how I should be generating the RSIDs.

Does POI have a way to generate and/or keep track of RSIDs? If not, is there any way I can ensure that an RSID I generate doesn't conflict with one that's already in the document?

Recommended answer

It looks like the list of valid rsid entries is held in word/settings.xml, in the <w:rsids> entry. XWPF should be able to give you access to that already.
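To make the shape of that list concrete, here is a minimal sketch that collects the existing RSIDs from the text of word/settings.xml using only the JDK's DOM parser. It deliberately avoids POI's own classes, since the answer does not name the exact XWPF accessor; the class name `RsidListReader`, the method `readRsids`, and the sample XML are illustrative assumptions, not part of POI.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashSet;
import java.util.Locale;
import java.util.Set;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

// Hypothetical helper (not a POI class): reads RSIDs straight from settings.xml text.
final class RsidListReader {
    // WordprocessingML main namespace, bound to the "w" prefix in .docx parts.
    static final String W_NS =
            "http://schemas.openxmlformats.org/wordprocessingml/2006/main";

    /** Returns the w:val of every <w:rsid> element, upper-cased, in document order. */
    static Set<String> readRsids(String settingsXml) throws Exception {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        factory.setNamespaceAware(true); // required for the *NS lookups below
        Document doc = factory.newDocumentBuilder().parse(
                new ByteArrayInputStream(settingsXml.getBytes(StandardCharsets.UTF_8)));

        // Matches local name "rsid" only, so <w:rsidRoot> is not included.
        NodeList nodes = doc.getElementsByTagNameNS(W_NS, "rsid");
        Set<String> rsids = new LinkedHashSet<>();
        for (int i = 0; i < nodes.getLength(); i++) {
            Element rsid = (Element) nodes.item(i);
            rsids.add(rsid.getAttributeNS(W_NS, "val").toUpperCase(Locale.ROOT));
        }
        return rsids;
    }

    public static void main(String[] args) throws Exception {
        // Made-up sample content, reusing the RSID values from the question.
        String sample = "<w:settings xmlns:w=\"" + W_NS + "\"><w:rsids>"
                + "<w:rsidRoot w:val=\"007809A1\"/>"
                + "<w:rsid w:val=\"007809A1\"/>"
                + "<w:rsid w:val=\"00191825\"/>"
                + "</w:rsids></w:settings>";
        System.out.println(readRsids(sample));
    }
}
```

In a real document you would obtain the settings.xml content from the package part (or from whatever accessor XWPF exposes) rather than from a string literal.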

You'd probably want to generate a random number that is 8 hex digits long, check whether it's already in that list, and regenerate it if it is. Once you have a unique one, add it to the list, then tag your paragraphs with it.
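The generate-check-retry loop described above can be sketched as follows. This is only an illustration under assumptions: `RsidGenerator`, `newUniqueRsid`, and `toBytes` are hypothetical names, the set of existing RSIDs is assumed to have been loaded from word/settings.xml already, and writing the new value back into settings.xml is left out.

```java
import java.security.SecureRandom;
import java.util.HashSet;
import java.util.Random;
import java.util.Set;

// Hypothetical helper (not a POI class) for picking an unused RSID.
final class RsidGenerator {

    /**
     * Draws random 8-hex-digit values until one is not yet in {@code existing}.
     * The chosen value is added to the set so later calls cannot return it again.
     */
    static String newUniqueRsid(Set<String> existing, Random random) {
        while (true) {
            // %08X renders the int as exactly 8 upper-case hex digits.
            String candidate = String.format("%08X", random.nextInt());
            if (existing.add(candidate)) { // add() returns false on a collision
                return candidate;
            }
        }
    }

    /** Converts an 8-hex-digit RSID into the 4-byte form that setRsidR takes in the question. */
    static byte[] toBytes(String rsid) {
        byte[] bytes = new byte[4];
        for (int i = 0; i < 4; i++) {
            bytes[i] = (byte) Integer.parseInt(rsid.substring(2 * i, 2 * i + 2), 16);
        }
        return bytes;
    }

    public static void main(String[] args) {
        // Assumed input: RSIDs already present in the document's settings.xml.
        Set<String> existing = new HashSet<>();
        existing.add("007809A1");
        existing.add("00191825");

        String rsid = newUniqueRsid(existing, new SecureRandom());
        byte[] value = toBytes(rsid); // could be passed to setRsidR / setRsidRDefault
        System.out.println(rsid + " -> " + value.length + " bytes");
    }
}
```

Using one RSID for everything added in a single run of your program mirrors how the question describes Word's behaviour: one ID per editing session rather than one per paragraph.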

What I'd suggest is that you join the POI dev list (mailing list details), where we can give you a hand working up a patch for it. I think the things to do are:

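  • A wrapper around the RSids entry in word/settings.xml, to let you easily get the list and generate a new (unique) one
  • A wrapper around the different RSid entries on paragraphs and runs
  • Methods on paragraph and run to get the RSid wrappers, add a new one, or clear existing ones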

We should take this to the dev list though :)

