Problem description
I'm trying to get the font size or formatting (bold, etc.) of a specific piece of text or line in a PDF file, but so far without success.
Using PDFTextStripper as shown below returns only the plain text:
PDFTextStripper stripper = new PDFTextStripper();
String actualText = stripper.getText(actualDoc);
Can you please help me with this? Thanks.
Recommended answer
You need to extend PDFTextStripper and override PDFTextStripper#processTextPosition. This method gives you access to a TextPosition object, which holds the font attributes. You then need to collect all the TextPosition objects located within a given box (your line) and put them together.
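The collection step can be sketched without PDFBox on the classpath. In the sketch below, Glyph is a hypothetical stand-in for the values that a processTextPosition(TextPosition) override would read via getUnicode(), getYDirAdj(), getFont().getName() and getFontSizeInPt(). The class and method names, the y tolerance, the choice to report the font of the first glyph on each line, and the name-based bold check are all assumptions made for illustration, not part of PDFBox.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

public class LineFontSketch {

    /** Hypothetical stand-in for the values read from a PDFBox TextPosition. */
    public static final class Glyph {
        final String text;     // would come from TextPosition#getUnicode()
        final float y;         // would come from TextPosition#getYDirAdj()
        final String fontName; // would come from TextPosition#getFont().getName()
        final float fontSize;  // would come from TextPosition#getFontSizeInPt()

        public Glyph(String text, float y, String fontName, float fontSize) {
            this.text = text;
            this.y = y;
            this.fontName = fontName;
            this.fontSize = fontSize;
        }
    }

    /** Heuristic only: many, but not all, bold fonts carry "bold" in their name. */
    static boolean looksBold(String fontName) {
        return fontName != null
                && fontName.toLowerCase(Locale.ROOT).contains("bold");
    }

    /**
     * Groups glyphs, assumed to arrive in reading order, into lines: a new
     * line starts when the y coordinate moves more than yTolerance away from
     * the first glyph of the current line. Each line is described by the font
     * of its first glyph (an assumption; a real line may mix fonts).
     */
    public static List<String> describeLines(List<Glyph> glyphs, float yTolerance) {
        List<String> lines = new ArrayList<>();
        StringBuilder text = new StringBuilder();
        Glyph first = null;
        for (Glyph g : glyphs) {
            if (first != null && Math.abs(g.y - first.y) > yTolerance) {
                lines.add(format(first, text)); // flush the finished line
                text.setLength(0);
                first = null;
            }
            if (first == null) {
                first = g;
            }
            text.append(g.text);
        }
        if (first != null) {
            lines.add(format(first, text)); // flush the last line
        }
        return lines;
    }

    private static String format(Glyph first, StringBuilder text) {
        return text + " [" + first.fontName + ", " + first.fontSize + "pt"
                + (looksBold(first.fontName) ? ", bold" : "") + "]";
    }

    public static void main(String[] args) {
        List<Glyph> glyphs = new ArrayList<>();
        glyphs.add(new Glyph("Ti", 100f, "Helvetica-Bold", 18f));
        glyphs.add(new Glyph("tle", 100.2f, "Helvetica-Bold", 18f));
        glyphs.add(new Glyph("body", 130f, "Helvetica", 10f));
        for (String line : describeLines(glyphs, 1.0f)) {
            System.out.println(line);
        }
    }
}
```

In a real PDFTextStripper subclass, the override would append one such entry per TextPosition it receives and run the grouping after getText has processed the document. A font's descriptor, where present, may be a more reliable source for bold detection than the font name.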