python-docx: getting header text

I want to read the header text from a docx file in Python. I am using the python-docx module.

If this feature is already implemented, could someone help me do this?

I tried the following, without success.

from docx import Document

document = Document(path)
section = document.sections[0]
print(section.text)

Error:
    <class 'AttributeError'>'Section' object has no attribute 'text'

And:

from docx import Document

document = Document(path)
header = document.sections[0].header
print(header.text)

Error:
    <class 'AttributeError'>'Section' object has no attribute 'header'

Best answer

At the time you asked this question, it was not possible with the python-docx library. Header/footer support was added in the 0.8.8 release (January 7, 2019).

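In a Word document, each section has a header. Headers come with a number of potential wrinkles (for example, they can be linked from one section to the next, or differ between even and odd pages), but in the simple case, with a single section and an uncomplicated header, you only need to iterate over the paragraphs in the section's header.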

from docx import Document
document = Document(path_and_filename)
section = document.sections[0]
header = section.header
for paragraph in header.paragraphs:
    print(paragraph.text) # or whatever you have in mind

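I was working with a document whose header was laid out with a table rather than plain text. In that case, you need to work with the rows in header.tables[0] instead of the paragraphs.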

Source: a similar question about getting header text with python-docx on Stack Overflow: https://stackoverflow.com/questions/48261976/