本文介绍了openpyxl - 如何保留 xlsx 自定义属性的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

如何保留正在使用 openpyxl 修改的 xlsx 模板中的自定义属性?当我使用 openpyxl save() 工作簿时,这些自定义属性消失了!

自定义属性可以在这里找到:-

在 Mac 上 -> 转到 Excel 中的文件菜单 -> 属性 ... -> 自定义选项卡 ->属性部分

解决方案

我发布了一个纯 python 解决方案来读取和编写 Workbook.CustomDocumentProperties 只是因为我目前也感受到 openpyxl,我需要一个快速解决个人自动化项目的方法.>

事实上,我会尝试将这个功能(希望以后的 Worksheet.CustomProperties)实现到 openpyxl 我自己,如果我能弄清楚如何完成图书馆需要的所有管道:https://foss.heptapod.net/openpyxl/openpyxl/-/issues/1003

更新:我推送了我的贡献,应该很快就会被接受和合并:) https://foss.heptapod.net/openpyxl/openpyxl/-/merge_requests/384

所以现在,这里有一个解决方法,将 .xlsx 转换为 .zip,然后直接在 zip 中读取和写入 .xml 文件,最后重命名为 .xlsx.

要阅读 Workbook.CustomDocumentProperties,您可以执行此操作 - 仅对这个很棒的答案稍作修改:https://stackoverflow.com/a/46919795/9792594

from lxml import etree as ET导入压缩文件def get_custom_doc_properties(文件名):path_file = os.path.abspath(文件名)base, ext = os.path.splitext(path_file)zip_filename = base + .zip"os.rename(path_file, zip_filename)main_ns = "{http://schemas.openxmlformats.org/spreadsheetml/2006/main}";docPr_ns = "{http://schemas.openxmlformats.org/officeDocument/2006/custom-properties}";docPr_type = "{http://schemas.openxmlformats.org/officeDocument/2006/docPropsVTypes}";#i4, r8, 文件时间, bool, lpwstrr_ns = "{http://schemas.openxmlformats.org/officeDocument/2006/relationships}";cusPr_type = http://schemas.openxmlformats.org/officeDocument/2006/relationships/customProperty"使用 zipfile.ZipFile(zip_filename) 作为 zip:props = zip.open('docProps/custom.xml')text = props.read()xml = ET.fromstring(text)workbook_props = {}对于 XML 中的孩子:如果(child.tag == f{docPr_ns}property"):对于 cusPr 儿童:workbook_props[child.attrib['name']] = cusPr.text返回 workbook_props#像这样调用:get_custom_doc_properties(f'./example.xlsx')

向已经有自定义 doc props 的文档添加一个 props(因此已经有一个 'docProps/custom.xml' 文件)非常简单,我们只需在 xml 中附加一个自定义属性即可.

>

(但是,如果文档当前没有自定义 doc props,那么我们需要从头开始生成 'docProps/custom.xml' 文件,并添加内容覆盖和关系 - 参见代码注释):

import os从 lxml 导入 etree 作为 ET导入压缩文件进口商铺导入日期时间从临时文件导入 NamedTemporaryFiledef set_workbook_custom_document_properties(文件名,cus_doc_prop_name,cus_doc_prop_val):如果不是 isinstance(cus_doc_prop_name, str):打印(你必须提供一个字符串作为'cus_doc_prop_name'")返回如果是实例(cus_doc_prop_val,str):docPr_type_suffix = "lpwstr";cus_doc_prop_str = cus_doc_prop_valelif isinstance(cus_doc_prop_val, int):docPr_type_suffix = "i4";cus_doc_prop_str = str(cus_doc_prop_val)elif isinstance(cus_doc_prop_val,浮点数):docPr_type_suffix = "r8";cus_doc_prop_str = str(cus_doc_prop_val)elif isinstance(cus_doc_prop_val, bool):docPr_type_suffix = "bool";cus_doc_prop_str = str(cus_doc_prop_val)elif isinstance(cus_doc_prop_val, datetime.datetime):docPr_type_suffix =文件时间";cus_doc_prop_str = cus_doc_prop_val.strftime(%Y-%m-%dT%H:%M:%SZ")别的:print(您必须提供一个字符串、整数、浮点数、布尔值或日期,作为‘cus_doc_prop_val’")返回path_file = os.path.abspath(文件名)base, ext = os.path.splitext(path_file)zip_filename = base + .zip"os.rename(path_file, zip_filename)main = "http://schemas.openxmlformats.org/spreadsheetml/2006/main";main_ns = "{%s}";% 主要的docPr = "http://schemas.openxmlformats.org/officeDocument/2006/custom-properties";docPr_ns = "{%s}";% 文档docPr_type = http://schemas.openxmlformats.org/officeDocument/2006/docPropsVTypes"docPr_type_ns = "{%s}";% docPr_type #i4、r8、文件时间、bool、lpwstrdocPr_rel_type = http://schemas.openxmlformats.org/officeDocument/2006/relationships/custom-properties"docPr_content_type = application/vnd.openxmlformats-officedocument.custom-properties+xml"r_ns = "{http://schemas.openxmlformats.org/officeDocument/2006/relationships}";cusPr_type = http://schemas.openxmlformats.org/officeDocument/2006/relationships/customProperty"xml_declaration = '<?xml version="1.0";编码=UTF-8"独立=是"?>'base_xml = '{dec}

How do I preserve custom properties from xlsx template which I am modifying with openpyxl? When I save() workbook using openpyxl these custom properties vanish!

解决方案

I am posting a pure python solution to reading and writing Workbook.CustomDocumentProperties just because I am currently also feeling the pain of not having this in openpyxl, and I needed a quick workaround for a personal automation project.

In fact, I will try to implement this feature (and hopefully later Worksheet.CustomProperties) into openpyxl myself if I can get my head around how to do all the plumbing the library needs: https://foss.heptapod.net/openpyxl/openpyxl/-/issues/1003

Update: I pushed my contribution and it should be accepted and merged shortly :) https://foss.heptapod.net/openpyxl/openpyxl/-/merge_requests/384

So for now, here is a workaround, converting the .xlsx to .zip, then reading and writing the .xml files in the zip directly, and then renaming to .xlsx at the end.

To read Workbook.CustomDocumentProperties you can do this - only very slightly modified from this great answer: https://stackoverflow.com/a/46919795/9792594

from lxml import etree as ET
import zipfile

def get_custom_doc_properties(filename):
    path_file = os.path.abspath(filename)
    base, ext = os.path.splitext(path_file)
    zip_filename = base + ".zip"
    os.rename(path_file, zip_filename)

    main_ns = "{http://schemas.openxmlformats.org/spreadsheetml/2006/main}"
    docPr_ns = "{http://schemas.openxmlformats.org/officeDocument/2006/custom-properties}"
    docPr_type = "{http://schemas.openxmlformats.org/officeDocument/2006/docPropsVTypes}" #i4, r8, filetime, bool, lpwstr
    r_ns = "{http://schemas.openxmlformats.org/officeDocument/2006/relationships}"
    cusPr_type = "http://schemas.openxmlformats.org/officeDocument/2006/relationships/customProperty"

    with zipfile.ZipFile(zip_filename) as zip:
        props = zip.open('docProps/custom.xml')
        text = props.read()
        xml = ET.fromstring(text)
        workbook_props = {}
        for child in XML:
           if (child.tag == f"{docPr_ns}property"):
                for cusPr in child:
                    workbook_props[child.attrib['name']] = cusPr.text
    return workbook_props

#call like this:
get_custom_doc_properties(f'./example.xlsx')

And to add one prop to a document which already has custom doc props (and therefore already has a 'docProps/custom.xml' file), is pretty easy and we just append one more custom property to the xml.

(However, if the document had no current custom doc props, then we need to generate the 'docProps/custom.xml' file from scratch, as well as add a content override and a relationship - see code comments):

import os
from lxml import etree as ET
import zipfile
import shutil
import datetime
from tempfile import NamedTemporaryFile

def set_workbook_custom_document_properties(filename, cus_doc_prop_name, cus_doc_prop_val):

    if not isinstance(cus_doc_prop_name, str):
        print("you must supply a string as the 'cus_doc_prop_name'")
        return

    if isinstance(cus_doc_prop_val, str):
        docPr_type_suffix = "lpwstr"
        cus_doc_prop_str = cus_doc_prop_val
    elif isinstance(cus_doc_prop_val, int):
        docPr_type_suffix = "i4"
        cus_doc_prop_str = str(cus_doc_prop_val)
    elif isinstance(cus_doc_prop_val, float):
        docPr_type_suffix = "r8"
        cus_doc_prop_str = str(cus_doc_prop_val)
    elif isinstance(cus_doc_prop_val, bool):
        docPr_type_suffix = "bool"
        cus_doc_prop_str = str(cus_doc_prop_val)
    elif isinstance(cus_doc_prop_val, datetime.datetime):
        docPr_type_suffix = "filetime"
        cus_doc_prop_str = cus_doc_prop_val.strftime("%Y-%m-%dT%H:%M:%SZ")
    else:
        print("you must supply a string, int, float, bool, or date, as the 'cus_doc_prop_val'")
        return

    path_file = os.path.abspath(filename)
    base, ext = os.path.splitext(path_file)
    zip_filename = base + ".zip"
    os.rename(path_file, zip_filename)

    main = "http://schemas.openxmlformats.org/spreadsheetml/2006/main"
    main_ns = "{%s}" % main
    docPr = "http://schemas.openxmlformats.org/officeDocument/2006/custom-properties"
    docPr_ns = "{%s}" % docPr
    docPr_type = "http://schemas.openxmlformats.org/officeDocument/2006/docPropsVTypes"
    docPr_type_ns = "{%s}" % docPr_type #i4, r8, filetime, bool, lpwstr
    docPr_rel_type = "http://schemas.openxmlformats.org/officeDocument/2006/relationships/custom-properties"
    docPr_content_type = "application/vnd.openxmlformats-officedocument.custom-properties+xml"
    r_ns = "{http://schemas.openxmlformats.org/officeDocument/2006/relationships}"
    cusPr_type = "http://schemas.openxmlformats.org/officeDocument/2006/relationships/customProperty"
    xml_declaration = '<?xml version="1.0" encoding="UTF-8" standalone="yes"?>'
    base_xml = '{dec}<Properties xmlns="{docPr}" xmlns:vt="{docPr_type}"></Properties>'.format(dec=xml_declaration, docPr=docPr, docPr_type=docPr_type).encode('utf-8')

    with NamedTemporaryFile() as tmp_file:
        tmpname = os.path.basename(tmp_file.name)
        with zipfile.ZipFile(zip_filename, 'r') as zip_in:
            with zipfile.ZipFile(tmpname, 'w') as zip_out:
                zip_out.comment = zip_in.comment # preserve the comment
                custom_present = 'docProps/custom.xml' in zip_in.namelist()
                for item in zip_in.infolist():
                    if item.filename == 'docProps/custom.xml':
                        custom_xml = ET.fromstring(zip_in.read(item.filename))
                    elif custom_present == False and item.filename == '_rels/.rels':
                        rels_xml = ET.fromstring(zip_in.read(item.filename))
                    elif custom_present == False and item.filename == '[Content_Types].xml':
                        content_types_xml = ET.fromstring(zip_in.read(item.filename))
                    else:
                        zip_out.writestr(item, zip_in.read(item.filename))

                if custom_present:
                    # if custom.xml is already present we just need to append:
                    max_pid = 1
                    for node in custom_xml:
                        max_pid = max(int(node.attrib['pid']), max_pid)
                else:
                    # if custom.xml is not present, we need to create it
                    # and also to add an override to [Content_Types].xml
                    # and also to add a relationship to _rels/.rels
                    custom_xml = ET.parse(BytesIO(base_xml)).getroot()
                    max_pid = 1
                    child_override = ET.SubElement(content_types_xml, "Override")
                    child_override.attrib['ContentType'] = docPr_content_type
                    child_override.attrib['PartName'] = '/docProps/custom.xml'
                    zip_out.writestr('[Content_Types].xml', ET.tostring(content_types_xml))
                    max_rid = 0
                    for node in rels_xml:
                        max_rid = max(int(node.attrib['Id'].replace("rId", "")), max_rid)
                    child_rel = ET.SubElement(rels_xml, "Relationship")
                    child_rel.attrib['Type'] = docPr_rel_type
                    child_rel.attrib['Target'] = 'docProps/custom.xml'
                    child_rel.attrib['Id'] = "rID" + str(max_rid + 1)
                    zip_out.writestr('_rels/.rels', ET.tostring(rels_xml))

                child = ET.SubElement(custom_xml, "property")
                child.attrib['name'] = cus_doc_prop_name
                child.attrib['pid'] = str(max_pid + 1)
                child.attrib['fmtid'] = "{D5CDD505-2E9C-101B-9397-08002B2CF9AE}"
                val = ET.SubElement(child, f"{docPr_type_ns}{docPr_type_suffix}")
                val.text = cus_doc_prop_str
                print(ET.tostring(custom_xml, pretty_print=True))
                zip_out.writestr('docProps/custom.xml', ET.tostring(custom_xml))
                zip_out.close()
            zip_in.close()
        shutil.copyfile(tmpname, zip_filename)
    os.rename(zip_filename, path_file)

#call it like this:
set_workbook_custom_document_properties(f'./example.xlsx', "testDocProp7", 2.5)

这篇关于openpyxl - 如何保留 xlsx 自定义属性的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持!

07-30 19:54