Extracting parts of different content types from an MHT file into multiple MHT files
Question
I am writing a script that parses an MHT file, extracts the individual message parts from the parent message, and writes each of them to a separate MHT file.
I wrote the function below. It opens the MHT file at file_location, searches for the part with the given content_id, and writes that part to a new MHT file:
import mimetypes
import os
from email import message_from_binary_file

def extract_content(self, file_location, content_id, extension):
    # Build the output name: <input base name>-<content id base name><extension>
    first_part = file_location.split(extension)[0]
    new_file = first_part + "-" + content_id.split('.')[0] + extension
    # Remove the output file if it already exists
    if os.path.exists(new_file):
        os.remove(new_file)
    with open(file_location, 'rb') as mime_file, open(new_file, 'w') as output:
        # Extract the message from the MHT file (opened in binary mode, so use the binary parser)
        message = message_from_binary_file(mime_file)
        # Guessed MIME type of the input file (not used further in this excerpt)
        t = mimetypes.guess_type(file_location)[0]
        # Walk through every part of the message
        for i, part in enumerate(message.walk()):
            # Check whether this part has the Content-ID we are looking for
            if part['Content-ID'] == '<' + content_id + '>':
                # Write the contents
                output.write(part.as_string(unixfrom=False))
However, I cannot open the extracted parts in IE when their content type is application/pdf or application/octet-stream.
Thanks.
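Before extracting anything, it can help to check how each part in the file is labelled, since the content type and transfer encoding decide how a viewer interprets the output. The following is a minimal sketch, assuming Python 3 and the standard email package. The helper name list_parts and the two-part sample message are hypothetical and only stand in for a real MHT file.

```python
from email import message_from_bytes

def list_parts(message):
    """Return (content_id, content_type, transfer_encoding) for each leaf part."""
    rows = []
    for part in message.walk():
        if part.is_multipart():
            continue  # skip multipart containers, keep only leaf parts
        rows.append((
            part['Content-ID'],
            part.get_content_type(),
            part['Content-Transfer-Encoding'],
        ))
    return rows

# Hypothetical two-part message standing in for a real MHT file
sample = (
    b"MIME-Version: 1.0\r\n"
    b'Content-Type: multipart/related; boundary="BOUND"\r\n'
    b"\r\n"
    b"--BOUND\r\n"
    b"Content-Type: text/html\r\n"
    b"Content-ID: <page.html>\r\n"
    b"Content-Transfer-Encoding: quoted-printable\r\n"
    b"\r\n"
    b"<html>hi</html>\r\n"
    b"--BOUND\r\n"
    b"Content-Type: application/pdf\r\n"
    b"Content-ID: <doc.pdf>\r\n"
    b"Content-Transfer-Encoding: base64\r\n"
    b"\r\n"
    b"JVBERi0xLjQK\r\n"
    b"--BOUND--\r\n"
)

parts = list_parts(message_from_bytes(sample))
for row in parts:
    print(row)
```

For a real file, the message would come from message_from_binary_file on a file opened in 'rb' mode instead of the inline sample.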
Recommended answer
Try this:
...
# Text parts are labelled quoted-printable, everything else base64
if m['Content-type'].startswith('text/'):
    m["Content-Transfer-Encoding"] = "quoted-printable"
else:
    m["Content-Transfer-Encoding"] = "base64"
m.set_payload(part.get_payload())
# Writing to output
info = part.as_string(unixfrom=False)
info = info.replace('application/octet-stream', 'text/plain')
output.write(info)
...
Let me know if it works.
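The idea of giving each extracted part an explicit transfer encoding can be sketched in a self-contained form. This is a variation rather than the answer's exact code: it decodes the payload and re-encodes it instead of only setting the header, and it does not relabel application/octet-stream as text/plain. It assumes Python 3 and the standard email package. The helper name rebuild_part and the sample message are hypothetical, and whether IE opens the result has not been verified here.

```python
from email import encoders, message_from_bytes
from email.mime.base import MIMEBase

def rebuild_part(part):
    """Copy one MIME part into a new standalone message with an explicit encoding."""
    maintype, subtype = part.get_content_type().split('/', 1)
    new_msg = MIMEBase(maintype, subtype)
    # decode=True undoes the original transfer encoding and returns raw bytes
    new_msg.set_payload(part.get_payload(decode=True))
    if maintype == 'text':
        encoders.encode_quopri(new_msg)   # sets Content-Transfer-Encoding: quoted-printable
    else:
        encoders.encode_base64(new_msg)   # sets Content-Transfer-Encoding: base64
    if part['Content-ID'] is not None:
        new_msg['Content-ID'] = part['Content-ID']
    return new_msg

# Hypothetical message with one PDF part, standing in for a real MHT file
sample = (
    b"MIME-Version: 1.0\r\n"
    b'Content-Type: multipart/related; boundary="BOUND"\r\n'
    b"\r\n"
    b"--BOUND\r\n"
    b"Content-Type: application/pdf\r\n"
    b"Content-ID: <doc.pdf>\r\n"
    b"Content-Transfer-Encoding: base64\r\n"
    b"\r\n"
    b"JVBERi0xLjQK\r\n"
    b"--BOUND--\r\n"
)

msg = message_from_bytes(sample)
pdf_part = next(p for p in msg.walk() if p['Content-ID'] == '<doc.pdf>')
rebuilt = rebuild_part(pdf_part)

# Serialize to bytes and parse again to check that the payload survives
round_trip = message_from_bytes(rebuilt.as_bytes())
```

When writing the result to disk, opening the output file in 'wb' mode and writing rebuilt.as_bytes() avoids passing binary content through a text-mode file.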