Why is the same content decoded differently depending on the mail client?

Problem description

My code checks a mailbox and forwards every mail to another user.
However, I found that the same content arrives encoded differently depending on the mail client (that is, depending on whether it was sent from account@gmail.com, from account@naver.com, and so on).

For example, this is what I typed:
subject: subject
content: this is content

For mail client 1:
358 2020-04-22 18:12:23,249: run: DEBUG: subject has come as: =?utf-8?B?c3ViamVjdA==?=
359 2020-04-22 18:12:23,249: run: DEBUG: content has come as: dGhpcyBpcyBjb250ZW50Cg==

For mail client 2:
178 2020-04-22 18:12:09,636: run: DEBUG: subject has come as: =?utf-8?B?c3ViamVjdA==?=
179 2020-04-22 18:12:09,636: run: DEBUG: content has come as: dGhpcyBpcyBjb250ZW50Cg==

For mail client 3:
300 2020-04-22 18:12:16,494: run: DEBUG: subject has come as: subject
301 2020-04-22 18:12:16,494: run: DEBUG: content has come as: this is content

Clients 1 and 2 give the same result: both the subject and the content arrive Base64-encoded.
Client 3 is different: both arrive as plain text.

Here is a sample of my code, which uses imaplib:

import email

# self.mail is an imaplib connection; num is the message number to fetch.
typ, rfc = self.mail.fetch(num, '(RFC822)')
raw_email = rfc[0][1]
raw_email_to_utf8 = raw_email.decode('utf-8')
msg = email.message_from_string(raw_email_to_utf8)
content = msg.get_payload()  # This value is printed in the debug log above.

Because of this, some mails are forwarded with weird contents. (The subjects are re-encoded correctly.)

Why is there this difference, and how can I get the content when it arrives encoded differently?

Recommended answer

Something is encoding the message when it does not need to. That is unnecessary, but not prohibited.

RFC 2047 encoding is sometimes necessary, but it is always legal (because permitting it everywhere was simpler than writing precise rules). You have to detect RFC 2047 encoding and decode it when it is present. If a word starts with =?, ends with ?=, and contains exactly two more question marks in between, then it is RFC 2047-encoded. Most languages have libraries or functions for decoding it; search for "rfc2047".
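In Python, the standard library already covers this. The sketch below is one possible way to do it, not the only one: email.header.decode_header handles RFC 2047 encoded-words in headers and passes plain text through unchanged. Note that the body is a separate case: it is encoded according to the Content-Transfer-Encoding header (Base64 here) rather than RFC 2047, and get_payload(decode=True) undoes that. The helper names decode_subject and decode_body and the sample message are made up for illustration, and the fallback to UTF-8 when no charset is declared is an assumption.

```python
import email
from email.header import decode_header, make_header


def decode_subject(raw_subject):
    """Decode RFC 2047 encoded-words; plain headers pass through unchanged."""
    return str(make_header(decode_header(raw_subject)))


def decode_body(msg):
    """Return the text of the first text/plain part, transfer encoding removed."""
    part = msg
    if msg.is_multipart():
        # Hypothetical policy: take the first text/plain part, if any.
        part = next(
            (p for p in msg.walk() if p.get_content_type() == 'text/plain'),
            None,
        )
        if part is None:
            return ''
    payload = part.get_payload(decode=True)  # bytes; base64/quoted-printable undone
    charset = part.get_content_charset() or 'utf-8'  # assumed fallback
    return payload.decode(charset, errors='replace')


# Illustrative message shaped like the ones from mail clients 1 and 2.
raw = (
    b"Subject: =?utf-8?B?c3ViamVjdA==?=\r\n"
    b"Content-Type: text/plain; charset=utf-8\r\n"
    b"Content-Transfer-Encoding: base64\r\n"
    b"\r\n"
    b"dGhpcyBpcyBjb250ZW50Cg==\r\n"
)
msg = email.message_from_bytes(raw)
subject = decode_subject(msg['Subject'])
body = decode_body(msg)
print(subject)       # subject
print(body.strip())  # this is content
```

Parsing with email.message_from_bytes also avoids decoding the whole raw message as UTF-8 first, which can fail when a mail uses another charset.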
