




I'm currently creating a little mail client and am facing a problem with charsets. I use Indy's TIdIMAP4 component to retrieve data from the mail server. When I retrieve mail bodies, accented letters like ä and ü are converted to =E4 and =FC respectively, since the message uses the charset ISO-8859-1.

How can I make the server send me the data in another charset, like UTF-8? What would be the best solution for this problem?

Thanks in advance!


It is not the charset that produces strings like =E4 and =FC; it is the Content-Transfer-Encoding. $E4 and $FC are the byte values of ä and ü in ISO-8859-1, but they are 8-bit values. Email is still largely a 7-bit environment. Unless client and server negotiate 8-bit transfers during their communication, octets above $7F have to be encoded in a 7-bit compatible manner to pass safely through email gateways, especially the legacy ones that still exist. quoted-printable is a commonly used 7-bit encoding for textual content in email. base64 is another one, but it is not human-readable, so it tends to be used for binary data rather than text (though it can be used for text).
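
To make the distinction between charset and transfer encoding concrete, here is a minimal sketch in Python (used only for illustration; it is not Indy code, and the sample string is made up). It undoes the quoted-printable layer first, which yields the raw 8-bit bytes, and only then applies the ISO-8859-1 charset to get text:

```python
import quopri

# A made-up body line as it might appear on the wire (7-bit safe).
raw = b"M=E4dchen und T=FCr"

# Step 1: undo the Content-Transfer-Encoding (quoted-printable).
# The result is the original 8-bit byte sequence.
decoded_bytes = quopri.decodestring(raw)

# Step 2: interpret those bytes using the charset from Content-Type.
text = decoded_bytes.decode("iso-8859-1")

print(decoded_bytes)
print(text)
```

After step 1 the bytes $E4 and $FC are back in place; the charset only comes into play in step 2.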

In any case, you cannot make the server deliver the email data to you in another encoding. The server merely delivers the original email data as-is, exactly as the sender submitted it. If you want the data in UTF-8, you have to re-encode it yourself after downloading it. Indy will handle the decoding for you.
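
The "decode after download, then re-encode" idea can be sketched as follows. This is again Python rather than Delphi, and the raw message is a hypothetical example, so treat it as an illustration of the flow and not of how Indy does it internally. The standard `email` package removes the transfer encoding and applies the declared charset, after which the text can be encoded as UTF-8:

```python
from email import message_from_bytes
from email.policy import default

# Hypothetical message exactly as a server might hand it over.
raw_message = (
    b"Content-Type: text/plain; charset=ISO-8859-1\r\n"
    b"Content-Transfer-Encoding: quoted-printable\r\n"
    b"\r\n"
    b"Gr=FC=DFe aus K=F6ln\r\n"
)

msg = message_from_bytes(raw_message, policy=default)

# get_content() undoes quoted-printable and decodes with the
# charset named in the Content-Type header, returning a str.
body_text = msg.get_content()

# Re-encode on the client side if UTF-8 bytes are needed.
utf8_bytes = body_text.encode("utf-8")

print(body_text.strip())
```

The server is never asked for a different encoding here; the conversion happens entirely on the client once the message has arrived.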


08-19 10:47