Problem description
What is the use case of encode/decode?
My understanding was that encode converts a string into a byte string so that non-ASCII data can be passed around the program, and that decode converts this byte string back into a string.
But the following example shows non-ASCII characters being printed successfully even when they are not encoded/decoded. Example:
val1 = "À È Ì Ò Ù Ỳ Ǹ Ẁ"
val2 = val1
print('val1 is: ', val2)

# str -> bytes (UTF-8 is the default codec in Python 3)
encoded_val1 = val1.encode()
print('encoded_val1 is: ', encoded_val1)

# bytes -> str
decoded_encoded_val1 = encoded_val1.decode()
print('decoded_encoded_val1 is: ', decoded_encoded_val1)
Output:
So what is the use case of encode and decode in Python?
Recommended answer
The environment you are working in may support those characters, and your terminal (or whatever you use to view the output) may be able to display them. Some terminals, command lines, or text editors may not. Display issues aside, here are some practical reasons and examples:
1- When you transfer data over the internet or a network (e.g. with a socket), the information travels as raw bytes, so text has to be encoded first. A non-ASCII character does not fit in a single ASCII byte, so it needs a special representation (UTF-8 or UTF-16, which use more than one byte for such characters). This is the most common reason I have encountered.
2- Some text editors only support UTF-8. For example, you need to represent your Ẁ character in UTF-8 in order to work with them. The reason is that text processing historically dealt mostly with ASCII characters, which are just one byte each. When systems needed to handle non-ASCII characters, people converted them to UTF-8. Someone with more in-depth knowledge of text editors may be able to explain this point better.
3- You may have text containing Unicode characters, such as Chinese or Russian letters, that you need to store on a remote Linux server for some reason, but the server does not support letters from those languages. You need to convert the text to a well-defined format (UTF-8 or UTF-16) before storing it on the server so that you can recover it later.
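Points 1 and 3 can be illustrated with a minimal sketch. It assumes Python 3 and uses io.BytesIO purely as a stand-in for a socket or a file opened in binary mode; the variable names (payload, channel, restored, garbled) are illustrative and do not come from the question.

```python
import io

text = "À È Ì Ò Ù Ỳ Ǹ Ẁ"

# Sockets and binary files accept bytes, not str, so the text is encoded first.
payload = text.encode("utf-8")

# io.BytesIO is only a stand-in here for a socket or a binary file.
channel = io.BytesIO()
channel.write(payload)

# The receiving side gets raw bytes and must decode them with the same codec.
received = channel.getvalue()
restored = received.decode("utf-8")

print(type(payload).__name__)      # the encoded value is a bytes object
print(len(text), len(payload))     # characters vs. bytes: the counts differ
print(restored == text)            # the round trip recovers the original text

# Decoding with a different codec raises no error here, but the text is garbled.
garbled = received.decode("latin-1")
print(garbled == text)
```

Printing worked in the question because the string never left the Python process as raw bytes that someone else had to interpret; the codec only matters once the text crosses a byte boundary such as a socket, a file, or a terminal.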
Here is a little explanation of the UTF-8 format. There are also other articles about the topic if you are interested.
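As a rough illustration of the variable-width nature of UTF-8, the sketch below (assuming Python 3) counts how many bytes a few sample characters occupy. The samples are chosen for illustration only; the emoji in particular does not appear in the question. Escape sequences are used so the exact code points are unambiguous.

```python
# Sample characters: ASCII "A", "À" (U+00C0), "Ẁ" (U+1E80), and an emoji (U+1F600).
samples = ["A", "\u00c0", "\u1e80", "\U0001f600"]

# Map each character to the number of bytes its UTF-8 encoding needs.
widths = {ch: len(ch.encode("utf-8")) for ch in samples}

for ch, width in widths.items():
    print(f"U+{ord(ch):04X} takes {width} byte(s) in UTF-8")

# ASCII cannot represent non-ASCII characters at all, so encoding fails.
try:
    "\u00c0".encode("ascii")
    ascii_ok = True
except UnicodeEncodeError:
    ascii_ok = False

print("ASCII can encode U+00C0:", ascii_ok)
```

ASCII characters keep their single-byte form, which is why UTF-8 stays compatible with tools that only expect ASCII.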