


What is the use case of encode/decode?


My understanding was that encode is used to convert a string into a byte string so that non-ASCII data can be passed around the program, and that decode converts this byte string back into a string.


But the following example shows non-ASCII characters being printed successfully even when they are not encoded/decoded. Example:

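val1 = "À È Ì Ò Ù Ỳ Ǹ Ẁ"
val2 = val1
print('val1 is: ', val2)

# str -> bytes (UTF-8 is also the default when no encoding is given)
encoded_val1 = val1.encode('utf-8')
print('encoded_val1 is: ', encoded_val1)

# bytes -> str, using the same encoding
decoded_encoded_val1 = encoded_val1.decode('utf-8')
print('decoded_encoded_val1 is: ', decoded_encoded_val1)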



So what is the use case of encode and decode in Python?



The environment you are working in may support those characters, and your terminal (or whatever you use to view the output) may be able to display them. Some terminals, command lines or text editors may not. Apart from display issues, here are some actual reasons and examples:


1- When you transfer data over the internet or a network (e.g. with a socket), the information is transferred as raw bytes. In UTF-8, non-ASCII characters cannot be represented by a single byte, so they need a multi-byte representation (UTF-8 uses two to four bytes for them, UTF-16 uses two or four). This is the most common reason I have encountered.


2- Some text editors only support UTF-8. For example, you need to represent your Ẁ character in UTF-8 format in order to work with them. The reason is that, when dealing with text, people mostly used ASCII characters, which take just one byte each. When systems needed to handle non-ASCII characters as well, people converted them to UTF-8. Someone with more in-depth knowledge of text editors may be able to explain this point better.


3- You may have a text containing Unicode characters, such as Chinese or Russian letters, and for some reason want to store it on your remote Linux server, but the server does not support letters from those languages. You need to convert your text to a well-defined format (UTF-8 or UTF-16) before storing it on the server so that you can recover it later.
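As a rough sketch of points 1 and 3, the snippet below encodes the text into bytes (the form a socket or a binary file would accept), decodes it back, and shows what happens when the target only understands ASCII. The variable names `payload` and `received` are made up for illustration, and no real network or server is involved.

```python
text = "À È Ì Ò Ù Ỳ Ǹ Ẁ"

# Encoding turns the str into bytes, which is what a socket or a
# binary file actually carries.
payload = text.encode("utf-8")

# The accented letters are not ASCII, so the byte count is larger
# than the character count.
print("characters:", len(text))
print("bytes:     ", len(payload))

# The receiving side has to decode with the same encoding to get
# the original text back.
received = payload.decode("utf-8")
print("round trip ok:", received == text)

# A target that only understands ASCII cannot represent the text at all.
try:
    text.encode("ascii")
    ascii_ok = True
except UnicodeEncodeError as exc:
    ascii_ok = False
    print("cannot encode as ASCII:", exc.reason)
```

Printing `val1` directly works in the question only because the terminal happens to handle these characters; the encode/decode step matters as soon as the text has to leave the program as bytes.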


Here is a little explanation of UTF-8 format. There are also other articles about the topic if you are interested.
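To give a feel for the format, here is a small sketch showing that UTF-8 is a variable-width encoding: an ASCII character takes one byte, while other characters take two, three or four. The helper name `utf8_bytes` and the sample characters are chosen only for illustration.

```python
def utf8_bytes(ch):
    """Return the UTF-8 encoding of a single character."""
    return ch.encode("utf-8")

# Escapes are used so the code points are unambiguous:
# "A", "À" (U+00C0), "Ẁ" (U+1E80) and an emoji (U+1F600).
samples = ["A", "\u00c0", "\u1e80", "\U0001f600"]

for ch in samples:
    encoded = utf8_bytes(ch)
    print(f"U+{ord(ch):04X} -> {len(encoded)} byte(s): {encoded.hex()}")
```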

