Problem description
I have a file coming from Oracle Fusion with the name Hyderabad - Telangana.
When the file arrived on the server, the hyphen had turned into the special character sequence â€“.
We run a lookup on this value, and it fails because of the special characters.
I downloaded the document to a local drive, and there I can see the hyphen properly.
I looked for a solution, and most sources say this is caused by an encoding issue.
How can I find the encoding of a file on Unix?
Recommended answer
Because it was not a normal hyphen but an EN DASH, Unicode U+2013. When encoded in UTF-8 it becomes the three bytes "\xe2\x80\x93". The first byte, 0xE2, is the code of 'â' in cp1252, which is what put me on that track.
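The byte-level explanation above can be reproduced in a few lines. This is only an illustrative sketch: it assumes the server decoded the UTF-8 bytes as cp1252, which matches the symptom in the question but is not confirmed by it.

```python
# Encode EN DASH (U+2013) as UTF-8, then decode the bytes with the wrong
# charset (cp1252 is an assumption about what the server did).
en_dash = "\u2013"
utf8_bytes = en_dash.encode("utf-8")
print(utf8_bytes)  # b'\xe2\x80\x93'

garbled = utf8_bytes.decode("cp1252")
print(garbled)  # three characters instead of one dash

# Show the code point of each character in the garbled text.
print([f"U+{ord(ch):04X}" for ch in garbled])  # ['U+00E2', 'U+20AC', 'U+201C']
```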
Interestingly enough, the other two bytes are also defined in the cp1252 charset, which is common on Western European language versions of Windows. They are, respectively:
Byte   Character in cp1252   Unicode code point   Name
0x80   €                     U+20AC               EURO SIGN
0x93   “                     U+201C               LEFT DOUBLE QUOTATION MARK
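If the garbled value has already reached the lookup, one possible workaround is to reverse the wrong decoding before comparing. The sketch below is not part of the original answer: the helper names repair_mojibake and lookup_key are hypothetical, and it assumes the damage really is UTF-8 read as cp1252 and that the lookup table stores a plain hyphen.

```python
def repair_mojibake(text: str) -> str:
    """Undo a UTF-8-read-as-cp1252 mix-up; return the input unchanged if it does not apply."""
    try:
        # Re-encode to the original bytes, then decode them as UTF-8.
        return text.encode("cp1252").decode("utf-8")
    except UnicodeError:
        # Text was not garbled this way (or contains characters outside cp1252).
        return text


def lookup_key(text: str) -> str:
    """Hypothetical normalisation: repair the text, then map EN DASH to a plain hyphen."""
    return repair_mojibake(text).replace("\u2013", "-")


garbled = "Hyderabad \u00e2\u20ac\u201c Telangana"  # the value as seen on the server
print(lookup_key(garbled))  # Hyderabad - Telangana
```

Fixing the charset at the point where the file is read is the cleaner solution; repairing afterwards is a fallback for when that step cannot be changed.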