Problem description
When I run the code below in two different projects, I get different outputs.
String myString = "Türkçe Karakter Testi : ğüşiöçĞÜİŞÇÖĞ";
String value = new String(myString.getBytes("UTF-8"));
System.out.println(value);
The first project is a non-Maven Java application created in NetBeans 8.2. It gives me the following result, which is what I expect:
"Türkçe Karakter Testi : ğüşiöçĞÜİŞÇÖĞ"
The second project is a Maven Java application, created in the same way, with the following pom.xml file:
<?xml version="1.0" encoding="UTF-8"?>
<project xmlns="http://maven.apache.org/POM/4.0.0" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd">
<modelVersion>4.0.0</modelVersion>
<groupId>com.mycompany</groupId>
<artifactId>mavenproject1</artifactId>
<version>1.0-SNAPSHOT</version>
<packaging>jar</packaging>
<properties>
<project.build.sourceEncoding>UTF-8</project.build.sourceEncoding>
<maven.compiler.source>1.8</maven.compiler.source>
<maven.compiler.target>1.8</maven.compiler.target>
</properties>
</project>
This project gives me:
"Türkçe Karakter Testi : ğüşiöçÄ?ÜİÅ?ÇÖÄ?"
I checked both files with Notepad++, and both are encoded as UTF-8.
Recommended answer
You're missing the encoding in your new String() constructor, so it uses your platform's default encoding, which isn't UTF-8 (it looks like some variant of ISO-8859-1).
If you use the following code (which doesn't make much sense in itself, but shows that the default encoding was botching things), you'll see that it prints properly everywhere.
String myString = "Türkçe Karakter Testi : ğüşiöçĞÜİŞÇÖĞ";
String value = new String(myString.getBytes("UTF-8"), "UTF-8");
System.out.println(value);
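To see the failure mode without depending on any machine's default, the wrong charset can be passed explicitly. This is only a sketch: the class name MojibakeDemo and the method roundTrip are made up for illustration, and ISO-8859-1 stands in for whatever the platform default actually was in the Maven project. The string literal uses Unicode escapes so the result does not depend on how the source file itself is encoded.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class MojibakeDemo {
    // Encodes the text as UTF-8, then decodes those bytes with the given charset.
    public static String roundTrip(String text, Charset decodeWith) {
        byte[] utf8Bytes = text.getBytes(StandardCharsets.UTF_8);
        return new String(utf8Bytes, decodeWith);
    }

    public static void main(String[] args) {
        String original = "T\u00fcrk\u00e7e"; // "Türkçe"

        // The charset that new String(bytes) silently falls back to.
        System.out.println("Default charset: " + Charset.defaultCharset());

        // Decoding with the charset used for encoding restores the text.
        System.out.println(roundTrip(original, StandardCharsets.UTF_8).equals(original));

        // Decoding with a single-byte charset turns each two-byte character
        // into two unrelated characters, so the text no longer matches.
        System.out.println(roundTrip(original, StandardCharsets.ISO_8859_1).equals(original));
    }
}
```

Printing Charset.defaultCharset() in both projects is a quick way to check whether they really run with different defaults.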
What's the lesson here? Always specify the encoding when converting between bytes and characters! This includes methods and constructors such as String.getBytes(), new String() and new InputStreamReader().
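As a rough illustration of that advice, the three calls named above can all take an explicit charset. The class name ExplicitCharsets and the helper readFirstLine are hypothetical; the charset constants come from java.nio.charset.StandardCharsets (Java 7 and later).

```java
import java.io.BufferedReader;
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;

public class ExplicitCharsets {
    // Reads the first line of a stream, decoding its bytes as UTF-8
    // instead of relying on the platform default.
    public static String readFirstLine(InputStream in) {
        try (BufferedReader reader = new BufferedReader(
                new InputStreamReader(in, StandardCharsets.UTF_8))) {
            return reader.readLine();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        String text = "\u011f\u00fc\u015f"; // "ğüş", escaped to be independent of source encoding

        // String -> bytes with an explicit charset.
        byte[] bytes = text.getBytes(StandardCharsets.UTF_8);

        // bytes -> String with the same explicit charset.
        String decoded = new String(bytes, StandardCharsets.UTF_8);

        // Stream of bytes -> characters with an explicit charset.
        String line = readFirstLine(new ByteArrayInputStream(bytes));

        System.out.println(decoded.equals(text) && line.equals(text));
    }
}
```

Using the StandardCharsets constants rather than the "UTF-8" string also avoids the checked UnsupportedEncodingException that the string-based overloads declare.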
This is just one of the many ways that character encoding can bite you in the behind. It may seem like a simple problem, but it catches unsuspecting developers all the time.