Problem description
I am very new to Java and I have code like this:
public class Puzzle {
    public static void main(String... args) {
        System.out.println("Hi Guys!");
        // Character myChar = new Character('\u000d');
    }
}
As you can see, the line
    Character myChar = new Character('\u000d');
is commented out. Still, I get an error like this when I run javac:
Puzzle.java:9: error: unclosed character literal
// Character myChar = new Character('\u000d');
^
1 error
In this blog post I found the reason for the error. The blog says:
In our code, when the Java compiler encounters \u000d, it treats it as a newline and changes the code as below:
public class Puzzle {
    public static void main(String... args) {
        System.out.println("Hi Guys!");
        // Character myChar = new Character('
');
    }
}
I have two questions regarding this:
- Why does Java parse the unicode first? Are there any advantages to it?
- Because the line is still commented, Java is trying to parse it! Is this the only case it does? Or does it generally parse the commented lines too? I'm confused.
Thanks.
Recommended answer
- Why does Java parse the unicode first? Are there any advantages to it?
Yes, Unicode escape sequences are replaced first, before the compiler proceeds to lexical analysis.
So, for example, the following source code results in an error:
// String s = "\u000d";
But this is valid:
/*String s = "\u000d";*/
Because when \u000d is replaced with a new line, it will look like this:
/*String s="
";*/
Which is totally fine inside the multi-line comment /* */.
And the following code:
public static void main(String[] args) {
    // Comment.\u000d System.out.println("I will be printed out");
    // Comment.\u000a System.out.println("Me too.");
}
will print:
I will be printed out
Me too.
Because after the Unicode replacement, both System.out.println() statements will be outside of the comment sections.
To answer your question: the Unicode replacement has to happen at some point. One could argue whether it should happen before or after comments are removed. The choice was made to do it before.
The reasoning might be that a comment is just another lexical element, and before identifying and analyzing lexical elements you usually want to replace Unicode escape sequences.
See the following example:
/\u002f This is a comment line
If placed in a Java source file, it causes no compile error, because \u002f is translated to the character '/' and, together with the preceding '/', forms the start of a line comment //.
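The replacement step described above can be sketched in a few lines of Java. This is only an illustration and not how javac is implemented: the class name UnicodeEscapeSketch and the methods translate and hexValue are made up for this example, and the sketch ignores details of the real rule, such as a backslash that is itself escaped or several consecutive u characters.

```java
public class UnicodeEscapeSketch {

    // Hypothetical helper: reads four hex digits starting at 'start',
    // returns their value, or -1 if any of them is not a hex digit.
    static int hexValue(String s, int start) {
        int value = 0;
        for (int k = start; k < start + 4; k++) {
            int digit = Character.digit(s.charAt(k), 16);
            if (digit < 0) {
                return -1;
            }
            value = value * 16 + digit;
        }
        return value;
    }

    // Simplified translation: every backslash, 'u' and four hex digits
    // becomes the single character with that code. Nothing here knows
    // about comments or string literals; it works on raw characters.
    static String translate(String source) {
        StringBuilder out = new StringBuilder();
        int i = 0;
        while (i < source.length()) {
            if (source.startsWith("\\u", i) && i + 6 <= source.length()) {
                int value = hexValue(source, i + 2);
                if (value >= 0) {
                    out.append((char) value);
                    i += 6;
                    continue;
                }
            }
            out.append(source.charAt(i));
            i++;
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // The doubled backslash keeps the escape as plain text in this string.
        String line = "// Comment.\\u000a System.out.println(\"Me too.\");";
        // After translation the single input line has become two lines.
        for (String part : translate(line).split("\r\n|\r|\n")) {
            System.out.println("[" + part + "]");
        }
    }
}
```

Running main prints the comment and the println statement as two separate bracketed lines, which is why the statement ends up outside the comment.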
- Because the line is still commented, Java is trying to parse it! Is this the only case it does? Or does it generally parse the commented lines too? I'm confused.
The Java compiler does not analyze comments, but it still has to scan them to know where they end.
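To illustrate what "knowing where a comment ends" means, here is a minimal sketch, assuming the text has already gone through the Unicode replacement. The class LineCommentSkipSketch and the method endOfLineComment are hypothetical names for this example; a real lexer is more involved.

```java
public class LineCommentSkipSketch {

    // Returns the index where a line comment starting at 'start' ends:
    // the position of the first line terminator (CR or LF), or the end
    // of the text if there is none.
    static int endOfLineComment(String text, int start) {
        int i = start;
        while (i < text.length() && text.charAt(i) != '\n' && text.charAt(i) != '\r') {
            i++; // comment contents are skipped, not interpreted
        }
        return i;
    }

    public static void main(String[] args) {
        // The commented line from the question as it looks after translation:
        // a carriage return now sits between the two quote characters.
        String text = "// Character myChar = new Character('\r');";
        int end = endOfLineComment(text, 0);
        System.out.println("comment:   [" + text.substring(0, end) + "]");
        System.out.println("remainder: [" + text.substring(end + 1) + "]");
    }
}
```

The remainder is the text `');`, which is no longer part of any comment. The compiler has to treat it as code, and that is where the "unclosed character literal" error comes from.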