Problem description
I'm running doxygen (1.5.8) on a C# project from Visual Studio 2008 on a Windows XP machine. The generated LaTeX code contains some illegal sequences. It always involves the following sequence: "ï»¿" (a Latin i with a dieresis, something like the binary shift operator, and a Spanish opening question mark). I've seen it happen in the context "using {\bf System}", but maybe there are others.
The generated latex file reads
\begin{CompactItemize}
\item
ï»¿using {\bf System}
\end{CompactItemize}
While the source is simply:
using System;
using System.Collections.Generic;
using System.Linq;
Some strange Windows BOF character? It seems it's only before the using System; directive (the first of each file).
That's an ISO-8859-1 representation of the UTF-8 encoded character U+FEFF, the BYTE ORDER MARK (bytes EF BB BF). The BOM is intended for use as the first code point in UTF-16 files and should not be used in UTF-8 files, but unfortunately there are some very stupid tools that produce it by default. And if you create files by concatenating bits of text from other files, you can even end up with BOMs in the middle of your document.
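To make the byte-level explanation concrete, here is a small Python sketch (not part of the original answer) that encodes U+FEFF as UTF-8 and then misreads those bytes as ISO-8859-1. The variable names are only illustrative.

```python
import codecs

# U+FEFF (BYTE ORDER MARK) encoded as UTF-8 becomes three bytes.
bom_utf8 = "\ufeff".encode("utf-8")
print(bom_utf8.hex())  # ef bb bf

# Python ships the same constant, which confirms the byte values.
print(bom_utf8 == codecs.BOM_UTF8)  # True

# Reading those three bytes as ISO-8859-1 yields one character per byte,
# which is the three-character sequence seen in the generated LaTeX.
as_latin1 = bom_utf8.decode("iso-8859-1")
print(as_latin1)  # ï»¿
```

Each byte maps to a single ISO-8859-1 character (0xEF is ï, 0xBB is », 0xBF is ¿), which matches the description in the question.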
Find the editor that is saving files as "UTF-8 with BOM" and burn it.
ETA re updated question:
Check that source in a hex editor for a hidden faux-BOM before the ‘using’.
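If a hex editor is not handy, the same check can be scripted. The following is a rough Python sketch, assuming the source files are UTF-8; `find_boms` and `strip_boms` are hypothetical helper names, not part of doxygen or any library. It reports the byte offset of every BOM, including ones in the middle of the data left behind by concatenation, and returns a copy with them removed.

```python
import codecs

BOM = codecs.BOM_UTF8  # the three bytes EF BB BF


def find_boms(data: bytes) -> list:
    """Return the byte offsets of every UTF-8 BOM found in data."""
    offsets = []
    pos = data.find(BOM)
    while pos != -1:
        offsets.append(pos)
        pos = data.find(BOM, pos + len(BOM))
    return offsets


def strip_boms(data: bytes) -> bytes:
    """Return data with every UTF-8 BOM removed."""
    return data.replace(BOM, b"")


# Illustrative input: a BOM at the start and another one mid-file.
sample = BOM + b"using System;\n" + BOM + b"using System.Linq;\n"

print(find_boms(sample))   # [0, 17]
print(strip_boms(sample))  # b'using System;\nusing System.Linq;\n'
```

For a real file, one would read it with `open(path, "rb").read()` and write the stripped bytes back. In valid UTF-8 the byte sequence EF BB BF can only encode U+FEFF, so removing it should not damage other characters, but keeping a backup before rewriting sources is still sensible.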