c - C: low-level character handling: (Enter + newline) with fgetc

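I am working on a C project that reads a text file and converts it into an array of booleans.
First I read the file into a string of size n (an array of unsigned char), then I use a function to convert that string into a boolean array of size n * 8. That function works perfectly; there is no doubt about it.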
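The question does not show the conversion function, so the following is only a minimal sketch of what such a step might look like. The name `bytes_to_bits` and the most-significant-bit-first ordering are assumptions, not taken from the original code.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical helper (not from the question): expands n bytes into
   n * 8 booleans, most significant bit first.
   Returns a malloc'd array the caller must free, or NULL if allocation
   fails (for n == 0 the result may also be NULL). */
bool *bytes_to_bits(const unsigned char *data, size_t n)
{
    bool *bits = malloc(n * 8 * sizeof *bits);
    if (bits == NULL)
        return NULL;

    for (size_t i = 0; i < n; i++)
        for (int b = 0; b < 8; b++)
            bits[i * 8 + b] = (data[i] >> (7 - b)) & 1u;  /* bit 7 goes first */

    return bits;
}
```

With this ordering, the byte 0x3A (the colon) expands to 0, 0, 1, 1, 1, 0, 1, 0, which matches the binary column 0011 1010 used in the tables of this question.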

I use the following code to get the string from the file:

#include <stdio.h>
#include <stdlib.h>

unsigned char *Data_in;            // pointer to the string
int i;

FILE *sp = fopen("file.txt", "r"); // open file (text mode)

fseek(sp, 0, SEEK_END);            // move to the end of the file
long data_dim = ftell(sp);         // position = number of bytes from beginning to end
rewind(sp);                        // move back to the beginning of the file

Data_in = malloc(data_dim * sizeof(unsigned char)); // allocate memory for the string
int carac;                         // int, so that EOF can be told apart from a valid byte

for (i = 0; i < data_dim && (carac = fgetc(sp)) != EOF; i++) // stop at end of file
{
   Data_in[i] = (unsigned char)carac; // put char in its corresponding position
}

fclose(sp);                        // close file

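The problem shows up with a text file made with Notepad on Windows XP.
In it I have this 4-character string ":\n\nC" (colon, Enter, Enter, capital C).

This is how it looks in HxD (a hex editor): 3A 0D 0A 0D 0A 43.

This table makes it clearer: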

character             hex      decimal    binary
 :                    3A       58         0011 1010
 \n (enter+newline)   0D 0A    13 10      0000 1101 0000 1010
 \n (enter+newline)   0D 0A    13 10      0000 1101 0000 1010
 C                    43       67         0100 0011

Now I run the program, which prints that part in binary, and I get:

character      hex      decimal      binary
 :             3A         58         0011 1010
 (newline)     0A         10         0000 1010
 (newline)     0A         10         0000 1010
 C             43         67         0100 0011

Now that this has been shown, I have the following questions:

Is the reading correct?
If so, why is the 0D taken out?
How does this work?

Best answer

Make the fopen binary:

fopen("file.txt", "rb");
                    ^

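Otherwise, on Windows, the standard library's text mode translates each \r\n pair into a single \n, so the \r (0x0D) is simply eaten.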
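To illustrate what "eating the \r" amounts to, here is a rough sketch that applies the same kind of translation to a buffer by hand. It is not the C runtime's actual implementation; the name `collapse_crlf` is hypothetical, and leaving a lone \r untouched is an assumption about the behavior being imitated.

```c
#include <stddef.h>

/* Hypothetical helper: rewrites buf in place so that every "\r\n" pair
   becomes a single "\n". Returns the new length.
   A '\r' that is not followed by '\n' is kept as it is (an assumption). */
size_t collapse_crlf(unsigned char *buf, size_t n)
{
    size_t out = 0;
    for (size_t i = 0; i < n; i++) {
        if (buf[i] == '\r' && i + 1 < n && buf[i + 1] == '\n')
            continue;            /* drop the CR; the LF is copied on the next pass */
        buf[out++] = buf[i];
    }
    return out;
}
```

Applied to the six bytes 3A 0D 0A 0D 0A 43 from the question, this leaves the four bytes 3A 0A 0A 43, which is the output the asker observed.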

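As a side note, opening the file in binary mode also avoids another problem: in text mode on DOS/Windows, a 0x1A (Ctrl-Z) byte in the middle of the file is treated as end of file.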
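Putting the advice together, a minimal sketch of a whole-file read in binary mode might look like this. The function name `read_file_binary` and the growth strategy are assumptions for illustration; only `fopen`, `fgetc`, `ferror`, `fclose`, `malloc`, `realloc` and `free` from the standard library are used.

```c
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical helper: reads an entire file in binary mode.
   On success returns a malloc'd buffer (caller frees) and stores the
   number of bytes in *out_len; on failure returns NULL. */
unsigned char *read_file_binary(const char *path, size_t *out_len)
{
    FILE *fp = fopen(path, "rb");          /* "b": no newline translation */
    if (fp == NULL)
        return NULL;

    size_t cap = 64, len = 0;
    unsigned char *buf = malloc(cap);
    if (buf == NULL) { fclose(fp); return NULL; }

    int c;                                 /* int, so EOF differs from byte 0xFF */
    while ((c = fgetc(fp)) != EOF) {
        if (len == cap) {                  /* buffer full: double its capacity */
            unsigned char *tmp = realloc(buf, cap * 2);
            if (tmp == NULL) { free(buf); fclose(fp); return NULL; }
            buf = tmp;
            cap *= 2;
        }
        buf[len++] = (unsigned char)c;
    }

    int failed = ferror(fp);               /* distinguish read error from end of file */
    fclose(fp);
    if (failed) { free(buf); return NULL; }

    *out_len = len;
    return buf;
}
```

Growing the buffer while reading avoids relying on `fseek`/`ftell` for the size, and testing the `int` result of `fgetc` against `EOF` avoids the extra iteration that a `feof`-controlled loop performs. Read this way, bytes such as 0x0D and 0x1A should arrive unchanged.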