node.js readFile error with utf8 encoded file on Windows

Problem description

I'm trying to load a UTF8 json file from disk using node.js (0.10.29) on Windows 8.1. The following is the code that runs:

var http = require('http');
var utils = require('util');
var path = require('path');
var fs = require('fs');

var myconfig;
fs.readFile('./myconfig.json', 'utf8', function (err, data) {
    if (err) {
        console.log("ERROR: Configuration load - " + err);
        throw err;
    } else {
        try {
            myconfig = JSON.parse(data);
            console.log("Configuration loaded successfully");
        }
        catch (ex) {
            // Log the parse exception itself; err is null in this branch.
            console.log("ERROR: Configuration parse - " + ex);
        }
    }
});

I get the following error when this is run:

SyntaxError: Unexpected token ´╗┐
    at Object.parse (native)
    ...

Now, when I change the file encoding (using Notepad++) to ANSI, it works without a problem.

Any ideas why this is the case? While development is being done on Windows, the final solution will be deployed to a variety of non-Windows servers, so I'm worried that I'll run into issues on the server end if I deploy an ANSI file to Linux, for example.

According to my searches here and via Google the code should work on Windows as I am specifically telling it to expect a UTF-8 file.

Sample config I am reading:

{
    "ListenIP4": "10.10.1.1",
    "ListenPort": 8080
}

Recommended answer

Per "fs.readFileSync(filename, 'utf8') doesn't strip BOM markers #1918", fs.readFile is working as designed: the BOM is not stripped from the start of a UTF-8 file, if it exists. It is at the discretion of the developer to handle this.
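To illustrate the behaviour described above, here is a minimal sketch that reproduces the failure without touching the disk. It assumes a Node.js version where Buffer.from is available (newer than the 0.10.29 used in the question), and the sample bytes and variable names are made up for the demo.

```javascript
// Hypothetical demo data: the bytes of a UTF-8 file that starts with a BOM (EF BB BF).
var bom = Buffer.from([0xEF, 0xBB, 0xBF]);
var fileBytes = Buffer.concat([bom, Buffer.from('{"ListenPort": 8080}', 'utf8')]);

// Decoding the bytes as UTF-8 keeps the BOM as the first character, U+FEFF.
var text = fileBytes.toString('utf8');
var firstCharIsBom = text.charCodeAt(0) === 0xFEFF;

// U+FEFF is not valid JSON whitespace, so JSON.parse throws a SyntaxError.
var parseError = null;
try {
    JSON.parse(text);
} catch (ex) {
    parseError = ex;
}

console.log(firstCharIsBom, parseError instanceof SyntaxError);
```
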

Possible workarounds:

data = data.replace(/^\uFEFF/, ''); per https://github.com/joyent/node/issues/1918#issuecomment-2480359
Transform the incoming stream to remove the BOM header with the NPM module bomstrip, per https://github.com/joyent/node/issues/1918#issuecomment-38491548

What you are getting is the byte order mark (BOM) at the start of the UTF-8 file. When JSON.parse sees this, it gives a syntax error (read: "unexpected character" error). You must strip the byte order mark from the file contents before passing them to JSON.parse:

fs.readFile('./myconfig.json', 'utf8', function (err, data) {
    if (err) throw err;
    // With the 'utf8' argument, data is already a string, not a Buffer.
    // Strip a leading byte order mark (U+FEFF) before parsing.
    myconfig = JSON.parse(data.replace(/^\uFEFF/, ''));
});
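If the same stripping is needed in more than one place, it could be wrapped in a small helper. The sketch below is only one way to do it: stripBom and parseJsonText are hypothetical names, not part of Node.js or of the answer above. It accepts either a string (an encoding was passed to fs.readFile) or a Buffer (no encoding was passed).

```javascript
// Hypothetical helper: remove a single leading U+FEFF if present,
// otherwise return the text unchanged.
function stripBom(text) {
    return text.charCodeAt(0) === 0xFEFF ? text.slice(1) : text;
}

// Hypothetical helper: accept a Buffer or a string, decode if needed,
// strip the BOM, then parse as JSON.
function parseJsonText(input) {
    var text = Buffer.isBuffer(input) ? input.toString('utf8') : String(input);
    return JSON.parse(stripBom(text));
}

// The same sample config as in the question, with and without a leading BOM.
var withBom = parseJsonText('\uFEFF{"ListenIP4": "10.10.1.1", "ListenPort": 8080}');
var withoutBom = parseJsonText('{"ListenIP4": "10.10.1.1", "ListenPort": 8080}');

console.log(withBom.ListenPort, withoutBom.ListenPort);
```

Inside the fs.readFile callback this would be used as myconfig = parseJsonText(data);. Because the file stays UTF-8 and the BOM is handled in code, the same file should behave the same way on Windows and on non-Windows servers.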
