How to open a Windows-1255 encoded file in Node.js?

Question

I have a file in Windows-1255 (Hebrew) encoding, and I'd like to be able to access it in Node.js.

I tried opening the file with fs.readFile, but it gives me a Buffer that I can't do anything with. I tried setting the encoding to Windows-1255, but that wasn't recognized.

I also checked out the windows-1255 package, but I couldn't decode with it, because fs.readFile gives either a Buffer or a UTF-8 string, and the package requires a 1255-encoded string.

How can I read a Windows-1255-encoded file in Node.js?

Recommended answer

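Using the node-iconv package seems to be the best way. Unfortunately iconv-lite, which is easier to include in your code, did not seem to implement transcoding for CP1255 when this answer was written. More recent iconv-lite releases list the Windows-125x family among their supported encodings, so it is worth checking the current documentation before ruling it out.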

This thread & answer shows a simple example and concisely demonstrates how to use both of these modules.

Returning to iconv: I had some problems installing it on Debian with an npm prefix, and I submitted an issue to the maintainer here. I managed to work around the problem by running the install with sudo and then using "sudo chown" to give ownership of the installed module back to my user.

I have tested various win-xxxx encodings and code pages that I have access to (Western and Eastern European samples).

However, I could not make it work with CP1255, although it is listed among the supported encodings, because I do not have that specific code page installed locally and the text comes out mangled. I tried copying some Hebrew text from this page, but the pasted version was always corrupted. I did not dare install the language on my Windows machine for fear of bricking it, sorry.

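// sample.js
var Iconv = require('iconv').Iconv; // converter class from the node-iconv package
var fs = require('fs');

// Convert a Buffer of CP1255 bytes into a JavaScript string.
function decode(content) {
  // //TRANSLIT approximates, and //IGNORE drops, characters that
  // cannot be represented in the target encoding.
  var iconv = new Iconv('CP1255', 'UTF-8//TRANSLIT//IGNORE');
  var buffer = iconv.convert(content);
  return buffer.toString('utf8');
}

// Without an encoding argument, readFileSync returns a raw Buffer.
console.log(decode(fs.readFileSync('sample.txt')));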
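As an alternative that avoids a native module, the same read-then-decode flow can be sketched with the WHATWG TextDecoder that Node.js exposes as a global. This is not part of the original answer. It assumes a Node.js build with full ICU data (the default in official builds of recent versions), since the windows-1255 label is otherwise rejected. The helper name readWindows1255 and the temporary sample file are made up for illustration.

```javascript
const fs = require('fs');
const os = require('os');
const path = require('path');

// Hypothetical helper (not from the original answer): read a file as raw
// bytes and decode them as Windows-1255.
function readWindows1255(filePath) {
  const bytes = fs.readFileSync(filePath); // no encoding given, so this is a Buffer
  // Assumption: this Node.js build ships full ICU; otherwise the constructor
  // throws a RangeError for the 'windows-1255' label.
  const decoder = new TextDecoder('windows-1255');
  return decoder.decode(bytes);
}

// Demo: write four Windows-1255 bytes (the Hebrew word "shalom") to a
// temporary file, then read them back as a JavaScript string.
const samplePath = path.join(os.tmpdir(), 'sample-1255.txt');
fs.writeFileSync(samplePath, Buffer.from([0xf9, 0xec, 0xe5, 0xed]));
const text = readWindows1255(samplePath);
console.log(text);
```

The result is an ordinary JavaScript string, so it can be written back out as UTF-8 with fs.writeFileSync(otherPath, text, 'utf8') if a converted copy of the file is needed.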


Extra (off-topic) notes on dealing with file encodings and on reading files through Node.js Buffers:

fs.readFile returns a Buffer by default.

// Pass an encoding as the optional second argument to get a string back.
// `file` is the path of the file to read.
fs.readFile(file, {encoding: 'utf8'}, function(error, string) {
    if (error) throw error;
    console.log('decoded string:', string); // bytes were decoded as UTF-8
});

OR

// Omit the encoding to receive the raw Buffer and work on the bytes directly.
// `file` is the path of the file to read.
fs.readFile(file, function(error, buffer) {
    if (error) throw error;
    console.log(buffer.toString('utf8')); // decode the bytes explicitly
});
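The encoding option of fs.readFile only accepts the encodings built into Buffer, which is why "Windows-1255" was not recognized in the question. The following minimal sketch (an illustration, not part of the original answer) checks this with Buffer.isEncoding and shows what happens when Windows-1255 bytes are decoded as UTF-8 anyway. The four sample bytes are an assumption chosen for the demo.

```javascript
// Encoding names that fs.readFile and Buffer#toString understand.
console.log(Buffer.isEncoding('utf8'));         // true
console.log(Buffer.isEncoding('latin1'));       // true
console.log(Buffer.isEncoding('windows-1255')); // false

// Four bytes that spell a Hebrew word in Windows-1255 (sample data for the demo).
const hebrewBytes = Buffer.from([0xf9, 0xec, 0xe5, 0xed]);

// Decoding them as UTF-8 does not fail. The invalid byte sequences are
// silently replaced with U+FFFD, so the original text is lost.
const misdecoded = hebrewBytes.toString('utf8');
const hasReplacement = misdecoded.includes('\ufffd');
console.log(hasReplacement); // true
```

For this reason the file should be read without an encoding, so the bytes stay intact in a Buffer, and then passed to a converter that knows Windows-1255.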
