Problem description
I need to read a text file into a String, which I ultimately have to pass as an input parameter (of type InputStream) to IFile.create (Eclipse). I have been looking for an example of how to do that but still cannot figure it out. I need your help!
Just for testing, I tried to convert the original text file to UTF-8 with this code:
FileInputStream fis = new FileInputStream(FilePath);
InputStreamReader isr = new InputStreamReader(fis);
Reader in = new BufferedReader(isr);
StringBuffer buffer = new StringBuffer();
int ch;
while ((ch = in.read()) > -1) {
buffer.append((char)ch);
}
in.close();
FileOutputStream fos = new FileOutputStream(FilePath+".test.txt");
Writer out = new OutputStreamWriter(fos, "UTF8");
out.write(buffer.toString());
out.close();
But even though the final *.test.txt file is UTF-8 encoded, the characters inside are corrupted.
Recommended answer
You need to specify the encoding of the InputStreamReader using the Charset parameter.
//                                    ↓ whatever the input's encoding is
Charset inputCharset = Charset.forName("ISO-8859-1");
InputStreamReader isr = new InputStreamReader(fis, inputCharset);
This also works:
InputStreamReader isr = new InputStreamReader(fis, "ISO-8859-1");
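
To tie this back to the question, here is a minimal sketch of the whole round trip: decode the file with its known charset, then re-encode the text as UTF-8, either into a new file or into an InputStream. The class and method names (ReencodeSketch, readAs, writeUtf8, asUtf8Stream) are made up for illustration, and the sketch assumes the file is small enough to read into memory in one go.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical helper class, not part of the original answer.
class ReencodeSketch {

    // Decode the file's bytes using the charset the file was actually written in.
    static String readAs(Path source, Charset inputCharset) throws IOException {
        return new String(Files.readAllBytes(source), inputCharset);
    }

    // Re-encode the decoded text as UTF-8 and write it to the target file.
    static void writeUtf8(Path target, String text) throws IOException {
        Files.write(target, text.getBytes(StandardCharsets.UTF_8));
    }

    // Expose the text as a stream of UTF-8 bytes for an API that expects an InputStream.
    static InputStream asUtf8Stream(String text) {
        return new ByteArrayInputStream(text.getBytes(StandardCharsets.UTF_8));
    }
}
```

Assuming the input really is ISO-8859-1, `ReencodeSketch.readAs(path, StandardCharsets.ISO_8859_1)` yields the String, and the stream returned by `asUtf8Stream` could then be handed to whatever consumes an InputStream, such as the IFile.create call mentioned in the question.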
See also:
- InputStreamReader(InputStream in, Charset cs)
- Charset.forName(String charsetName)
- Java: How to determine the correct charset encoding of a stream
- How to reliably guess the encoding between MacRoman, CP1252, Latin1, UTF-8, and ASCII
- GuessEncoding - only works for UTF-8, UTF-16LE, UTF-16BE, and UTF-32 ☹
- ICU Charset Detector
- cpdetector, free java codepage detection
- JCharDet (Java port of Mozilla charset detector)
SO search where I found all these links: https://stackoverflow.com/search?q=java+detect+encoding
You can get the default charset, which comes from the system the JVM is running on, at runtime via Charset.defaultCharset().
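
A quick way to see which charset that is on a given machine is sketched below. The class name DefaultCharsetSketch is hypothetical. The value matters here because the single-argument InputStreamReader constructor used in the question falls back to this default charset, which may not match the file's real encoding.

```java
import java.nio.charset.Charset;

// Hypothetical helper class for illustration only.
class DefaultCharsetSketch {

    // Returns the canonical name of the charset the JVM uses when none is specified.
    static String defaultCharsetName() {
        return Charset.defaultCharset().name();
    }
}
```

Printing `DefaultCharsetSketch.defaultCharsetName()` on the machine that produced the corrupted file shows which encoding the original reading code was implicitly using.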