Problem description
How can I convert the string "the surveyÂ’s rules" to UTF-8 in Scala?
I tried these approaches, but neither of them works:
scala> val text = "the surveyÂ’s rules"
text: String = the surveyÂ’s rules
scala> scala.io.Source.fromBytes(text.getBytes(), "UTF-8").mkString
res17: String = the surveyÂ’s rules
scala> new String(text.getBytes(),"UTF8")
res21: String = the surveyÂ’s rules
OK, I solved it this way. It is not a conversion, just a different way of reading the file:
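import java.io.File
import java.nio.charset.CodingErrorAction
import scala.io.{Codec, Source}
// Implicit codec used by Source.fromFile: decode as US-ASCII and silently drop any bytes that cannot be decoded
implicit val codec = Codec("US-ASCII").onMalformedInput(CodingErrorAction.IGNORE).onUnmappableCharacter(CodingErrorAction.IGNORE)
val src = Source.fromFile(new File (folderDestination + name + ".csv"))
val src2 = Source.fromFile(new File (folderDestination + name + ".csv"))
val reader = CSVReader.open(src.reader())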
Recommended answer
Note that when you call text.getBytes() without arguments, you are in fact getting an array of bytes representing the string in your platform's default encoding. On Windows, for example, it could be some single-byte encoding; on Linux it may already be UTF-8.
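To make the effect of the charset visible, here is a minimal sketch that encodes the same string with two explicit charsets and prints the platform default. The object name DefaultCharsetDemo and the sample text (with a proper U+2019 apostrophe) are assumptions chosen for illustration, not part of the original question.

```scala
import java.nio.charset.{Charset, StandardCharsets}

object DefaultCharsetDemo {
  // Hypothetical sample text: U+2019 (right single quotation mark) is outside ISO-8859-1.
  val sample: String = "the survey\u2019s rules"

  // Explicit charsets give the same bytes on every platform.
  val utf8Bytes: Array[Byte] = sample.getBytes(StandardCharsets.UTF_8)
  val latin1Bytes: Array[Byte] = sample.getBytes(StandardCharsets.ISO_8859_1)

  def main(args: Array[String]): Unit = {
    // This is the charset that the no-argument getBytes() would use.
    println(s"Platform default charset: ${Charset.defaultCharset()}")
    // U+2019 takes three bytes in UTF-8.
    println(s"UTF-8 byte count: ${utf8Bytes.length}")
    // ISO-8859-1 cannot represent U+2019, so getBytes substitutes '?'.
    println(s"ISO-8859-1 byte count: ${latin1Bytes.length}")
  }
}
```

The byte counts differ because the apostrophe is encoded as three bytes in UTF-8 but replaced by a single '?' byte in ISO-8859-1, which is the kind of silent difference a platform default can introduce.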
To be correct, you need to specify the exact encoding in the getBytes() call. For Java 7 and later, do this:
import java.nio.charset.StandardCharsets
val bytes = text.getBytes(StandardCharsets.UTF_8)
For Java 6, do this:
import java.nio.charset.Charset
val bytes = text.getBytes(Charset.forName("UTF-8"))
Then bytes will contain the UTF-8-encoded text.
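As a rough sketch of how one might check that a byte array really is valid UTF-8, the snippet below encodes a string explicitly and then decodes it with a strict decoder that reports errors instead of replacing bad input. The object name Utf8Check and the helpers encode and decodeStrict are hypothetical names introduced here; only the java.nio.charset calls come from the standard library.

```scala
import java.nio.ByteBuffer
import java.nio.charset.{CodingErrorAction, StandardCharsets}
import scala.util.Try

object Utf8Check {
  // Encode with an explicit charset, independent of the platform default.
  def encode(s: String): Array[Byte] = s.getBytes(StandardCharsets.UTF_8)

  // Decode strictly: malformed or unmappable input raises an exception,
  // which Try turns into None, instead of being silently replaced.
  def decodeStrict(bytes: Array[Byte]): Option[String] = Try {
    StandardCharsets.UTF_8
      .newDecoder()
      .onMalformedInput(CodingErrorAction.REPORT)
      .onUnmappableCharacter(CodingErrorAction.REPORT)
      .decode(ByteBuffer.wrap(bytes))
      .toString
  }.toOption

  def main(args: Array[String]): Unit = {
    val text = "the survey\u2019s rules" // hypothetical sample text
    println(decodeStrict(encode(text)))  // valid UTF-8 decodes to Some(...)
    // 0xC3 must be followed by a continuation byte; 0x28 is not one.
    println(decodeStrict(Array(0xC3.toByte, 0x28.toByte)))
  }
}
```

Note that encoding a String and decoding the result with the same charset returns an equal String, so a round trip like this cannot repair text that was already decoded with the wrong charset when it was read.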