Question
I have a ruby file with only these two lines:
# encoding: utf-8
puts "—"
When I run it with ruby test_enc.rb, it fails:
test_enc.rb:2: invalid multibyte char (UTF-8)
test_enc.rb:2: unterminated string meets end of file
I don't know how to properly specify the character code of — (emdash), but vim tells me it is 151, Hex 97, Octal 227. It fails the same way with other characters like ã as well, so I doubt it is related specifically to that character.
I am running on Windows XP and the version of ruby I'm using is:
ruby 1.9.1p430 (2010-08-16 revision 28998) [i386-mingw32]
I feel like there is something very obvious I am missing here. Any ideas?
EDIT: Learned a valuable lesson about assumptions today - specifically assuming your editor IS using UTF-8 without actually checking it. Oops!
Thanks for the quick and accurate replies all!
EDIT AGAIN: The 'setting up vim properly for utf-8' grew too big and wasn't really relevant to this question, so it is now a separate question.
Recommended answer
Given that Ruby is explicitly calling your attention to UTF-8, I strongly suspect that you haven't actually written out a UTF-8 file to start with. Make sure that Vim (or whatever text editor you're using to create the file) is really set to write out UTF-8.
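One way to check that suspicion without trusting the editor is to ask Ruby whether the raw bytes are valid UTF-8. This is only a rough sketch: valid_utf8? is a made-up helper name, and the byte values are built by hand here so the snippet runs on its own. On the real file you would pass in File.binread("test_enc.rb") instead.

```ruby
# Hypothetical helper (not part of Ruby): true if the bytes decode as UTF-8.
def valid_utf8?(bytes)
  # Relabel a copy of the raw bytes as UTF-8, then ask Ruby to validate them.
  bytes.dup.force_encoding("UTF-8").valid_encoding?
end

single_byte = [0x97].pack("C*")             # the lone byte Vim reported (151 / hex 97)
utf8_dash   = [0xE2, 0x80, 0x94].pack("C*") # U+2014 EM DASH encoded as UTF-8

puts valid_utf8?(single_byte)  # false: 0x97 alone is a stray continuation byte
puts valid_utf8?(utf8_dash)    # true
```

If the check on the file's contents comes back false, the file was not saved as UTF-8, whatever the magic comment on line 1 claims.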
Note that in UTF-8, any non-ASCII character will be represented by multiple bytes, not a single byte as you've described from the Vim diagnostics. I'd recommend using a binary file editor (or dump, or whatever) to really show what's in the text file though. Something that doesn't already have some preconceived notion of the encoding - something that isn't even trying to think of it as a text file.
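If no hex editor is handy, Ruby itself can dump the bytes. The sketch below uses a made-up helper, hex_bytes, and compares how the two characters from the question look in UTF-8 and in Windows-1252. The Windows-1252 part is an assumption about what the editor wrote. The question only says the byte is 0x97, which happens to be the em dash in that code page.

```ruby
# Hypothetical helper (not part of Ruby): raw bytes as space-separated hex.
def hex_bytes(data)
  data.unpack("C*").map { |b| format("%02X", b) }.join(" ")
end

# On the real file: puts hex_bytes(File.binread("test_enc.rb"))
dash    = "\u2014"  # em dash
a_tilde = "\u00E3"  # a with tilde

puts hex_bytes(dash.encode("UTF-8"))            # E2 80 94
puts hex_bytes(dash.encode("Windows-1252"))     # 97
puts hex_bytes(a_tilde.encode("UTF-8"))         # C3 A3
puts hex_bytes(a_tilde.encode("Windows-1252"))  # E3
```

A single 97 between the quotes in the dump of test_enc.rb would match the one-byte value Vim reported, and would mean the file is not UTF-8.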
Notepad lets you write out a file in UTF-8, so you might want to try that just to see what happens. (I don't have Ruby installed myself, otherwise I'd try it for you.)