Problem description
Hello,
I'm looking for a program that converts characters of different
encodings (such as EUC-JP, Big5, GB-18030, etc.) into HTML ampersand
escape sequences. Does anybody know where I can find one?
thx.
Recommended answer
IIRC Tidy will do that.
http://tidy.sf.net/
--
David Dorward <http://blog.dorward.me.uk/> <http://dorward.me.uk/>
Home is where the ~/.bashrc is
IIRC Tidy will do that.
Well, yes, but only for character encodings it supports (and it does not
support any of the encodings SwordAngel listed to that extent). Windows
users can compile Tidy with an experimental feature that enables support
for all character encodings that Windows / Internet Explorer support via
the TIDY_WIN32_MLANG_SUPPORT #define, but it is generally better to use
external tools such as iconv, piconv, uconv, recode, ... to convert the
document to UTF-8 and let Tidy process the document accordingly.
--
Björn Höhrmann · mailto:bj****@hoehrmann.de · http://bjoern.hoehrmann.de
Weinh. Str. 22 · Telefon: +49(0)621/4309674 · http://www.bjoernsworld.de
68309 Mannheim · PGP Pub. KeyID: 0xA4357E78 · http://www.websitedev.de/
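
For readers without iconv or recode at hand, the "convert the document to UTF-8 first" step described above can be sketched in a few lines of Python. This is only an illustration, not what any of those tools do internally; the function name `transcode_to_utf8` is made up for the example, and the codec names (`euc_jp`, `big5`, `gb18030`) are the ones Python's standard library uses.

```python
def transcode_to_utf8(data: bytes, source_encoding: str) -> bytes:
    """Decode bytes from source_encoding and re-encode them as UTF-8.

    Hypothetical helper for illustration only. Raises UnicodeDecodeError
    if the input is not valid in the stated source encoding.
    """
    return data.decode(source_encoding).encode("utf-8")


if __name__ == "__main__":
    # Build a small EUC-JP sample in memory instead of reading a real file.
    sample = "<p>日本語</p>".encode("euc_jp")
    utf8_bytes = transcode_to_utf8(sample, "euc_jp")
    print(utf8_bytes.decode("utf-8"))
```

Keep in mind that any charset declaration inside the document (for example a meta tag naming EUC-JP) would still need to be updated by hand or by a later tool.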
"free recode" ? http://recode.progiciels-bpi.ca/
Call it with something like:
recode -d euc-jp..h4 < input.html > output.html
That won't do anything to tidy up the HTML, though, unlike Tidy ;-)
And don't forget that when you've translated language-specific
encodings into Han-unified Unicode characters, you should mark up
the source with the correct language attribute in order to get
the right rendering of the unified characters. At least that's my
understanding (I can't actually read them myself).
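
If recode is not available, roughly the same conversion to ampersand escape sequences can be approximated in Python. This is a minimal sketch under a few assumptions: the function name `to_char_refs` is invented for the example, the source encoding is known in advance, and decimal numeric references (`&#NNNNN;`) are acceptable; it does not reproduce recode's named-entity output.

```python
def to_char_refs(data: bytes, source_encoding: str) -> str:
    """Return ASCII-only text with non-ASCII characters as &#N; references.

    Hypothetical helper for illustration only. ASCII characters, including
    markup such as < > and &, pass through unchanged, so existing HTML tags
    survive the conversion.
    """
    text = data.decode(source_encoding)
    # "xmlcharrefreplace" swaps each unencodable character for &#N;
    return text.encode("ascii", errors="xmlcharrefreplace").decode("ascii")


if __name__ == "__main__":
    # Build a small EUC-JP sample in memory instead of reading a real file.
    sample = '<p lang="ja">日本</p>'.encode("euc_jp")
    print(to_char_refs(sample, "euc_jp"))
    # prints: <p lang="ja">&#26085;&#26412;</p>
```

The `lang="ja"` attribute in the sample follows the advice above about marking up the language; the script does not add it for you.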