Problem description
I want to convert an HTML file encoded in ANSI to UTF-8 using R.
Is there a tool, or a combination of tools, that can do this?
Thanks.
Edit: OK, I've narrowed my problem down to a different one. It is re-posted here: Using "cat" to write non-English characters into a .html file (in R)
Recommended answer
You can use iconv:
writeLines(iconv(readLines("tmp.html"), from = "ANSI_X3.4-1986", to = "UTF8"), "tmp2.html")
tmp2.html should then be encoded in UTF-8.
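The following is a minimal, self-contained sketch of the same iconv round trip, not part of the original answer. It assumes the "ANSI" file is really Windows-1252 (the usual meaning of "ANSI" on Western European Windows systems); "ANSI_X3.4-1986" is generally an alias for plain US-ASCII, so with that source encoding iconv may return NA for lines containing bytes above 127. The temporary file names and the sample content are made up for illustration.

```r
# Assumption: the input is Windows-1252; change `from` if your file differs.
src <- tempfile(fileext = ".html")  # hypothetical input file
dst <- tempfile(fileext = ".html")  # hypothetical output file

# Build a small CP1252 sample: "café" with é stored as the single byte 0xE9.
sample_bytes <- c(charToRaw("<p>caf"), as.raw(0xE9), charToRaw("</p>\n"))
writeBin(sample_bytes, src)

# Read the lines as they are; no conversion happens at this point.
lines_raw <- readLines(src, warn = FALSE)

# Convert from the assumed source encoding to UTF-8.
lines_utf8 <- iconv(lines_raw, from = "CP1252", to = "UTF-8")

# Lines that could not be converted come back as NA, so check before writing.
stopifnot(!anyNA(lines_utf8), all(validUTF8(lines_utf8)))

# Write the UTF-8 bytes unchanged, so no second conversion happens on output.
con <- file(dst, open = "wb")
writeLines(lines_utf8, con, useBytes = TRUE)
close(con)

# Read the result back as raw bytes for inspection.
out_bytes <- readBin(dst, what = "raw", n = file.size(dst))
```

Checking for NA before writing is the main design choice here: a wrong `from` value otherwise silently produces lines containing the text "NA" in the output file.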
Edit by Henrik in June 2015:
A working solution for Windows, distilled from the comments, is as follows:
writeLines(iconv(readLines("tmp.html"), from = "ANSI_X3.4-1986", to = "UTF8"),
file("tmp2.html", encoding="UTF-8"))
Update 2021: If ANSI is the encoding of the current locale, the following works as well (passing from = "" tells iconv to use the current locale's encoding as the source):
writeLines(iconv(readLines("tmp.html"), from = "", to = "UTF8"),
file("tmp2.html", encoding="UTF-8"))
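Whether from = "" is the right choice depends on what the current locale's encoding actually is. As a rough check, and only as a sketch, base R's l10n_info() and Sys.getlocale() can be used to inspect it before relying on the native encoding; the variable names below are illustrative.

```r
# Inspect the native encoding before using from = "" in iconv().
info <- l10n_info()
native_is_utf8 <- isTRUE(info[["UTF-8"]])

if (native_is_utf8) {
  # In a UTF-8 session, from = "" means UTF-8, not an ANSI code page.
  message("Native encoding is UTF-8; from = \"\" would not treat the input as ANSI.")
} else {
  # On Windows this typically names an ANSI code page such as 1252.
  message("Native locale (LC_CTYPE): ", Sys.getlocale("LC_CTYPE"))
}
```

If the session turns out to be UTF-8, naming the source encoding explicitly in `from` is the safer option.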