Reading a file as an IO String with UTF-8 in Haskell

Question

I have the following code, which works fine unless the file contains UTF-8 (non-ASCII) characters:

module Main where

import Ref  -- my own module; provides unlist and proc

main :: IO ()
main = do
    text <- getLine              -- name of the input file
    theInput <- readFile text
    writeFile ("a" ++ text) (unlist . proc . lines $ theInput)

With UTF-8 characters I get this error: hGetContents: invalid argument (invalid byte sequence)

Since the file I'm working with has UTF-8 characters, I would like to handle this exception so that, if possible, I can keep reusing the functions imported from Ref.

Is there a way to read a UTF-8 file as IO String so I can reuse the functions from Ref? What modifications should I make to my code? Thanks in advance.

Here are the declarations of the functions from my Ref module:

unlist :: [String] -> String
proc :: [String] -> [String]

From the Prelude:

lines :: String -> [String]

Recommended answer

Thanks for the answers, but I found the solution myself. It turns out the file I was working with actually has this encoding:

ISO-8859 text, with CR line terminators

So, to process that file with my Haskell code, it should have this encoding instead:

UTF-8 Unicode text, with CR line terminators
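As an alternative to converting the file, the encoding can be set explicitly on the handle before reading, so the contents still arrive as an ordinary String. The following is only a minimal sketch, assuming the input really is ISO-8859-1 (Latin-1). readFileWith and the sample file name are hypothetical names introduced for illustration; hSetEncoding, latin1 and utf8 come from System.IO in base.

```haskell
import System.IO

-- Hypothetical helper: read a whole file as a String, decoding it with
-- the given encoding instead of the locale default that readFile uses.
readFileWith :: TextEncoding -> FilePath -> IO String
readFileWith enc path = do
    h <- openFile path ReadMode
    hSetEncoding h enc
    contents <- hGetContents h
    -- Force the lazy contents before closing the handle.
    length contents `seq` hClose h
    return contents

main :: IO ()
main = do
    -- Write a small Latin-1 sample so the sketch is self-contained.
    -- '\233' is U+00E9 (e with acute accent), a single byte in Latin-1.
    out <- openFile "sample-latin1.txt" WriteMode
    hSetEncoding out latin1
    hPutStr out "caf\233\n"
    hClose out

    theInput <- readFileWith latin1 "sample-latin1.txt"
    print (length theInput)  -- prints 5
```

In the code from the question, readFile text would become readFileWith latin1 text (or readFileWith utf8 text for a file that really is UTF-8), and the result can be passed to lines, proc and unlist unchanged.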

You can check a file's encoding with the file utility, like this:

$ file filename
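A rough check can also be done from Haskell by reading the raw bytes and trying to decode them as UTF-8. This is only a sketch, not a replacement for file: it assumes the bytestring and text packages are available (both ship with GHC), isValidUtf8File and the sample file names are hypothetical, and a True result only means the bytes are valid UTF-8 (plain ASCII also passes).

```haskell
import qualified Data.ByteString as B
import Data.Text.Encoding (decodeUtf8')

-- Hypothetical helper: True if the file's bytes decode as UTF-8.
isValidUtf8File :: FilePath -> IO Bool
isValidUtf8File path = do
    bytes <- B.readFile path
    return (either (const False) (const True) (decodeUtf8' bytes))

main :: IO ()
main = do
    -- "c" followed by e-acute, encoded as UTF-8 (two bytes for the accent).
    B.writeFile "sample-utf8.bin" (B.pack [0x63, 0xC3, 0xA9])
    -- The same text in Latin-1: the lone 0xE9 byte is not valid UTF-8.
    B.writeFile "sample-latin1.bin" (B.pack [0x63, 0xE9])

    isValidUtf8File "sample-utf8.bin"   >>= print  -- True
    isValidUtf8File "sample-latin1.bin" >>= print  -- False
```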

To change the file's encoding, follow the instructions from this link!
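The conversion can also be sketched in Haskell itself with System.IO, by reading through a handle set to the source encoding and writing through a handle set to the target encoding. This is an illustration under assumptions, not the method from the linked instructions: convertEncoding and the file names are hypothetical, and the source is assumed to be Latin-1. The source and destination paths must differ.

```haskell
import System.IO

-- Hypothetical helper: copy src to dst, decoding with `from`
-- and re-encoding with `to`.
convertEncoding :: TextEncoding -> TextEncoding -> FilePath -> FilePath -> IO ()
convertEncoding from to src dst =
    withFile src ReadMode $ \hin ->
    withFile dst WriteMode $ \hout -> do
        hSetEncoding hin from
        hSetEncoding hout to
        -- hPutStr consumes the lazy input fully before the handles close.
        hGetContents hin >>= hPutStr hout

main :: IO ()
main = do
    -- Create a small Latin-1 input file ('\233' is U+00E9).
    withFile "in-latin1.txt" WriteMode $ \h -> do
        hSetEncoding h latin1
        hPutStr h "caf\233\n"

    convertEncoding latin1 utf8 "in-latin1.txt" "out-utf8.txt"

    -- Read the result back as UTF-8 and compare with the original text.
    withFile "out-utf8.txt" ReadMode $ \h -> do
        hSetEncoding h utf8
        s <- hGetContents h
        print (s == "caf\233\n")  -- True
```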
