Python 3 itertools.islice continue despite UnicodeDecodeError

Problem description

I have a python 3 program that monitors a log file. The log includes, among other things, chat messages written by users. The log is created by a third party application which I cannot change.

Today a user wrote a message containing characters outside ASCII (logged as the bytes "\xed\xa0\xbd\xed\xb1\x8c"), and it caused the program to crash with the following error:

future: <Task finished coro=<updateConsoleLog() done, defined at /usr/local/src/bserver/logmonitor.py:48> exception=UnicodeDecodeError('utf-8',...
say "\xed\xa0\xbd\xed\xb1\x8c"\r\n', 7623, 7624, 'invalid continuation byte')>
Traceback (most recent call last):
File "/usr/lib/python3.4/asyncio/tasks.py", line 238, in _step
result = next(coro)
File "/usr/local/src/bserver/logmonitor.py", line 50, in updateConsoleLog
server_events = self.console.getUpdate()
File "/usr/local/src/bserver/console.py", line 79, in getUpdate
return self.read()
File "/usr/local/src/bserver/console.py", line 90, in read
for line in itertools.islice(log_file, log_no, None):
File "/usr/lib/python3.4/codecs.py", line 319, in decode
(result, consumed) = self._buffer_decode(data, self.errors, final)
UnicodeDecodeError: 'utf-8' codec can't decode byte 0xed in position 7623: invalid continuation byte
ERROR:asyncio:Task exception was never retrieved

Using 'file -i log.file' I determined that the log file is us-ascii. This shouldn't be an issue, as ASCII is a subset of UTF-8 (as far as I know).

Since this happens rarely and I don't mind losing whatever it is that this user typed, is it possible for me to ignore this line or the particular characters that can't be decoded and just keep on reading the rest of the file?

I considered using try: ... except UnicodeDecodeError as ..., but this would mean I can't read anything in the log file after the error.

Code

import itertools

def read(self):
    log_no = self.last_log_no
    starting_log_no = log_no
    server_events = []
    # Open in text mode; the file is closed when the block exits.
    with open(self.path, 'r') as log_file:
        # ERROR: UnicodeDecodeError is raised here while iterating
        for line in itertools.islice(log_file, log_no, None):
            server_events.append(line)
            print(line.replace('\n', '').replace('\r', ''))

            log_no += 1
            self.last_log_no = log_no
    if starting_log_no < log_no:
        return server_events
    return False

Any help or advice would be appreciated!

bits

Recommended answer
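
One possible approach, offered as a sketch under assumptions rather than as the answer from the original thread: pass an `errors` argument to `open()` so the decoder substitutes bad bytes instead of raising, which lets `itertools.islice` keep iterating over the rest of the file. The names `read_new_lines` and `demo`, and the temporary log contents, are hypothetical and exist only for illustration. The byte sequence is the one shown in the traceback.

```python
import itertools
import os
import tempfile

# Bytes copied from the traceback in the question. A strict UTF-8
# decoder rejects them, which is what raises UnicodeDecodeError.
BAD_BYTES = b'\xed\xa0\xbd\xed\xb1\x8c'


def read_new_lines(path, start_line):
    """Return all lines from start_line onward (hypothetical helper).

    errors='replace' makes the decoder emit U+FFFD for bytes it cannot
    decode instead of raising, so iteration is never interrupted.
    """
    with open(path, 'r', encoding='utf-8', errors='replace') as log_file:
        return list(itertools.islice(log_file, start_line, None))


def demo():
    """Write a small made-up log containing the bad bytes and read it back."""
    with tempfile.NamedTemporaryFile('wb', suffix='.log', delete=False) as tmp:
        tmp.write(b'first line\r\n')
        tmp.write(b'say "' + BAD_BYTES + b'"\r\n')
        tmp.write(b'last line\r\n')
        path = tmp.name
    try:
        # Skip the first line, as the question's code does with log_no.
        return read_new_lines(path, 1)
    finally:
        os.remove(path)


if __name__ == '__main__':
    for line in demo():
        print(repr(line))
```

If the undecodable bytes should be dropped rather than replaced, `errors='ignore'` is the other standard handler that fits here. Setting the handler at `open()` time also avoids having to resume iteration after an exception, which was the concern with wrapping the loop in try/except.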