This post covers how to handle file encoding with numpy loadtxt.

Problem description

I am trying to load data with numpy.loadtxt. The file I'm trying to read uses cp1252 encoding. Is it possible to tell numpy to read the file as cp1252?

The following code

import numpy as np
n = 10
myfile = '/path/to/myfile'
mydata = np.loadtxt(myfile, skiprows = n)

gives:

UnicodeDecodeError: 'utf-8' codec can't decode byte 0xf6 in position 189: invalid start byte

The file contains metadata (the first n rows) followed by a table of floats.

The problem only occurs when I run this on Ubuntu (12.04). On Windows it works fine. That is why I think the problem is related to the encoding.
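
A quick check that fits this suspicion, as a minimal sketch: the byte 0xf6 from the traceback is a valid character in cp1252 but cannot start a character in UTF-8. This would match a setup where Windows defaults to cp1252 and Ubuntu defaults to UTF-8, though that is an assumption about the two machines, not something stated in the question.

```python
raw = b"\xf6"  # the byte reported in the traceback

# In cp1252, 0xf6 is the single character "ö".
as_cp1252 = raw.decode("cp1252")

# UTF-8 does not allow 0xf6 as the first byte of a character.
try:
    raw.decode("utf-8")
    utf8_ok = True
except UnicodeDecodeError:
    utf8_ok = False

print(repr(as_cp1252), utf8_ok)
```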

Edit 2: Opening the file as follows also works:

import codecs
data = codecs.open(myfile, encoding='cp1252')
datalines = data.readlines()

However, I'd like to use np.loadtxt to read the data directly into a numpy array.

Recommended answer

I was able to solve the problem myself.

I just had to open the file with the appropriate encoding before reading it with numpy:

import numpy as np
import codecs

n = 10
myfile = '/path/to/myfile'

# Open the file as cp1252 text, then pass the file object to loadtxt
filecp = codecs.open(myfile, encoding='cp1252')
mydata = np.loadtxt(filecp, skiprows=n)
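
For readers on a newer setup, here is a hedged sketch of two variants of the same idea. It assumes NumPy 1.14 or later, where np.loadtxt accepts an encoding argument, and Python 3, where the built-in open takes an encoding too. The sample file contents and the names sample, path and n_header are made up for illustration and are not from the original question.

```python
import os
import tempfile

import numpy as np

# Hypothetical sample file: two cp1252 metadata lines, then a table of floats.
sample = "Messgröße: Höhe\nEinheit: m\n1.0 2.0\n3.0 4.0\n"
with tempfile.NamedTemporaryFile(
    "w", encoding="cp1252", suffix=".txt", delete=False
) as tmp:
    tmp.write(sample)
    path = tmp.name

n_header = 2  # number of metadata lines to skip

# Variant 1: pass the encoding straight to loadtxt (assumes NumPy 1.14+).
data_a = np.loadtxt(path, skiprows=n_header, encoding="cp1252")

# Variant 2: open the file in text mode and pass the file object.
with open(path, encoding="cp1252") as fh:
    data_b = np.loadtxt(fh, skiprows=n_header)

os.remove(path)  # clean up the temporary sample file
print(data_a)
```

Both variants decode the metadata lines as cp1252 before skipping them, so the non-ASCII characters in the header no longer trigger the UnicodeDecodeError.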

Thanks, everyone!

