How do I load heterogeneous data (np.genfromtxt) as a 2D array?
Question
I learned from the question "numpy.genfromtxt produces array of what looks like tuples, not a 2D array—why?" that numpy.genfromtxt returns a structured ndarray if the data is not homogeneous. How do I load heterogeneous data as a 2D array?
For instance, take a text file with the following contents (all items except the header are int):
# c1 c2 c3 c4 c5
3 4 8 6 8
10 7 6 7 10
5 10 2 1 3
7 6 5 3 6
5 8 5 2 7
1 2 2 10 8
10 5 9 3 8
5 2 4 4 2
Loading the data with np.genfromtxt:
import numpy as np
# load data from a text file; names=True reads the column names from the header line
table = np.genfromtxt('table.dat', dtype=int, delimiter='\t', names=True, filling_values=0)
print(table.shape)
print(table)
# output
(8,)
[(3, 4, 8, 6, 8) (10, 7, 6, 7, 10) (5, 10, 2, 1, 3) (7, 6, 5, 3, 6)
(5, 8, 5, 2, 7) (1, 2, 2, 10, 8) (10, 5, 9, 3, 8) (5, 2, 4, 4, 2)]
# expected result
(8, 5)
[[ 7 2 4 9 2]
[ 5 8 1 6 4]
[ 6 3 1 4 10]
[10 10 6 5 5]
[10 4 7 7 1]
[ 1 9 8 6 2]
[ 3 2 3 4 4]
[ 7 5 9 10 6]]
PS: I want to keep header = table.dtype.names for other purposes.
Recommended answer
In this case it is easier to load the file with pandas and then convert the pandas DataFrame to a NumPy array.
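import pandas as pd
# the first line of table.dat is consumed as the column header
foo = pd.read_csv('table.dat', sep='\t')
type(foo)
<class 'pandas.core.frame.DataFrame'>
# as_matrix() was removed in pandas 1.0; to_numpy() is its replacement
bar = foo.to_numpy()
array([[ 3,  4,  8,  6,  8],
       [10,  7,  6,  7, 10],
       [ 5, 10,  2,  1,  3],
       [ 7,  6,  5,  3,  6],
       [ 5,  8,  5,  2,  7],
       [ 1,  2,  2, 10,  8],
       [10,  5,  9,  3,  8],
       [ 5,  2,  4,  4,  2]])
bar.shape
(8, 5)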
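If adding pandas as a dependency is not desirable, a NumPy-only route may also work. The sketch below assumes NumPy 1.16 or later, where numpy.lib.recfunctions.structured_to_unstructured is available, and that every column shares one dtype (int here). The in-memory string `text` is a hypothetical stand-in for table.dat with only the first three data rows; it is not part of the original question.

```python
import io

import numpy as np
from numpy.lib import recfunctions as rfn

# Hypothetical stand-in for table.dat: tab-separated, header on a '#' line.
text = "# c1\tc2\tc3\tc4\tc5\n3\t4\t8\t6\t8\n10\t7\t6\t7\t10\n5\t10\t2\t1\t3\n"

# names=True makes genfromtxt return a structured array:
# one record per row, one named field per column.
table = np.genfromtxt(io.StringIO(text), dtype=int, delimiter="\t",
                      names=True, filling_values=0)

# Keep the column names for later use, as asked in the question.
header = table.dtype.names

# Turn the 1D array of records into a plain 2D array.
data = rfn.structured_to_unstructured(table)

print(header)      # ('c1', 'c2', 'c3', 'c4', 'c5')
print(data.shape)  # (3, 5)
print(data)
```

With the real file, passing 'table.dat' in place of io.StringIO(text) should give an (8, 5) array while header still holds the column names.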