将大数组写入文本文件

将大数组写入文本文件

本文介绍了Python:将大数组写入文本文件的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

我是 Python 新手,我有一个解决方案,但它看起来又慢又傻,所以我想知道是否有更好的方法?

I'm new to Python and I have a solution for this but it seems slow and silly, so I wondered if there is a better way?

假设我有一个这样定义的矩阵:

Say I have a matrix defined like this:

mat = [['hello']*4 for x in xrange(3)]

我正在使用此函数将其写入文件:

I am using this function to write it to file:

def writeMat(mat, outfile):
  with open(outfile, "w") as f:
    for item in mat:
      f.writelines(str(item).replace('[','').replace(',','').replace('\'','').replace(']','\n'))

writeMat(mat, "temp.txt")

它给出了一个看起来像这样的文本文件:

which gives a text file that looks like:

hello hello hello hello
hello hello hello hello
hello hello hello hello

我正在处理的文件非常大.numpy 中的 savetxt 函数会很棒,但我不想将它存储为 numpy 数组,因为虽然矩阵的大部分由单个字符元素组成,但前几列会很多字符长度,在我看来(如果我错了,请纠正我)这意味着整个矩阵将使用比所需更多的内存,因为矩阵中的每个元素都将是最大元素的大小.

The files that I am dealing with are very large. The savetxt function in numpy would be great, but I don't want to store this as a numpy array because while the majority of the matrix is comprised of single character elements, the first few columns will be many characters in length, and it seems to me (correct me if I am wrong) this would mean the whole matrix would use much more memory than is necessary because every element in the matrix will be the size of the largest element.

推荐答案

如果我正确理解你的问题,你可以这样做:

If I understand your question correctly, you could do:

f.writelines(' '.join(row) + '\n' for row in mat)

f.write('\n'.join(' '.join(row) for row in mat))

第一个的优点是它是一个生成器表达式,它只生成当前行的连接字符串副本

The first one has the advantage of being a generator expression that only makes a concatenated string copy of the currentline

如果你的矩阵条目不是字符串,你可以这样做:

And if your matrix entries are not strings, you could do:

f.writelines(' '.join(str(elem) for elem in row) + '\n' for row in mat)

编辑

看起来 file.writelines() 方法在将整个生成器表达式写入文件之前会对其求值.因此,以下操作将最大限度地减少您的内存消耗:

It appears that the file.writelines() method evaluates the entire generator expression before writing it to the file. So the following would minimize your memory consumption:

for row in mat:
    f.write(' '.join(row) + '\n')

这篇关于Python:将大数组写入文本文件的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持!

07-25 20:42