Python: opening and modifying a large text file

Problem description

I have a ~600 MB Roblox-type .mesh file, which reads like a text file in any text editor. I have the following code:

mesh = open("file.mesh", "r").read()
mesh = mesh.replace("[", "{").replace("]", "}").replace("}{", "},{")
mesh = "{"+mesh+"}"
f = open("p2t.txt", "w")
f.write(mesh)

It returns:

Traceback (most recent call last):
  File "C:\TheDirectoryToMyFile\p2t2.py", line 2, in <module>
    mesh = mesh.replace("[", "{").replace("]", "}").replace("}{", "},{")
MemoryError

Here is a sample of my file:

[-0.00599, 0.001466, 0.006][0.16903, 0.84515, 0.50709][0.00000, 0.00000, 0][-0.00598, 0.001472, 0.00599][0.09943, 0.79220, 0.60211][0.00000, 0.00000, 0]

What should I do?

I'm not sure what the head, follow, and tail commands are in the other thread that this was marked as a duplicate of. I tried to use them, but couldn't get them to work. The file is also one giant line; it isn't split into lines.

Recommended answer

You need to read one character per iteration, process it, and then write it to another file or to sys.stdout. Try this code:

mesh = open("file.mesh", "r")
mesh_out = open("file-1.mesh", "w")

c = mesh.read(1)

if c:
    mesh_out.write("{")
else:
    exit(0)
while True:
    c = mesh.read(1)
    if c == "":
        break

    if c == "[":
        mesh_out.write(",{")
    elif c == "]":
        mesh_out.write("}")
    else:
        mesh_out.write(c)

# Close both files so the buffered output is flushed to disk.
mesh.close()
mesh_out.close()

UPD:

It runs really slowly (thanks to jamylak), so I've changed it:

import sys
import re


def process_file(fname):
    with open(fname, "r") as mesh:
        # Consume the first "[" and emit "{" in its place (no leading comma).
        c = mesh.read(1)
        if c == '':
            return
        sys.stdout.write('{')

        while True:
            # Read in 8192-character chunks instead of one character at a time.
            c = mesh.read(8192)
            if c == '':
                return

            # Each pattern matches a single character, so a match can never
            # be split across a chunk boundary.
            c = re.sub(r'\[', ',{', c)
            c = re.sub(r'\]', '}', c)
            sys.stdout.write(c)


if __name__ == '__main__':
    process_file(sys.argv[1])

It now takes about 15 seconds on a 1.4 GB file. To run it:

$ python mesh.py file.mesh > file-1.mesh
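
For comparison, here is a minimal sketch of a chunked variant that aims to reproduce the exact output of the code in the question, including the outer braces, without loading the whole file into memory. The function name convert_mesh and the chunk_size parameter are hypothetical and not part of the original answer. It assumes the input is plain text that can be opened in text mode.

```python
def convert_mesh(src, dst, chunk_size=1 << 20):
    """Copy text from src to dst, turning [..][..] into {{..},{..}}.

    src and dst are text-mode file objects (hypothetical helper, not from
    the original answer). Only one chunk is held in memory at a time.
    """
    brackets_to_braces = str.maketrans("[]", "{}")
    dst.write("{")          # outer opening brace, as in the question's code
    prev_last = ""          # last character written from the previous chunk
    while True:
        chunk = src.read(chunk_size)
        if not chunk:
            break
        # Same replacements as in the question, applied to one chunk.
        chunk = chunk.translate(brackets_to_braces).replace("}{", "},{")
        # A "}{" pair may straddle two chunks; insert the comma here.
        if prev_last == "}" and chunk[0] == "{":
            dst.write(",")
        dst.write(chunk)
        prev_last = chunk[-1]
    dst.write("}")          # outer closing brace
```

It could be called as convert_mesh(mesh, out) inside a block such as: with open("file.mesh", "r") as mesh, open("p2t.txt", "w") as out. The only state carried between chunks is the last character written, which is what allows the "}{" replacement to work across chunk boundaries.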

