Problem description
I'm trying to parse a tab-separated file in Python where a number placed k tabs from the beginning of a row should be placed into the k-th array.
Is there a built-in function to do this, or a better way, other than reading line by line and doing all the obvious processing a naive solution would perform?
Recommended answer
You can use the csv module to parse tab-separated value files easily.
    import csv

    with open("tab-separated-values", newline="") as tsv:
        # The "excel-tab" dialect splits on tabs; passing delimiter="\t" works as well.
        for line in csv.reader(tsv, dialect="excel-tab"):
            ...
Here line is a list of the values on the current row for each iteration.
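To make the shape of each row concrete, here is a minimal sketch that runs csv.reader over an in-memory string instead of a file. The sample data and the names sample and rows are assumptions made up for illustration; they are not part of the original question.

```python
import csv
import io

# Hypothetical sample data standing in for a real tab-separated file.
sample = "1\t2\t3\n4\t5\t6\n"

# csv.reader accepts any iterable of lines, so a StringIO works like a file.
rows = list(csv.reader(io.StringIO(sample), dialect="excel-tab"))
print(rows)  # [['1', '2', '3'], ['4', '5', '6']]
```

Note that every value comes back as a string, so numeric fields still need an explicit int() or float() conversion.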
If you want to read by column rather than by row, a straightforward approach is to transpose the rows with the zip() builtin:
    import csv

    with open("tab-separated-values", newline="") as tsv:
        # zip(*rows) transposes the rows, yielding one tuple per column.
        for column in zip(*csv.reader(tsv, dialect="excel-tab")):
            ...
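One caveat: zip() stops at the shortest row, so trailing columns are silently dropped when rows have different lengths. The sketch below is one possible way to build the "k-th array" from the question under that condition, using itertools.zip_longest to pad short rows. The sample data, the choice to skip empty fields, and the assumption that every non-empty field is an integer are all illustrative guesses about the input, not something the original answer specifies.

```python
import csv
import io
from itertools import zip_longest

# Hypothetical sample: rows of unequal length, with an empty field in row 2.
sample = "1\t2\t3\n4\t\t6\t7\n"

rows = csv.reader(io.StringIO(sample), delimiter="\t")

# zip_longest pads short rows with "" so no trailing column is lost;
# empty fields are skipped and the rest are converted to int (an assumption).
columns = [
    [int(value) for value in column if value != ""]
    for column in zip_longest(*rows, fillvalue="")
]
print(columns)  # [[1, 4], [2], [3, 6], [7]]
```

Here columns[k] holds the numbers found k tabs from the start of a row, counting from zero.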