Below is a CSV snippet with some dummy header lines; the actual frame is anchored by beerId:

This work is an unpublished, copyrighted work and contains confidential information.
beer consumption
consumptiondate 7/24/2018
consumptionlab  H1
numbeerssuccessful  40
numbeersfailed  0
totalnumbeers   40
consumptioncomplete TRUE

beerId  Book
341027  Northern Light

The code df = pd.read_csv(path_csv, header=8) works, but the header is not always at row 8; its position varies from day to day. I can't figure out how to use the lambda mentioned in the help.


How can I find the row index of beerId?

Best answer

I think you need to preprocess the file first:

import pandas as pd

path_csv = 'file.csv'
with open(path_csv) as f:
    lines = f.readlines()

# indices of all lines starting with "beerId"
num = [i for i, l in enumerate(lines) if l.startswith("beerId")]
# if nothing matches use 0; otherwise take the first match minus 1, because
# header= ignores blank lines (skip_blank_lines=True is the default) and
# this sample has exactly one blank line above the beerId line
num = 0 if len(num) == 0 else num[0] - 1
print(num)
8


df = pd.read_csv(path_csv, header=num)
print(df)
             beerId  Book
0  341027  Northern Light
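
The "minus 1" above only holds when there is exactly one blank line before the beerId line. A sketch of a variant that does not depend on that, assuming (as far as I understand the pandas behaviour) that an integer skiprows counts physical lines, blank ones included. The helper names find_header_row and read_from_anchor are hypothetical, not part of pandas, and raising an error when no match is found is my own choice instead of the fallback to 0 used above:

```python
def find_header_row(path, anchor="beerId"):
    """Return the 0-based physical line index of the first line starting with `anchor`."""
    with open(path) as f:
        for i, line in enumerate(f):
            if line.startswith(anchor):
                return i
    # Assumption: a missing anchor is an error rather than "header at row 0"
    raise ValueError(f"no line starts with {anchor!r} in {path}")


def read_from_anchor(path, anchor="beerId", **kwargs):
    """Read the CSV so that the anchor line becomes the header row."""
    # pandas is imported here so find_header_row stays usable without it
    import pandas as pd

    # Skip every physical line above the anchor; the anchor line is then row 0
    n_skip = find_header_row(path, anchor)
    return pd.read_csv(path, skiprows=n_skip, header=0, **kwargs)
```

With this, df = read_from_anchor(path_csv) should replace the fixed header=8; extra keyword arguments such as sep are passed through to pd.read_csv unchanged.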

This is based on the Stack Overflow question "python - pandas read_csv skiprows - determine rows to skip": https://stackoverflow.com/questions/51530785/
