How to write a pandas DataFrame to a CSV file line by line?

Question

I have a list of about 1 million addresses and a function that finds their latitudes and longitudes. Because some records are improperly formatted (or for some other reason), the function sometimes fails to return the latitude and longitude for an address, which breaks the for loop. So, for each address whose latitude and longitude are successfully retrieved, I want to write the result to the output CSV file. Alternatively, instead of writing line by line, writing in small chunks would also work. For this, I am using df.to_csv in "append" mode (mode='a') as shown below:

for i in range(len(df)):
    place = df['ADDRESS'][i]
    try:
        lat, lon, res = gmaps_geoencoder(place)
    except:
        pass

    df['Lat'][i] = lat
    df['Lon'][i] = lon
    df['Result'][i] = res

    df.to_csv(output_csv_file,
          index=False,
          header=False,
          mode='a', #append data to csv file
          chunksize=chunksize) #size of data to append for each loop

But the problem is that each append writes the whole dataframe. So, for n rows, it writes the whole dataframe n times (about n^2 rows in total). How can I fix this?

Recommended answer

If you really want to write line by line (you should not, since every call reopens the output file), write only the current row on each iteration:

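for i in range(len(df)):
    # Double brackets select row i as a one-row DataFrame, so only that row is appended.
    # Assumes the default RangeIndex; use df.iloc[[i]] for any other index.
    df.loc[[i]].to_csv(output_csv_file,
        index=False,
        header=False,
        mode='a')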
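
The question also mentions writing in small chunks, which avoids reopening the file for every row. Below is a minimal sketch of that idea, not the original poster's code. The helper name geocode_to_csv and the geocode argument are hypothetical; geocode stands in for a function like gmaps_geoencoder and is assumed to return (lat, lon, res) or raise on failure. Addresses that fail are skipped, and the sketch assumes it is acceptable to overwrite any existing output file when the first chunk is written.

```python
import pandas as pd


def geocode_to_csv(df, geocode, output_csv_file, chunksize=1000):
    """Geocode df['ADDRESS'] and write successful rows to a CSV in chunks.

    `geocode` is a hypothetical callable returning (lat, lon, res) or raising.
    Returns the number of rows written.
    """
    buffer = []          # successfully geocoded rows waiting to be written
    written = 0
    write_header = True  # the header goes out only with the first chunk

    def flush():
        nonlocal written, write_header
        if not buffer:
            return
        pd.DataFrame(buffer).to_csv(
            output_csv_file,
            index=False,
            header=write_header,
            # Assumption: the first chunk overwrites the file, later chunks append.
            mode="w" if write_header else "a",
        )
        written += len(buffer)
        write_header = False
        buffer.clear()

    for place in df["ADDRESS"]:
        try:
            lat, lon, res = geocode(place)
        except Exception:
            continue  # skip addresses that cannot be geocoded
        buffer.append({"ADDRESS": place, "Lat": lat, "Lon": lon, "Result": res})
        if len(buffer) >= chunksize:
            flush()

    flush()  # write the final, possibly partial, chunk
    return written
```

With chunksize=1 this behaves like the row-by-row version; larger values trade a little memory for far fewer file writes. If the process stops partway, at most one unwritten chunk of results is lost.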
