BigQuery error when loading a CSV file from Google Cloud Storage

Question

I'm trying to load data from a CSV file stored in GCS into BigQuery. The file is UTF-8 encoded and contains 7 columns. I've declared these columns in the table schema (all of type string and nullable), and I've checked the contents of the CSV file, which look fine.

When I try to load the data, I get the following error:

The strange thing is that the file contains only 680228 rows.

When I enable the "allow jagged rows" option, the table is created, but only the first column is populated, and it holds the entire comma-separated line as a single string.

Can anyone help me?

Example rows

Recommended answer

In my case, the problem was caused by newline and carriage return characters inside the data, so try replacing those special characters. I replaced them with the code below, and that fixed the load.

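# Replace carriage returns and newlines with spaces; skip non-string cells such as NaN.
# On pandas 2.1+, DataFrame.map is the non-deprecated name for applymap.
df = df.applymap(lambda x: x.replace("\r", " ").replace("\n", " ") if isinstance(x, str) else x)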

I applied a lambda function to every cell because I didn't know which columns held strings in my case. If you know which columns are affected, do the replacement on those columns only.
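A column-wise version might look like the sketch below. It is only an illustration: the helper name `strip_line_breaks` and the column names in the usage note are hypothetical, and it assumes every listed column holds strings (or missing values). Unlike the snippet above, it collapses a run such as "\r\n" into a single space rather than two.

```python
import pandas as pd


def strip_line_breaks(df, columns):
    """Return a copy of df with CR/LF runs replaced by one space in the given columns.

    Assumes the listed columns contain strings; missing values stay missing.
    """
    cleaned = df.copy()
    for col in columns:
        # Vectorised regex replacement on one column at a time.
        cleaned[col] = cleaned[col].str.replace(r"[\r\n]+", " ", regex=True)
    return cleaned
```

For example, `df = strip_line_breaks(df, ["name", "comment"])` would clean only those two (hypothetical) columns and leave the rest of the frame untouched.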

Try replacing these characters; it may well work for you too.
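To check whether embedded line breaks are likely to be the cause before rewriting the data, one option is to compare the number of physical lines in the file with the number of CSV records, and to list records whose field count differs from the schema. The sketch below uses only Python's standard `csv` module; the function name `count_lines_and_records` and the sample text are made up for illustration, and the default of 7 fields comes from the question.

```python
import csv
import io


def count_lines_and_records(text, expected_fields=7):
    """Return (physical_lines, csv_records, bad_line_numbers) for CSV text."""
    # Physical lines as seen by str.splitlines, regardless of quoting.
    physical_lines = len(text.splitlines())
    # newline="" stops StringIO from translating line endings before csv sees them.
    reader = csv.reader(io.StringIO(text, newline=""))
    records = 0
    bad_line_numbers = []
    for row in reader:
        records += 1
        if len(row) != expected_fields:
            # line_num is the source line on which this record ended.
            bad_line_numbers.append(reader.line_num)
    return physical_lines, records, bad_line_numbers


# A quoted field that spans two physical lines (3 fields per record here).
sample = 'a,b,c\n1,"line one\nline two",3\n4,5,6\n'
print(count_lines_and_records(sample, expected_fields=3))
```

If the file has more physical lines than records, some fields contain quoted line breaks; if records with the wrong field count appear, the breaks are probably unquoted and split one logical row in two.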

07-29 11:51