pandas read_csv skips some lines

Question

Following up on an old question of mine: I have finally identified what happens.

I have a CSV file that uses \t as the separator, and I read it with the following command:

df = pd.read_csv(r'C:\..\file.csv', sep='\t', encoding='unicode_escape')

The length of the resulting DataFrame is, for example, 800,000.

The problem is that the original file has around 1,400,000 lines. I also know where the issue occurs: one column (let's say columnA) has the following entry:

"HILFE FüR DIE Alten

Do you have any idea what is happening? When I delete that row, I get the correct number of lines (length). What is Python doing here?

Recommended answer

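According to the pandas documentation, this may be an issue with the double-quote character. By default the parser treats " as the quote character, so an unmatched opening quote makes it read everything up to the next quote, line breaks included, as a single field, and many lines collapse into one row. Try this instead: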

df = pd.read_csv(r'C:\..\file.csv', sep='\\t', encoding='unicode_escape', engine='python')

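Or this, which passes the same regex separator and lets pandas fall back to the Python engine on its own: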

df = pd.read_csv(r'C:\..\file.csv', sep=r'\t', encoding='unicode_escape')
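
To see the effect in isolation, here is a minimal sketch on made-up data. The sample rows, the variable names, and the use of quoting=csv.QUOTE_NONE are assumptions for illustration and are not part of the original answer. QUOTE_NONE is a different technique from the regex separator above: it tells the parser to treat double quotes as ordinary characters. The last step is a rough heuristic that flags lines with an odd number of double quotes.

```python
import csv
import io

import pandas as pd

# Hypothetical tab-separated sample; rows 2 and 5 each contain one stray double quote.
raw = (
    "id\tcolumnA\n"
    "1\tfoo\n"
    '2\t"HILFE FüR DIE Alten\n'
    "3\tbar\n"
    "4\tbaz\n"
    '5\tnoch eine"\n'
    "6\tqux\n"
)

# Default quoting: the text between the two stray quotes, line breaks included,
# is read as a single field, so several lines end up in one row.
df_default = pd.read_csv(io.StringIO(raw), sep="\t")

# QUOTE_NONE: double quotes are ordinary characters, so each line is one row.
df_noquote = pd.read_csv(io.StringIO(raw), sep="\t", quoting=csv.QUOTE_NONE)

# Line numbers (1-based, header included) with an odd count of double quotes
# are likely culprits worth inspecting in the real file.
suspicious = [
    n for n, line in enumerate(raw.splitlines(), start=1) if line.count('"') % 2
]

print(len(df_default), len(df_noquote), suspicious)
```

With the default settings the DataFrame comes out shorter than the number of data lines, while the QUOTE_NONE read keeps one row per line. If the file also contains properly quoted fields that rely on quoting (for example, fields with embedded tabs), QUOTE_NONE would break those, so it may be safer to repair the flagged lines instead.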

