Question
Following up on an old question of mine: I have finally identified what happens.
I have a CSV file that uses \t as the separator, and I read it with the following command:
df = pd.read_csv(r'C:\..\file.csv', sep='\t', encoding='unicode_escape')
The length of the resulting DataFrame is, for example, 800.000 rows.
The problem is that the original file has around 1.400.000 lines. I also know where the issue occurs: one column (let's say columnA) has the following entry:
"HILFE FüR DIE Alten
Do you have any idea what is happening? When I delete that row, I get the correct number of lines (length). What is Python doing here?
Recommended answer
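This may be an issue with the double-quote character. An unmatched opening quote makes the parser treat everything up to the next quote as one quoted field, line breaks included, so the rows in between are merged into a single row. The pandas documentation notes that regex delimiters are prone to ignoring quoted data, so try this instead: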
df = pd.read_csv(r'C:\..\file.csv', sep='\\t', encoding='unicode_escape', engine='python')
Or this, which passes the same separator as a raw string:
df = pd.read_csv(r'C:\..\file.csv', sep=r'\t', encoding='unicode_escape')
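As an alternative that the answer above does not mention, quote handling can be switched off explicitly with quoting=csv.QUOTE_NONE while keeping a plain tab separator. The following is a minimal sketch under stated assumptions: the sample data is made up for illustration (only the "HILFE FüR DIE Alten entry comes from the question), and it assumes a later field in the file happens to contain a closing quote, which is what would let rows disappear silently instead of raising an error.

```python
import csv
import io

import pandas as pd

# Hypothetical sample: row 2 has an unmatched opening quote,
# and row 4 happens to contain the next quote character.
raw = (
    "columnA\tcolumnB\n"
    "first\t1\n"
    "\"HILFE FüR DIE Alten\t2\n"
    "third\t3\n"
    "fourth\"\t4\n"
)

# Default quoting: everything between the two quotes becomes one field,
# so rows 2 to 4 collapse into a single row.
df_default = pd.read_csv(io.StringIO(raw), sep="\t")

# QUOTE_NONE: the quote character is treated as ordinary text.
df_fixed = pd.read_csv(io.StringIO(raw), sep="\t", quoting=csv.QUOTE_NONE)

print(len(df_default), len(df_fixed))
```

With this setting the quote characters stay in the data as literal text, so this only fits files where no field is genuinely quoted.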